excel - removing duplicates using 2 columns -


i trying remove duplicate id's in sheet. example here few rows of data

id   |  department |  sales   | update date 1    | sales       | 100      |  2    | marketing   | 100      |  2    | marketing   | 200      | 30/06/2015 2    | marketing   | 300      | 05/07/2015 

i want remove duplicate id's base on update date column. want have following remaining:

id   |  department |  sales   | update date 1    | sales       | 100      |  2    | marketing   | 300      | 05/07/2015 

so checks latest update row of id , removes others.

any advice on using vba or macro great form part of automated script.

one way achieve want read rows , iterate through each duplicate row , find keep, delete based on finding highest update_date.

i have wrote macro that. here code:

first: create blank module in vba editor , paste following code:

public type row      id string     updated date      row_number integer    'to know rows delete later     is_duplicate boolean  'to mark if current row duplicate     to_keep boolean       'to decide whether keep or delete     verified boolean      'needed avoid evaluating rows same id  end type    sub removeduplicates()  range("a2").select    'go first row dim cnt integer    'keep record of how many rows cnt = 0               'begin empty array dim rows() row     'declared without count   '== step 1: read data , store in array =============== while activecell.value <> ""     cnt = cnt + 1     redim preserve rows(cnt)   'expand size of array 1     rows(cnt - 1).row_number = activecell.row   'keep record of current row address      if activecell.offset(1, 0).value = activecell.value or _        activecell.offset(-1, 0).value = activecell.value         'if cell above/below has samve id current cell, it's duplicates          rows(cnt - 1).is_duplicate = true     else         rows(cnt - 1).is_duplicate = false     end if      rows(cnt - 1).id = activecell.value                   'store id     rows(cnt - 1).updated = activecell.offset(0, 3).value 'store date     activecell.offset(1, 0).select                        'move next row below loop   '=== step 2: iterating throw array , deciding keep, delete ========= = 0 cnt - 1     if rows(i).is_duplicate , not rows(i).verified         'the current id duplicated, , of other rows same id has not been verified         find_to_keep rows, rows(i).id, cnt   'helper sub analyze each row      end if next   '==== step 3: iterating throw array delete ones marked delete ==========  = cnt - 1 0 step -1  'we have reverse order because deleted rows contain data other valid rows      if rows(i).is_duplicate , not rows(i).to_keep         'if current row duplicate , not marked (to keep) must deleted          dim r integer         r = rows(i).row_number   'get rows number (address) of row          range(r & ":" & r).entirerow.delete shift:=xlshiftup   'delete row , shift other rows below      end if  next  end sub  sub find_to_keep(byref rows() row, byval id string, byval cnt integer)     dim max_date date   'temparary variable hold maximum date      dim to_keep integer  'temporary variable hold location of row keep       ' -- step a: go throw array , find rows id specified in sub parameter     = 0 cnt - 1         if rows(i).id = id             'if row has date higher our current max_date, read date             if rows(i).updated > max_date                 max_date = rows(i).updated                 to_keep =             end if         end if     next        '-- step b: know row keep, need do:     '           1- mark other rows having same id verified (to avoid looping through them again)     '           2- mark row highest date (to_keep) = true      = 0 cnt - 1         if rows(i).id = id             if = to_keep                 rows(i).to_keep = true             else                 rows(i).to_keep = false             end if             rows(i).verified = true          end if     next  end sub 

here looks like: screenshot of excel result

and if like, have attached entire workbook reference: remove_duplicates.xlsm


Comments

Popular posts from this blog

android - org.xmlpull.v1.XmlPullParserException: expected: START_TAG {http://schemas.xmlsoap.org/soap/envelope/}Envelope -

OpenCV OpenCL: Convert Mat to Bitmap in JNI Layer for Android -

c# - Specify IP End Point in FiddlerCore -