[AccessD] Duplicates Query

Wed Sep 1 12:32:29 CDT 2010

Rocky,

    You could also try two other alternatives as per sample queries given below, for removing all duplicates beyond the first occurrence. F1 to F4 are the names of four fields, combined contents of which, determine the duplicate status. ID is the name of PK field.

    Query Q_DEL_DupsBeyondFirst_ByNotInTop1 is expected to be more efficient as compared to Q_DEL_DupsBeyondFirst_ByCount. (There would be only one record featuring in the Not In clause).

    Note: 
    (a) Individual comparison of fields, one at a time, has been preferred over using their concatenated contents, so as to eliminate the risk of inconsistent results arising out of potential sliding match. Moreover, contrived concatenation can detract from optimum performance.

    (b) Use of Nz() function while comparing the field values, eliminating the potential for erratic results arising out of Null values getting compared via = operator. 

Best wishes,
A.D. Tejpal
------------

Q_DEL_DupsBeyondFirst_ByNotInTop1
==========================================
DELETE *  
FROM T_Data  
WHERE T_Data.ID Not In (SELECT TOP 1 ID FROM T_Data AS T  WHERE Nz(T.F1, "") =  Nz(T_Data.F1, "") AND Nz(T.F2, "") =  Nz(T_Data.F2, "") AND Nz(T.F3, "") =  Nz(T_Data.F3, "") AND Nz(T.F4, "") =  Nz(T_Data.F4, "") ORDER BY  ID);
==========================================

Q_DEL_DupsBeyondFirst_ByCount
==========================================
DELETE *  
FROM T_Data  
WHERE (SELECT  Count(*) FROM T_Data AS T  WHERE Nz(T.F1, "") =  Nz(T_Data.F1, "") AND Nz(T.F2, "") =  Nz(T_Data.F2, "") AND Nz(T.F3, "") =  Nz(T_Data.F3, "") AND Nz(T.F4, "") =  Nz(T_Data.F4, "")  AND T.ID <= T_Data.ID) >  1;
==========================================

  ----- Original Message ----- 
  From: Gustav Brock 
  To: accessd at databaseadvisors.com 
  Sent: Wednesday, September 01, 2010 22:39
  Subject: Re: [AccessD] Duplicates Query

  Hi Rocky

  I believe you will have to group by these fields in a way to build a unique string, say:

    Group By [Field1] & "-" & [Field2] & "-" & [Field3]

  /gustav

  >>> rockysmolin at bchacc.com 01-09-2010 18:56 >>>
  Gustav:

  There are actually three fields that have to match up to make a duplicate.
  I suppose I could add another field which is the concatenation of those
  three but that seems kludgey.  Any other more elegant solution?

  TIA

  Rocky

  -----Original Message-----
  From: accessd-bounces at databaseadvisors.com 
  [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
  Sent: Wednesday, September 01, 2010 8:49 AM
  To: accessd at databaseadvisors.com 
  Subject: Re: [AccessD] Duplicates Query

  Hi Rocky

  Use First or Last or Max or Min - just one that will return only one PK of
  the records where the field named "name" holds identical values.

  /gustav

  >>> rockysmolin at bchacc.com 01-09-2010 17:38 >>>
   Gustav:

  Looks right but it has step 2 as:

  2. Create a totals Query1 which shows GroupBy the name, and First of the PK.

  I assume totals query = summation query.  But "GroupBy the name" - what does
  'name' refer to?  And how to do First of the PK?

  TIA

  Rocky

  -----Original Message-----
  From: accessd-bounces at databaseadvisors.com 
  [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
  Sent: Wednesday, September 01, 2010 12:16 AM
  To: accessd at databaseadvisors.com 
  Subject: Re: [AccessD] Duplicates Query

  Hi Rocky

  Yes. Look up a not so old thread:
  http://databaseadvisors.com/pipermail/accessd/2009-May/068576.html 

  /gustav

  >>> rockysmolin at bchacc.com 31-08-2010 18:42 >>>
  Dear List:

  I used the find duplicates query wizard to make a query to find duplicate
  records in a table.  Is there a quick way to then delete all but one
  occurrence of the duplicated records?

  MTIA 

  Rocky Smolin

  Beach Access Software

  858-259-4334