[AccessD] deleting duplicates

Thu Jun 25 13:24:45 CDT 2009

Jim,

    It is nice to note that your problem stands resolved. The method adopted by you  is convenient and safe.

    Another alternative approach could be as follows:

    Sample select query named Q_Show_DupsBeyondFirst, as given below, would show all duplicates beyond the first occurrence in each set. T_Data is the source table while fields F1 to F4 determine the test for duplicity. ID is the primary key field.

    Sample delete query named Q_DEL_DupsBeyondFirst, as given below, deletes all duplicates beyond the first occurrence in each set.

    Caution:
    --------
    (a) Comparison style involving combined field values e.g. F1 & F2 & F3 & F4 runs the risk of inconsistent results arising out of sliding match. For example, F1 = A and F2 = B would give the same result as F1 = AB and F2 as Null. Moreover, the contrived concatenation can detract from optimum performance.

    (b) It looks quite tempting to test for Not In (SELECT TOP 1 ---- etc). But here too, the ORDER BY clause involving fields F1 to F4 re-introduces the risk of sliding match mentioned in (a) above.

    (c) Even while using sequential count of duplicates as demonstrated in sample queries given below, it is important to use Nz() function while comparing the field values as shown. This is to prevent inconsistent count arising out of fields with Null value on either side of = operator.

Best wishes,
A.D. Tejpal
------------

Q_Show_DupsBeyondFirst
=====================================
SELECT T_Data.*  
FROM T_Data  
WHERE (SELECT  Count(*) FROM T_Data AS T  WHERE Nz(T.F1, "") =  Nz(T_Data.F1, "") AND Nz(T.F2, "") =  Nz(T_Data.F2, "") AND Nz(T.F3, "") =  Nz(T_Data.F3, "") AND Nz(T.F4, "") =  Nz(T_Data.F4, "")  AND T.ID <= T_Data.ID) > 1 
ORDER BY T_Data.F1, T_Data.F2, T_Data.F3, T_Data.F4, T_Data.ID;
=====================================

Q_DEL_DupsBeyondFirst
=====================================
DELETE *  
FROM T_Data  
WHERE (SELECT  Count(*) FROM T_Data AS T  WHERE Nz(T.F1, "") =  Nz(T_Data.F1, "") AND Nz(T.F2, "") =  Nz(T_Data.F2, "") AND Nz(T.F3, "") =  Nz(T_Data.F3, "") AND Nz(T.F4, "") =  Nz(T_Data.F4, "")  AND T.ID <= T_Data.ID) > 1;
=====================================

----- Original Message ----- 
From: Jim Lawrence 
To: 'Access Developers discussion and problem solving' 
Sent: Thursday, June 25, 2009 11:48
Subject: [AccessD] Deleting duplcates

Hi All:

Solved the problem by creating any identical empty table with a key/index
that would force unique records only and then appended the first table into
the second table. All duplicates were gone.

Jim

  ----- Original Message ----- 
  From: Jim Lawrence 
  To: 'Access Developers discussion and problem solving' 
  Sent: Thursday, June 25, 2009 10:18
  Subject: [AccessD] deleting duplicates

  Hi All:

  It is easy to list duplicates but what is needed to do is to list all
  duplicates greater than one, with only sql. 

  Possible?

  Jim