[AccessD] deduplication

Susan Harkins ssharkins at setel.com
Mon May 28 10:46:19 CDT 2007


John, I'd remove all but the outer parentheses to see what kind of error you
get -- will probably be more productive. 


WHERE (tblData1.HashPerson = tblData.HashPerson AND tblData1.PKID
<=TblData.ID > 1)


Susan H.  

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Monday, May 28, 2007 11:07 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] deduplication

A.D. (or anyone willing to chime in)

I am attempting to implement your solution.  I created the following select
which should be the sub query in the outer WHERE:

SELECT count(*)
FROM tblData
AS tblData1
WHERE ((tblData1.HashPerson = tblData.HashPerson) AND (tblData1.PKID
<=TblData.ID)) > 1; 

However I am getting the very infamous Incorrect syntax near '>'.

Any thoughts?

John W. Colby
Colby Consulting
www.ColbyConsulting.com
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of A.D.TEJPAL
Sent: Monday, January 15, 2007 12:05 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] deduplication

    Sample query as given below, would remove all duplicates (other than the
first occurrence for each case).

    T_Data is the name of table, while F1, F2 & F3 are the names of fields,
whose combined value determines whether a record is duplicate or not. ID is
the primary key (number type).

A.D.Tejpal
---------------

=====================================
DELETE * FROM T_Data
WHERE (SELECT Count(*) FROM T_Data As T1 WHERE (T1.F1 & T1.F2 & T1.F3 =
T_Data.F1 & T_Data.F2 & T_Data.F3) AND (T1.ID <= T_Data.ID)) > 1;
=====================================

  ----- Original Message -----
  From: JWColby
  To: 'Access Developers discussion and problem solving' ;
dba-sqlserver at databaseadvisors.com
  Sent: Monday, January 15, 2007 09:19
  Subject: [AccessD] deduplication


  Is there a SQL syntax or method for identifying duplicate records based on
and automatically removing (or marking) the second and subsequent instances
of that record.
   
  I am working with a database where there may be several records for the
same person.  Each record will have slightly different data in the record.
I need to find all the records for a given person, remove or somehow flag
the "duplicate" records, and eventually "merge" the duplicate records
together.
   
  I know how to use groupby and count to find the dupes but I don't really
know how to then merge fields F through M from records 2,3 and 4 back into
fields F through M in record 1.  This is going to be an ongoing project so I
need to learn the methodology to get it done.  I can do it with recordsets
and code, but not with SQL, and given the large potential numbers of records
SQL would be my first choice.
   
  John W. Colby
  Colby Consulting
  www.ColbyConsulting.com
--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

Internal Virus Database is out-of-date.
Checked by AVG Free Edition. 
Version: 7.5.467 / Virus Database: 269.7.0/804 - Release Date: 5/14/2007
4:46 PM
 




More information about the AccessD mailing list