Steve Conklin
developer at ultradnt.com
Mon Jan 15 08:18:10 CST 2007
John: To proces this many records, and not have it running for days seems to me to need an SP, and use of a temp table (a real, MSSQL in-memory temp table, which is much faster than a writing to 'real' table). So, write an SP that opens a cursor on the unique person-identifier. Loop thru these, then on each, select the count of matching records; if more than 1, open another cursor and loop those that match. Write all the data in the matches to a single record in a temp table, delete all the matches from the main table, then append the temp record to the real table. My VBA is much better than my TSQL, so bear with this pseudo-code: Declare rs cursor for select unique person-identifier from main_table Fetch next from rs into @id Count how many records in main_table have person-identifier = @id If > 1 then inset into #tmp 1 record with person-identifier = @id, and all the rest of the fields Declare rs2 cursor for select * from main_table where person-identifier = @id for i= 1 to 650 if fld(i) not null update #tmp set fld(i)=some_value next I fetch next delete from main_table where person-identifier = @id insert #tmp record into main_table End if Hth Steve -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of JWColby Sent: Monday, January 15, 2007 8:52 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] deduplication Michael, I am looking at normalizing the data. However in many cases the client has to determine the fields that belong in a "child" survey table. Sometimes you can just look at the field names and tell the groupings, sometimes not. And of course, visually LOOKING at the data is simply not an option when there are 50 million records and any given survey may only populate a million of those records. John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Michael R Mattys Sent: Monday, January 15, 2007 7:46 AM To: Access Developers discussion and problem solving Subject: Re: [AccessD] deduplication ----- Original Message ----- From: "JWColby" <jwcolby at colbyconsulting.com> <snip> > Thus my job is to now identify all three of these "john Colby" records > and merge all of the "survey" fields back into a single record, > deleting the two now un-needed extra records in the process. If it > weren't for the fact that there are about 650 "survey" fields this > would not be such a big task. <snip> > It is specifically the ~650 fields that causes the issue for Access as > (I > believe) Access cannot handle that many fields at a time. <snip> I think I'd still look at how to normalize the database. Look at how to query the data such that an INSERT ...WHERE would create a distinct table for Survey N. Then write all 700 fields as the header to a csv Then use each survey-query to populate the one line across. Import into a new table when finished. Michael R. Mattys MapPoint & Access Dev www.mattysconsulting.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com