jwcolby
jwcolby at colbyconsulting.com
Wed Nov 14 14:41:46 CST 2007
Precisely. I will most likely run this in VB.Net since it has to eventually process 50 million records. I long ago generated a hash field for the address, family (address+lastName) and Person (Address+Family+FirstName) so I have hash fields to allow me to find the "same person", ignoring such problems as John Colby and John W. Colby at the same address. This project is one of the reasons I was asking the question about iterating the fields of a class in .Net. If I build a class which is instantiated once for each record for a given person, I can then update the first instance using the data in the subsequent instances, and when done, write the first instance back to the table (or a new table). If I do it correctly the classes of the extraneous records can be told to delete their record in the table after each "person" recordset is scrubbed. I can put a system like this on autopilot to run over a week or month, however long it might take. And with 50 million records it is going to take awhile. But with a VB.Net program running in the background scrubbing the table, I can continue to use the table, with increasing accuracy as the table is being scrubbed. None of the other providers for my client has ever attempted to do this, for obvious reasons (it ain't easy!). John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of A.D.TEJPAL Sent: Wednesday, November 14, 2007 2:50 PM To: Access Developers discussion and problem solving Cc: A.D.TEJPAL Subject: Re: [AccessD] merging records John, Apparently, each column carries only one significant value (over a group of records) for each combination of person & address. Your objective is to display only one compacted row per combination of person & address, showing only the significant values for survey results in various columns across the record. As a programmatic solution, the following course of action is suggested: 1 - Let the source table be named T_Data. Its first four fields are ID (PK), FirstName, LastName and Address, followed by large number of other fields (like Smokes etc) meant to hold survey response. 2 - Create an empty table named T_Result. Its structure should be identical to that of T_Data. 3 - Create a dummy table T_Dummy having one field. Populate it with one record. Having taken the above steps, if you run sample subroutine P_PopulateResultTable as given below, table T_Result will get populated with the compacted survey results in desired format. You might like to try it out and confirm whether it is in line with what you have been aiming at. Note - It has been tested on Access 2003 desktop (Access 2000 file format). Reference required - DAO 3.6 Best wishes, A.D.Tejpal ------------ Sample subroutine - for merging survey results T_data is source table. Results are appended to T_Result. '===================================== Sub P_PopulateResultTable() ' This subroutine merges the survey ' results for each person in source table ' T_Data and appends the outcome into ' destination table T_Result. Structure of T_Result ' is identical to that of T_Data ' T_Dummy is a single field single record table. Dim Qst As String, Txt As String Dim Fnm As String, Qst2 As String Dim Fv As Variant Dim rst1 As DAO.Recordset Dim rst2 As DAO.Recordset Dim fd As Field Dim tdf As TableDef Dim db As DAO.Database Const SourceTable As String = "T_Data" Const DestnTable As String = "T_Result" Const DummyTable As String = "T_Dummy" ' Comma separated string of all field names ' that do not directly carry survey response Const ExemptFields As String = _ "ID,FirstName,LastName,Address" Set db = DBEngine(0)(0) ' Clear destination table db.Execute "DELETE * FROM " & _ DestnTable & ";", dbFailOnError Qst = "SELECT FirstName, LastName, " & _ "Address FROM " & SourceTable & _ " GROUP BY FirstName, " & _ "LastName, Address;" Set rst1 = db.OpenRecordset(Qst) Set tdf = db.TableDefs(SourceTable) Do Until rst1.EOF Qst = "INSERT INTO " & DestnTable & _ " SELECT '" & _ rst1.Fields("FirstName") & "' AS " & _ "FirstName, '" & rst1.Fields("LastName") & _ "' AS LastName, '" & rst1.Fields("Address") & _ "' AS Address," For Each fd In tdf.Fields Fnm = fd.Name If InStr(ExemptFields, Fnm) > 0 Then Else Qst2 = "SELECT " & Fnm & _ " FROM " & SourceTable & _ " WHERE FirstName = '" & _ rst1.Fields("FirstName") & _ "' AND LastName = '" & _ rst1.Fields("LastName") & _ "' AND Address = '" & _ rst1.Fields("Address") & _ "' AND Len(" & Fnm & ") > 0;" Set rst2 = db.OpenRecordset(Qst2) If rst2.RecordCount > 0 Then Qst = Qst & " '" & rst2.Fields(0) & _ "' AS " & Fnm & "," Else Qst = Qst & " Null AS " & Fnm & "," End If End If Next ' Remove trailing comma Qst = Left(Qst, Len(Qst) - 1) Qst = Qst & " FROM " & DummyTable & ";" ' Append to destination table db.Execute Qst, dbFailOnError rst1.MoveNext Loop rst1.Close rst2.Close Set rst1 = Nothing Set rst2 = Nothing Set fd = Nothing Set tdf = Nothing Set db = Nothing End Sub '===================================== ----- Original Message ----- From: jwcolby To: 'Access Developers discussion and problem solving' Sent: Wednesday, November 14, 2007 19:09 Subject: Re: [AccessD] merging records A.D., Thanks for the response. Unfortunately it is not that simple, i.e. there are about 700 fields, of which about 600 are responses to query questions. Each of those 600 fields will need to be merged with the alternate record. for example: FName LName Addr Smokes Softdrink Car John Colby 1723 N '' '' John Colby 1723 '' Pepsi '' John Colby 1723 '' '' Ford Escort In at least one table there are 600 fields. The fields are divided into "sets" of fields. One set is about boats - State registered, length, type, engine etc. Another set is about medications taken - Zoloft, Aspirin, Etc. Another set is about electronics purchased - stereo, cb, computers, cell phones etc. << SNIPPED to prevent overall size crossing limits >> John W. Colby Colby Consulting www.ColbyConsulting.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com