jwcolby
jwcolby at colbyconsulting.com
Wed Nov 14 14:41:46 CST 2007
Precisely. I will most likely run this in VB.Net since it has to eventually
process 50 million records.
I long ago generated a hash field for the address, family (address+lastName)
and Person (Address+Family+FirstName) so I have hash fields to allow me to
find the "same person", ignoring such problems as John Colby and John W.
Colby at the same address.
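
As a rough illustration of the hash-field idea above, here is a minimal sketch in Python. It is not the code used on the real table: the normalization rule (upper-case, collapse whitespace, keep only the first token of the first name), the choice of SHA-1, and the names normalize, hash_key and person_keys are all assumptions made for the example.

```python
import hashlib


def normalize(text: str) -> str:
    """Upper-case and collapse whitespace so trivial variations hash alike."""
    return " ".join(text.upper().split())


def hash_key(*parts: str) -> str:
    """Hash the normalized parts, joined by a separator unlikely to occur in data."""
    joined = "\x1f".join(normalize(p) for p in parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def person_keys(first: str, last: str, address: str) -> dict:
    # Assumed rule: keep only the first token of the first name,
    # so "John" and "John W." produce the same person key.
    first_token = normalize(first).split(" ")[0]
    return {
        "address": hash_key(address),
        "family": hash_key(address, last),
        "person": hash_key(address, last, first_token),
    }


a = person_keys("John", "Colby", "1723")
b = person_keys("John W.", "COLBY", " 1723 ")
same_person = a["person"] == b["person"]
```

With these assumed rules, "John Colby" and "John W. Colby" at the same address end up with identical address, family and person keys, which is what lets duplicate rows be grouped for merging.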
This project is one of the reasons I was asking the question about iterating
the fields of a class in .Net. If I build a class which is instantiated
once for each record for a given person, I can then update the first
instance using the data in the subsequent instances, and when done, write
the first instance back to the table (or a new table).
If I do it correctly the classes of the extraneous records can be told to
delete their record in the table after each "person" recordset is scrubbed.
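
The class-per-record idea might look something like the sketch below, written in Python rather than VB.Net purely for illustration. The SurveyRecord class, its three answer fields and the helpers merge_into and scrub are hypothetical; the real table has hundreds of columns. It also assumes that when two records disagree on an answer, the first record's value wins.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Columns that identify the person rather than hold a survey answer.
KEY_FIELDS = {"id", "first_name", "last_name", "address"}


@dataclass
class SurveyRecord:
    # Hypothetical subset of the real columns.
    id: int
    first_name: str
    last_name: str
    address: str
    smokes: Optional[str] = None
    softdrink: Optional[str] = None
    car: Optional[str] = None


def merge_into(first: SurveyRecord, other: SurveyRecord) -> None:
    """Copy each non-empty answer from other into first where first is empty."""
    for f in fields(first):
        if f.name in KEY_FIELDS:
            continue
        mine, theirs = getattr(first, f.name), getattr(other, f.name)
        if mine in (None, "") and theirs not in (None, ""):
            setattr(first, f.name, theirs)


def scrub(records: list) -> tuple:
    """Return the surviving record and the ids of the now-redundant records."""
    first, rest = records[0], records[1:]
    for other in rest:
        merge_into(first, other)
    return first, [r.id for r in rest]


recs = [
    SurveyRecord(1, "John", "Colby", "1723", smokes="N"),
    SurveyRecord(2, "John", "Colby", "1723", softdrink="Pepsi"),
    SurveyRecord(3, "John", "Colby", "1723", car="Ford Escort"),
]
merged, ids_to_delete = scrub(recs)
```

Iterating the fields generically is what keeps this workable with hundreds of columns: the merge loop never names an individual answer field. The returned ids are the rows that would be deleted once the first record has been written back.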
I can put a system like this on autopilot to run over a week or month,
however long it might take. And with 50 million records it is going to take
a while. But with a VB.Net program running in the background scrubbing the
table, I can continue to use the table, with increasing accuracy as the
table is being scrubbed.
None of the other providers for my client has ever attempted to do this, for
obvious reasons (it ain't easy!).
John W. Colby
Colby Consulting
www.ColbyConsulting.com
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of A.D.TEJPAL
Sent: Wednesday, November 14, 2007 2:50 PM
To: Access Developers discussion and problem solving
Cc: A.D.TEJPAL
Subject: Re: [AccessD] merging records
John,
Apparently, each column carries only one significant value (over a group
of records) for each combination of person & address. Your objective is to
display only one compacted row per combination of person & address, showing
only the significant values for survey results in various columns across the
record.
As a programmatic solution, the following course of action is suggested:
1 - Let the source table be named T_Data. Its first four fields are ID
(PK), FirstName, LastName and Address, followed by a large number of other
fields (such as Smokes) meant to hold survey responses.
2 - Create an empty table named T_Result. Its structure should be
identical to that of T_Data.
3 - Create a dummy table T_Dummy having one field. Populate it with one
record.
Having taken the above steps, if you run sample subroutine
P_PopulateResultTable as given below, table T_Result will get populated with
the compacted survey results in desired format. You might like to try it out
and confirm whether it is in line with what you have been aiming at.
Note - It has been tested on Access 2003 desktop (Access 2000 file
format). Reference required - DAO 3.6
Best wishes,
A.D.Tejpal
------------
Sample subroutine for merging survey results. T_Data is the source table;
results are appended to T_Result.
'=====================================
Sub P_PopulateResultTable()
' This subroutine merges the survey
' results for each person in source table
' T_Data and appends the outcome into
' destination table T_Result. Structure of T_Result
' is identical to that of T_Data
' T_Dummy is a single field single record table.
Dim Qst As String, Txt As String
Dim Fnm As String, Qst2 As String
Dim Fv As Variant
Dim rst1 As DAO.Recordset
Dim rst2 As DAO.Recordset
' Qualify with DAO so the types cannot resolve to ADO if both are referenced
Dim fd As DAO.Field
Dim tdf As DAO.TableDef
Dim db As DAO.Database
Const SourceTable As String = "T_Data"
Const DestnTable As String = "T_Result"
Const DummyTable As String = "T_Dummy"
' Comma separated string of all field names
' that do not directly carry survey response
Const ExemptFields As String = _
"ID,FirstName,LastName,Address"
Set db = DBEngine(0)(0)
' Clear destination table
db.Execute "DELETE * FROM " & _
DestnTable & ";", dbFailOnError
Qst = "SELECT FirstName, LastName, " & _
"Address FROM " & SourceTable & _
" GROUP BY FirstName, " & _
"LastName, Address;"
Set rst1 = db.OpenRecordset(Qst)
Set tdf = db.TableDefs(SourceTable)
Do Until rst1.EOF
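' Double any apostrophes (e.g. O'Brien) so each literal stays valid SQL
Qst = "INSERT INTO " & DestnTable & _
" SELECT '" & _
Replace(rst1.Fields("FirstName") & "", "'", "''") & _
"' AS FirstName, '" & _
Replace(rst1.Fields("LastName") & "", "'", "''") & _
"' AS LastName, '" & _
Replace(rst1.Fields("Address") & "", "'", "''") & _
"' AS Address,"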
For Each fd In tdf.Fields
Fnm = fd.Name
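' Compare whole names, so that a field called e.g. "Name" is not
' treated as exempt just because it is a substring of "FirstName"
If InStr("," & ExemptFields & ",", "," & Fnm & ",") > 0 Then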
Else
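' Bracket the field name and double apostrophes in the values
Qst2 = "SELECT [" & Fnm & "]" & _
" FROM " & SourceTable & _
" WHERE FirstName = '" & _
Replace(rst1.Fields("FirstName") & "", "'", "''") & _
"' AND LastName = '" & _
Replace(rst1.Fields("LastName") & "", "'", "''") & _
"' AND Address = '" & _
Replace(rst1.Fields("Address") & "", "'", "''") & _
"' AND Len([" & Fnm & "]) > 0;"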
Set rst2 = db.OpenRecordset(Qst2)
If rst2.RecordCount > 0 Then
Qst = Qst & " '" & _
Replace(rst2.Fields(0) & "", "'", "''") & _
"' AS [" & Fnm & "],"
Else
Qst = Qst & " Null AS [" & Fnm & "],"
End If
End If
Next
' Remove trailing comma
Qst = Left(Qst, Len(Qst) - 1)
Qst = Qst & " FROM " & DummyTable & ";"
' Append to destination table
db.Execute Qst, dbFailOnError
rst1.MoveNext
Loop
rst1.Close
' rst2 is never opened if the source table is empty
If Not rst2 Is Nothing Then rst2.Close
Set rst1 = Nothing
Set rst2 = Nothing
Set fd = Nothing
Set tdf = Nothing
Set db = Nothing
End Sub
'=====================================
----- Original Message -----
From: jwcolby
To: 'Access Developers discussion and problem solving'
Sent: Wednesday, November 14, 2007 19:09
Subject: Re: [AccessD] merging records
A.D.,
Thanks for the response. Unfortunately it is not that simple, i.e. there
are about 700 fields, of which about 600 are responses to query questions.
Each of those 600 fields will need to be merged with the alternate record.
For example:
FName  LName  Addr  Smokes  Softdrink  Car
John   Colby  1723  N       ''         ''
John   Colby  1723  ''      Pepsi      ''
John   Colby  1723  ''      ''         Ford Escort
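
To make the desired outcome concrete, the three rows above can be collapsed into one with a plain GROUP BY and one MAX() per answer column. The sketch below is only an illustration, run against an in-memory SQLite table from Python, not against Access or the real data. It assumes each answer column holds at most one non-empty value per person; if two rows disagreed, MAX() would simply keep the value that sorts last. The column list answer_columns is a stand-in for the roughly 600 real fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T_Data (FName TEXT, LName TEXT, Addr TEXT, "
    "Smokes TEXT, Softdrink TEXT, Car TEXT)"
)
conn.executemany(
    "INSERT INTO T_Data VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("John", "Colby", "1723", "N", "", ""),
        ("John", "Colby", "1723", "", "Pepsi", ""),
        ("John", "Colby", "1723", "", "", "Ford Escort"),
    ],
)

# Stand-in for the real answer columns; a fixed list, never user input.
answer_columns = ["Smokes", "Softdrink", "Car"]

# An empty string sorts before any non-empty text, so MAX() keeps the answer.
select_list = ", ".join(f"MAX({col}) AS {col}" for col in answer_columns)
sql = (
    f"SELECT FName, LName, Addr, {select_list} "
    "FROM T_Data GROUP BY FName, LName, Addr"
)
merged_rows = conn.execute(sql).fetchall()
conn.close()
```

Here merged_rows holds a single row for John Colby at 1723 carrying N, Pepsi and Ford Escort. Because the SELECT list is built from a list of column names, the same pattern scales to many columns without hand-writing each one.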
In at least one table there are 600 fields. The fields are divided into
"sets" of fields. One set is about boats - State registered, length, type,
engine etc. Another set is about medications taken - Zoloft, Aspirin, etc.
Another set is about electronics purchased - stereo, cb, computers, cell
phones etc.
<< SNIPPED to prevent overall size crossing limits >>
John W. Colby
Colby Consulting
www.ColbyConsulting.com
--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com