Rocky Smolin
rockysmolin at bchacc.com
Thu Jan 14 11:00:58 CST 2010
Just spoke with the client and it turns out that new imports will almost certainly contain records already processed. He might do an import for a month and then later decide to add some tests or patients and so the extracted file would contain both the old and the new patients and/or tests. So the import to a temp table and run unmatched query to find out which records to import looks like it might be the fastest way. R -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Max Wanadoo Sent: Thursday, January 14, 2010 7:57 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Fastest Way If they only send it through monthly, append the date to the name. once processed move it to a "<dir>\Processed" sub folder for archiving. Until it is processed it sits there and stops other files being added. (you have to code this check). Max -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Rocky Smolin Sent: 14 January 2010 15:39 To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Fastest Way Almost foolproof until they rename the file. The controls in this environment are a bit shaky. I suppose I could put in a 'find duplicate tests' function and a 'delete duplicate tests' function that could run from time to time. R -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Max Wanadoo Sent: Thursday, January 14, 2010 7:25 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Fastest Way R, I do similar things for income files (4 per week) and I simply use the s/sheet input name. Once it has been input, the filename is retained on a field in the table as the "source" and it is easy to check if that has already been processed. If your files all have the same name,it is easy to append the year/month to it. Max -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Rocky Smolin Sent: 14 January 2010 15:01 To: 'Access Developers discussion and problem solving' Subject: [AccessD] Fastest Way Dear List: I have a medical database into which lab tests are imported from a spreadsheet (exported from a mainframe medical records database at a big hospital). Front and back ends are Access 2003. The table tblPatientTest has (in addition to the autonumber PK) 5 fields: Patient ID (FK) Test ID (FK) Test Date as Date/Time (indexed duplicates OK) Test Time as long integer - value between 0000 and 2399 (indexed duplicates OK) Test Result - Text (indexed duplicates OK) I'm using DAO to add the records to the table. Imports will happen monthly and the number of records will be on the order of 10,000-100,000 with mean probably around 20k-40k. As the tests are imported I want to check the table to see if the test record is already present by finding a record which matches all 5 fields to prevent duplicate test records from being entered. Since the import will take place monthly, import time is not super critical but before importing the user will run an Error Report which does all the error checking that the import does without actually doing the import. The user then resolves the errors and when the Error report is clean, will run the import. So the error report may have to be run more than once to clean the data. It is possible that the user will try to import a spreadsheet that has already been imported. So it's important to screen for duplicates. So I want to find the fastest method of checking for duplicate. The obvious thing is rs.FindFirst on all five fields, but I'm thinking that may take a really long time. Another possibility would be to create a temporary table in the front end or external mdb, and create indexed records by concatenating all five fields and checking for dups on that table. Probably faster than the first method but creating the table could take a lot of time when the Patient Test table has a few hundred thousand records in it. Or I could add the field of concatenated fields to the table itself as the records are added, indexed and maintain that. Or some other method that I haven't thought of or don't know about. Any advice welcome. MTIA Rocky Smolin Beach Access Software 858-259-4334 www.e-z-mrp.com <http://www.e-z-mrp.com/> www.bchacc.com <http://www.bchacc.com/> -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com