[AccessD] Fastest Way (to convert Excel files)

Thu Jan 14 11:58:36 CST 2010

I use docmd.transferspreadsheet AND docmd.transfertext (for csv) and have no
problems with them.

Some arrive as xls and others as  csv.

DoCmd.TransferSpreadsheet , , strTableNameTemp, strImportedSpreadSheet,
False

DoCmd.TransferText , , strTableName, strImportedSpreadSheet, True

Max

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Rocky Smolin
Sent: 14 January 2010 17:33
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Fastest Way (to convert Excel files)

Well I was thinking about just using TransferSpreadsheet although in the
past I have opened an excel file directly in code and processed the records
in one at a time.  I think that was because TransferSpreadsheet had some
stability problem or something.

R

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Thursday, January 14, 2010 8:55 AM
To: accessd at databaseadvisors.com
Subject: Re: [AccessD] Fastest Way (to convert Excel files)

Hi Rocky

This tiny converter, ABC Amber Excel Converter, claims it will convert Excel
files directly to an mdb file - without Excel - and in batch mode from the
command line as well:

http://www.processtext.com/abcexcel.html 

or to csv files which you could bulk import in SQL Server. The registration
fee is quite fair it seems.

/gustav

>>> rockysmolin at bchacc.com 14-01-2010 17:27 >>>
Lambert:

That looks like a good approach.  Query on 2) would be interesting but I'm
not sure I would need that information if I use the query in 3) to process
just the incoming records that are not already present.  I'm a little
concerned about the time the Unmatched Query will take when there are
3-4,000,000 records in the table especially since the table in the back end
is on a server.  But it wouldn't take long to implement.

Thanks

Rocky

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Heenan, Lambert
Sent: Thursday, January 14, 2010 7:42 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Fastest Way

Rocky,

I would suggest the following approach.

1/ Import all of the spreadsheet data to a temporary table.
2/ Create a query joining the import table to the table tblPatientTest using
all five fields. This query will list records already present in
tblPatientTest. For example (using my field names)...

SELECT tblImports.Policy_ID, tblImports.Seller_ID,
tblImports.dSubmissionReceived, tblImports.dAcknowlegementSent,
tblImports.PolicyType FROM tblPatientTest INNER JOIN tblImports ON
(tblPatientTest.PolicyType = tblImports.PolicyType) AND
(tblPatientTest.dAcknowlegementSent = tblImports.dAcknowlegementSent) AND
(tblPatientTest.dSubmissionReceived = tblImports.dSubmissionReceived) AND
(tblPatientTest.Seller_ID = tblImports.Seller_ID) AND
(tblPatientTest.Policy_ID = tblImports.Policy_ID);

3/ Create another query, an Unmatched Query, which will be the same as the
first query, except you use RIGHT joins instead of equi-joins. Like so...

SELECT tblImports.Policy_ID, tblImports.Seller_ID,
tblImports.dSubmissionReceived, tblImports.dAcknowlegementSent,
tblImports.PolicyType FROM tblPatientTest RIGHT JOIN tblImports ON
(tblPatientTest.PolicyType = tblImports.PolicyType) AND
(tblPatientTest.dAcknowlegementSent = tblImports.dAcknowlegementSent) AND
(tblPatientTest.dSubmissionReceived = tblImports.dSubmissionReceived) AND
(tblPatientTest.Seller_ID = tblImports.Seller_ID) AND
(tblPatientTest.Policy_ID = tblImports.Policy_ID) WHERE
(((tblPatientTest.Policy_ID) Is Null));

That will list just the data that is different in one or other of the five
fields. (The principle of unmatched queries is simple. You right join on all
the fields you want to check... RIGHT JOIN tblImports ON
(tblPatientTest.PolicyType = tblImports.PolicyType)... And then select a
field from the table that is not supposed to contain the data. It that field
in null then the new data is not present in the destination table ...WHERE
(((tblPatientTest.Policy_ID) Is Null));

I ran this query against a table with 12,000 records and it more or less
instantly returned the 189 new rows that I had engineered in some test data.

HTH

Lambert

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Rocky Smolin
Sent: Thursday, January 14, 2010 10:01 AM
To: 'Access Developers discussion and problem solving'
Subject: [AccessD] Fastest Way

Dear List:

I have a medical database into which lab tests are imported from a
spreadsheet (exported from a mainframe medical records database at a big
hospital).  Front and back ends are Access 2003. The table tblPatientTest
has (in addition to the autonumber PK) 5 fields:

Patient ID (FK)

Test ID (FK)

Test Date as Date/Time (indexed duplicates OK)

Test Time as long integer - value between 0000 and 2399 (indexed duplicates
OK)

Test Result - Text (indexed duplicates OK)

I'm using DAO to add the records to the table.  Imports will happen monthly
and the number of records will be on the order of 10,000-100,000 with mean
probably around 20k-40k.

As the tests are imported I want to check the table to see if the test
record is already present by finding a record which matches all 5 fields to
prevent duplicate test records from being entered.

Since the import will take place monthly, import time is not super critical
but before importing the user will run an Error Report which does all the
error checking that the import does without actually doing the import.  The
user then resolves the errors and when the Error report is clean, will run
the import.  So the error report may have to be run more than once to clean
the data.

It is possible that the user will try to import a spreadsheet that has
already been imported. So it's important to screen for duplicates. 

So I want to find the fastest method of checking for duplicate.

The obvious thing is rs.FindFirst on all five fields, but I'm thinking that
may take a really long time.

Another possibility would be to create a temporary table in the front end or
external mdb, and create indexed records by concatenating all five fields
and checking for dups on that table.  Probably faster than the first method
but creating the table could take a lot of time when the Patient Test table
has a few hundred thousand records in it.

Or I could add the field of concatenated fields to the table itself as the
records are added, indexed and maintain that.

Or some other method that I haven't thought of or don't know about.

Any advice welcome.

MTIA

Rocky Smolin

Beach Access Software

858-259-4334

www.e-z-mrp.com <http://www.e-z-mrp.com/> 

www.bchacc.com <http://www.bchacc.com/> 

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com 

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

-- 
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com