[AccessD] x64 tests

jwcolby jwcolby at colbyconsulting.com
Sun Jan 6 11:40:25 CST 2008


I don't really understand what you are doing.  I suppose you are saying just
get the select fields and process the "do they match the criteria" in code?

Before we go further remember that we are simply trying to select name
address fields for export, based on a set of criteria.  We have to use three
different tables:

HSID has a PK field and 640 fields of data.  I would need to select 50
million records of the PK and N criteria fields (2 age fields, 1 income
field and 8 hobby fields).  I also have a table AZHSID which has the PKID
and the name / address fields.  So I would need to get the PKID plus fname,
lname, city, state, zip and a HashFamily field, 50 million records of that
data.

Then I have ZipCodeWorld that has the zip code information that tells me
what zips are in a specific metropolitan code.

I have to build a query to join the HSID, AZHSID and Accuzip tables and pull
all of the fields, then process them in code?  I would also have to write
some sort of generic code to do the selection.

Of course there are many ways to perform the queries anyway, and I should go
back and build them differently just to see what happens.  I could have
simply built up one big query.  In the past I hesitated to do this because
the queries would take 20-30 minutes to run.  If anything went wrong I would
have to make changes and retest, waiting another 20-30 minutes.  I learned
to break things down into pieces so that the pieces only took a few minutes
each.

My client had counts that he thought represented the number of records for
each criteria.  I was trying to check counts on my tables against what he
thought he had.  He had a count of the age / sex, the income, and then each
hobby.  So I built up separate queries for those pieces so that I could
count the pieces.

In the end, the point of this exercise was not to optimize the process but
rather to see how moving from x32 to x64 changed the time required.
Unfortunately that part is masked by the disk subsystem so I didn't really
learn anything there.  Then I wanted to learn how moving from 4 gigs to 8
gigs changed the time required, and likewise how moving from dual to quad
core changed the time.  I really expected the doubling of memory to
substantially change the times, but it did not.  I also expected SQL Server
x64 to be able to natively handle large amounts of memory, not depend on AWE
which I understand to be "paging" of memory.

At this point I am actually quite happy with a 5 minute time to grab all the
data and get it out to a text file.  I think that for processing 50 million
records in two tables, and narrow it down to 600 thousand records using
about 10 select fields and three joins 5 minutes is doing OK.  Especially
since it was taking me 20-30 minutes before I moved to the "cover" queries.

I do think I will go back and built this as one big query and see how long
that takes.


John W. Colby
Colby Consulting
www.ColbyConsulting.com 
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Shamil
Salakhetdinov
Sent: Sunday, January 06, 2008 9:11 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] x64 tests

Hello John,

Did you ever try to just "pipe-through" your millions of records and
processing them in code? I mean nowadays memory and processors are so
incredibly quick, and your query (selection) criteria are looking rather
simple - therefore it could be (much) quicker to get results if using some
custom coding... 

I do not have millions records to test this hypothesis but here is a sample
C# code, which runs against a Db with 30000+ of member records and then
cycle 2500 times to get the counter to 80+ millions, and it does get data
into local vars for further processing  - the result is that it takes 3 min
30 sec for 89+ million records on "slow" dual core Pentium with 2GB ram and
simple 200GB IDE...

Please correct me if I did some mistakes below.

Thanks.

--
Shamil




More information about the AccessD mailing list