[AccessD] SQL Server observed performance

JWColby jwcolby at colbyconsulting.com
Tue Oct 31 12:15:16 CST 2006


Of course the first question is, are we comparing apples to apples?  These
are records with ~700 fields, coming out of text files of 3 million records
each.

Having no experience with MySQL and very little with SQL Server, I don't
wish to imply that MySQL wouldn't be faster, in fact I have heard that it is
but without doing a direct comparison of the same file such things are less
than useful.

I could supply one of the input files if you want to do an import and see
what you come up with.  It is only ~350 mbytes zipped.  And then I could
work with an SQL Server guru to do whatever they think would optimize the
SQL Server side.  That would make an interesting comparison.

John W. Colby
Colby Consulting
www.ColbyConsulting.com

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Tuesday, October 31, 2006 12:03 PM
To: accessd at databaseadvisors.com
Subject: Re: [AccessD] SQL Server observed performance

Hi John

Under some specific conditions you could easily increase import speed about
20 fold with MySQL:

http://databaseadvisors.com/mailman/htdig/accessd/2006-May/043865.html 

And "modest hardware" indeed: IBM PC300, 266 MHz CPU, 256 MB ram, one IDE
drive.

Of course, you would need to build indices etc. later, but still ...

/gustav

>>> jwcolby at colbyconsulting.com 31-10-2006 17:20:43 >>>
I thought you guys might find this interesting.  
 
I have a database that I imported a couple of years ago.  On a single
processor 3 ghz AMD64 running 2 mbytes of memory, using (4) individual IDE
250gb hard drives (no raid) the system would import ~ 1000 rows per second
into SQL Server.  Each text file was ~10 gbytes, consisted of ~700 fields
and 3 million records per file.  Each field was originally padded right with
spaces (comma delimited, but fixed width).
 
This time around, I built an Access (really just VBA) preprocessor to open
each file, read it line by line, strip all of the padding off the left and
right sides (there was some left padding as well) and write it back out to
another file.  This dropped the average text file size to ~ 6.5 gbytes,
which leaves us with average padding of well over 35%.  It also left the
resulting data in the unpadded after importing into SQL Server which makes
sorts / searches and indexes possible.
 
Anyway, after stripping all of this padding and building new files, I am now
importing these into my new server which is a AMD64 X2 dual processor 3.8
ghz with 2 gbytes of ram.  The disk subsystem is now a pair of volumes
hosted on a raid 6 array, 1 tbyte for the main data store and ~400 gb for
the temp databases.  The new system imports the new (stripped) data files at
about 3000 records per second.  I have to run 3 imports at a time to keep
both cores above 90% usage.  Running 3 imports at a time, the imports happen
roughly at 2k records / second FOR EACH IMPORT.
 
Oddly, if I run more than 4 imports at a time, the processor usage drops
back to ~70% for some reason and in fact each import slows to ~500 imports /
second.  This may have to do with the limits of disk streaming off of the
machine that holds the source text files.  The source files come from a
second machine, all the files on the same disk / directory, over a 1ghz
network (switch).
 
I am happy to say though that the new dual processor server appears to be
able to perform this specific task ~3 to 6 times as fast which is a huge and
much needed performance boost.  The other advantage to this configuration is
that I am no longer playing games splitting the database up into smaller
files residing on individual hard drives, and of course, the whole thing is
using raid 6 which provides much needed security.
 
John W. Colby
Colby Consulting
www.ColbyConsulting.com 


--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com




More information about the AccessD mailing list