JWColby
jwcolby at colbyconsulting.com
Tue Oct 31 12:15:16 CST 2006
Of course the first question is, are we comparing apples to apples? These are records with ~700 fields, coming out of text files of 3 million records each. Having no experience with MySQL and very little with SQL Server, I don't wish to imply that MySQL wouldn't be faster; in fact I have heard that it is, but without a direct comparison on the same file such claims are less than useful. I could supply one of the input files if you want to do an import and see what you come up with. It is only ~350 MB zipped. I could then work with a SQL Server guru to do whatever they think would optimize the SQL Server side. That would make an interesting comparison.

John W. Colby
Colby Consulting
www.ColbyConsulting.com

-----Original Message-----
From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Tuesday, October 31, 2006 12:03 PM
To: accessd at databaseadvisors.com
Subject: Re: [AccessD] SQL Server observed performance

Hi John

Under some specific conditions you could easily increase import speed about 20-fold with MySQL:

http://databaseadvisors.com/mailman/htdig/accessd/2006-May/043865.html

And "modest hardware" indeed: IBM PC300, 266 MHz CPU, 256 MB RAM, one IDE drive. Of course, you would need to build indices etc. later, but still ...

/gustav

>>> jwcolby at colbyconsulting.com 31-10-2006 17:20:43 >>>
I thought you guys might find this interesting. I have a database that I imported a couple of years ago. On a single-processor 3 GHz AMD64 with 2 GB of memory, using (4) individual 250 GB IDE hard drives (no RAID), the system would import ~1000 rows per second into SQL Server. Each text file was ~10 GB and consisted of ~700 fields and 3 million records per file. Each field was originally padded right with spaces (comma delimited, but fixed width).
This time around, I built an Access (really just VBA) preprocessor to open each file, read it line by line, strip all of the padding off the left and right sides of each field (there was some left padding as well), and write it back out to another file. This dropped the average text file size from ~10 GB to ~6.5 GB, which puts the average padding overhead at roughly 35%. It also leaves the resulting data unpadded after it is imported into SQL Server, which makes sorts, searches, and indexes practical.

Anyway, after stripping all of this padding and building new files, I am now importing them into my new server, an AMD64 X2 dual-core at 3.8 GHz with 2 GB of RAM. The disk subsystem is now a pair of volumes hosted on a RAID 6 array: 1 TB for the main data store and ~400 GB for the temp databases.

The new system imports the new (stripped) data files at about 3000 records per second. I have to run 3 imports at a time to keep both cores above 90% usage; running 3 at once, the imports proceed at roughly 2000 records per second FOR EACH IMPORT. Oddly, if I run more than 4 imports at a time, processor usage drops back to ~70% for some reason, and in fact each import slows to ~500 records per second. This may have to do with the limits of disk streaming off the machine that holds the source text files: they all sit in the same directory on one disk on a second machine, reached over a 1 Gbit network (switch).

I am happy to say, though, that the new dual-core server appears to perform this specific task roughly 3 to 6 times as fast, which is a huge and much-needed performance boost. The other advantage of this configuration is that I am no longer playing games splitting the database up into smaller files residing on individual hard drives, and of course the whole thing is on RAID 6, which provides much-needed security.

John W.
Colby
Colby Consulting
www.ColbyConsulting.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com
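[Archive editor's note: the VBA preprocessor John describes is not included in the message. The sketch below is a hypothetical Python equivalent of the technique he outlines, strip the space padding from each field of a comma-delimited file, line by line, and write the result to a new file. It assumes plain comma-delimited fields with no quoted commas; the function name and paths are illustrative, not from the original.]

```python
# Illustrative sketch (not John's VBA code): read a comma-delimited
# file line by line, strip space padding from both sides of every
# field, and write the slimmed lines to a new file.
# Assumes no quoted fields containing embedded commas.

def strip_padding(src_path: str, dst_path: str, delimiter: str = ",") -> None:
    with open(src_path, "r", encoding="ascii") as src, \
         open(dst_path, "w", encoding="ascii") as dst:
        for line in src:
            # Drop the line terminator, split into fields, trim each field.
            fields = line.rstrip("\r\n").split(delimiter)
            dst.write(delimiter.join(f.strip() for f in fields) + "\n")
```

Because the file is processed one line at a time, memory use stays flat even on ~10 GB inputs, which matches the line-by-line approach described above.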