jwcolby
jwcolby at colbyconsulting.com
Fri May 11 21:19:36 CDT 2007
Well guys, I have ported the application that transformed my raw data into a pipe delimited "csv" file - from VBA in Access to VB.NET. Preliminary results VERY crudely timed show about 10K records / second, up from about 1K records / second in VBA. I really need to find a timer class for timing code in order to get precise timings on this. As it stands now however, it looks like without any further optimizations, my raw data transform would be the bottleneck, running at 9.7K records / sec, with the SQL Server BULK INSERT running at 15K records / second. If I can get these two processes running in threads so that they process independently, I am now in a position to go pushbutton and import a 100 million record file in ~10K seconds / 166.6 minutes / 2.7 hours. Given that I will soon have the pieces to run unattended this is an acceptable rate for me. This process took days of handholding manual labor to make happen (learning stuff all the way of course) Thanks to all who have helped my in getting the ADO happening out in VB.Net, as well as the Sproc happening in SQL Server. I still have a long way to go to get a complete app in VB.Net and SQL Server. The piece I just ported does the open / parse / strip / write to move the raw fixed width file to pipe delimited. Phase 2 begins immediately. The next piece will automate running the BULK INSERT Sproc from VB.Net, given a set of pipe delimited csv files in a directory and an existing destination table in SQL Server. Once that piece is running, I will need to learn how to run each piece in a separate thread. When both pieces are running simultaneously, I will need to reassess the speed of each piece. I am using a dual proc AMD XP 3800 with 4 gb RAM so hopefully each thread will run on a different proc, though I don't know that. From the little I know I assume there is a way to set the processor affinity of a thread. So much to learn, so little time. But this is just an awesome start and I am happy with the speed gain of the transform process achieved by moving to VB.Net. Again thanks to all who contributed. John W. Colby Colby Consulting www.ColbyConsulting.com