[dba-SQLServer] VB.Net - Raw data file transform

jwcolby jwcolby at colbyconsulting.com
Fri May 11 21:19:36 CDT 2007


Well guys, I have ported the application that transformed my raw data into a
pipe delimited "csv" file - from VBA in Access to VB.NET.  Preliminary
results VERY crudely timed show about 10K records / second, up from about 1K
records / second in VBA.  I really need to find a timer class for timing
code in order to get precise timings on this.  

As it stands now however, it looks like without any further optimizations,
my raw data transform would be the bottleneck, running at 9.7K records /
sec, with the SQL Server BULK INSERT running at 15K records / second.

If I can get these two processes running in threads so that they process
independently, I am now in a position to go pushbutton and import a 100
million record file in ~10K seconds / 166.6 minutes / 2.7 hours.

Given that I will soon have the pieces to run unattended this is an
acceptable rate for me.  This process took days of handholding manual labor
to make happen (learning stuff all the way of course)

Thanks to all who have helped my in getting the ADO happening out in VB.Net,
as well as the Sproc happening in SQL Server.  I still have a long way to go
to get a complete app in VB.Net and SQL Server.  The piece I just ported
does the open / parse / strip / write to move the raw fixed width file to
pipe delimited.  

Phase 2 begins immediately.  The next piece will automate running the BULK
INSERT Sproc from VB.Net, given a set of pipe delimited csv files in a
directory and an existing destination table in SQL Server.

Once that piece is running, I will need to learn how to run each piece in a
separate thread.  When both pieces are running simultaneously, I will need
to reassess the speed of each piece.  I am using a dual proc AMD XP 3800
with 4 gb RAM so hopefully each thread will run on a different proc, though
I don't know that.  From the little I know I assume there is a way to set
the processor affinity of a thread.

So much to learn, so little time.  But this is just an awesome start and I
am happy with the speed gain of the transform process achieved by moving to
VB.Net.

Again thanks to all who contributed.

John W. Colby
Colby Consulting
www.ColbyConsulting.com




More information about the dba-SQLServer mailing list