[dba-SQLServer] Bulk insert

MartyConnelly martyconnelly at shaw.ca
Wed May 9 13:04:21 CDT 2007


Another option is to use SqlBulkCopy a class that comes with Net 
Framework 2.0
There is a SqlBulkCopy example in the book below that uses a CSV import.
Ttricky to setup but it works.  This new 2.0 class is designed to call the
SQLSMO layer underneath the covers--that replaces SQL DMO.
William Vaughn's Hitchhiker's Guide to Visual Studio and SQL Server (7th 
Edition)

Here is C## code example of calling using a datareader stream into 
SQLBulkCopy
This handles 40,000 rows a second, 8000 if you apply indexing

private void button1_Click(object sender, EventArgs e)
{
        Stopwatch sw = new Stopwatch();
        sw.Start();
        DataTable table = new DataTable();
        table.Columns.Add(new DataColumn("File",typeof(string)));
        table.Columns.Add(new DataColumn("IID",typeof(int)));
        table.Columns[1].AutoIncrement = true;
        table.Columns[1].AutoIncrementSeed = 1;
        table.Columns[1].AutoIncrementStep = 1;

        StreamReader sr = new StreamReader("c:\\filelist.txt");
        while (!sr.EndOfStream ){
                table.Rows.Add(sr.ReadLine());                          
        }

        sw.Stop();
        Debug.Write(sw.Elapsed.TotalSeconds + " seconds for " + 
table.Rows.Count +
" = " + Convert.ToDouble(table.Rows.Count) / sw.Elapsed.TotalSeconds + " 
rows
per second loaded to datatable");
        sw.Start();

        SqlConnection sqlcon = new SqlConnection("data 
source=lon0371xns;initial
catalog=SonarBackup;integrated security=sspi");
        SqlBulkCopy bc = new SqlBulkCopy(sqlcon);
        bc.DestinationTableName = "FileList";
        bc.NotifyAfter = 5000;
        bc.SqlRowsCopied += new 
SqlRowsCopiedEventHandler(bc_SqlRowsCopied);
        bc.ColumnMappings.Add(new SqlBulkCopyColumnMapping("File", 
"File"));
        bc.ColumnMappings.Add(new SqlBulkCopyColumnMapping("IID", "IID"));
        sqlcon.Open();
        bc.BulkCopyTimeout = 500;
        bc.WriteToServer(table);
        sw.Stop();

        Debug.Write(sw.Elapsed.TotalSeconds + " seconds for " + 
table.Rows.Count +
" = " + Convert.ToDouble(table.Rows.Count) / sw.Elapsed.TotalSeconds + " 
rows
per second loaded to db");



}



JWColby wrote:

>>The only caveat here is if you have empty fields in your file, a single
>>    
>>
>space is inserted instead of a null.
>
>What is it with all "the only caveat here" stuff?  I am sure that there is a
>darned good reason.
>
>In the end it is just easier to roll your own rather than work around the
>issues that the built in stuff seems to have.  I have 150 fields (in this
>data set).  Somehow I have to do an update on all 150 fields.  I suppose I
>could have my converter run 150 update queries to do each column.  Or 700
>update queries to do the next data set.  Or just do the stripping of the
>spaces external to SQL Server and be done with it.  Either way I still have
>to use my toy.
>
>Once I move up to VB.Net I will be able to use threads to do the stripping
>and the BULK INSERT Sproc in parallel.
>
>BTW, I have to do something very similar all over again once I get the data
>in.  I will need to export the entire table back out, 2 million record sets
>of data to delimited files for CAS / NCOA processing, dumping 100 million
>records out into ~50 files (just the address data this time).  The CAS /
>NCOA process theoretically will process all files placed into an input
>directory (input to that program), dumping the processed files into an
>output directory (output from that program).  At which point I have to pull
>all of the CASS / NCOAd files BACK out of that output directory into to yet
>another table.  And that is just the "pre-processing".
>
>You might be getting a clue by now why I do not want to be manually doing
>all the crapola involved with the solutions that do not involve an external
>control process.  Someday fairly soon I will have a completely automated
>system for doing all this.  I will be back to blowing bubbles and poking at
>Charlotte with a big stick.
>
>John W. Colby
>Colby Consulting
>www.ColbyConsulting.com
>
>-----Original Message-----
>From: dba-sqlserver-bounces at databaseadvisors.com
>[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of
>MartyConnelly
>Sent: Tuesday, May 08, 2007 10:27 PM
>To: dba-sqlserver at databaseadvisors.com
>Subject: Re: [dba-SQLServer] Bulk insert
>
>Uhh, there is a one line fix to remove trailing blanks in SQL, different
>defaults for SQL Server versions and NChar and VChar.
>
>SET ANSI_PADDING OFF
>
> When a table is created with the setting turned on (the default), spaces
>are not trimmed when data is inserted into that table. When ANSI_PADDING is
>off, the spaces are trimmed.
>
>So if you SET ANSI_PADDING OFF, create your table, then set it back on
>again, when you bcp the data into the table, the excess trailing spaces will
>be eliminated. The only caveat here is if you have empty fields in your
>file, a single space is inserted instead of a null. If this is the case with
>your data file, you will need to do an update to set columns to null when
>len(yourcolumn) = 0.
>
>See BOL
>http://msdn2.microsoft.com/en-us/library/ms188340.aspx
>
>http://msdn2.microsoft.com/en-us/library/ms187403.aspx
>
>_______________________________________________
>dba-SQLServer mailing list
>dba-SQLServer at databaseadvisors.com
>http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
>http://www.databaseadvisors.com
>
>
>
>  
>

-- 
Marty Connelly
Victoria, B.C.
Canada




More information about the dba-SQLServer mailing list