JWColby
jwcolby at colbyconsulting.com
Wed May 9 13:33:49 CDT 2007
Marty,

My only question about this method is the loading of all the records in one go
before processing. Remember that I am working with raw files that can be 4
gigabytes of text, up to (in this case) 4 million records and (in this case)
149 fields. These files are HUGE by desktop standards. My method uses a
single-line read / process / write and is thus pretty much guaranteed to handle
any size file, any number of records, any number of fields.

John W. Colby
Colby Consulting
www.ColbyConsulting.com

-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of MartyConnelly
Sent: Wednesday, May 09, 2007 2:04 PM
To: dba-sqlserver at databaseadvisors.com
Subject: Re: [dba-SQLServer] Bulk insert

Another option is to use SqlBulkCopy, a class that comes with .NET Framework
2.0. There is a SqlBulkCopy example in the book below that uses a CSV import.
Tricky to set up, but it works. This new 2.0 class is designed to call the
SQL SMO layer underneath the covers, which replaces SQL DMO.
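For contrast, the single-line read / process / write loop John describes could be sketched roughly like this. This is a minimal illustration only, not his actual converter: the file names, the pipe delimiter, and the Trim-based field stripping are assumptions made up for the example.

```csharp
using System;
using System.IO;

class StripFields
{
    // Read one record at a time, strip the padding from every field, and
    // write the record back out. Because only one line is in memory at a
    // time, memory use stays flat no matter how big the input file is.
    static void ProcessFile(string inPath, string outPath, char delim)
    {
        using (StreamReader reader = new StreamReader(inPath))
        using (StreamWriter writer = new StreamWriter(outPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] fields = line.Split(delim);
                for (int i = 0; i < fields.Length; i++)
                {
                    fields[i] = fields[i].Trim();
                }
                writer.WriteLine(string.Join(delim.ToString(), fields));
            }
        }
    }

    static void Main()
    {
        // Tiny demonstration input: fields padded with spaces
        File.WriteAllText("in.txt", "a  |  b |c  \n  d| e |f\n");
        ProcessFile("in.txt", "out.txt", '|');
        Console.WriteLine(File.ReadAllText("out.txt"));
    }
}
```

The trimmed output file can then be handed to BULK INSERT (or SqlBulkCopy) without any post-load UPDATE passes.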
William Vaughn's Hitchhiker's Guide to Visual Studio and SQL Server (7th
Edition)

Here is a C# code example that streams a file into a DataTable and hands it to
SqlBulkCopy. This handles 40,000 rows a second, 8,000 if you apply indexing.

private void button1_Click(object sender, EventArgs e)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    // Build a DataTable: one string column plus an autoincrementing key
    DataTable table = new DataTable();
    table.Columns.Add(new DataColumn("File", typeof(string)));
    table.Columns.Add(new DataColumn("IID", typeof(int)));
    table.Columns[1].AutoIncrement = true;
    table.Columns[1].AutoIncrementSeed = 1;
    table.Columns[1].AutoIncrementStep = 1;

    // Load the file, one line per row
    StreamReader sr = new StreamReader("c:\\filelist.txt");
    while (!sr.EndOfStream)
    {
        table.Rows.Add(sr.ReadLine());
    }
    sw.Stop();
    Debug.Write(sw.Elapsed.TotalSeconds + " seconds for " + table.Rows.Count
        + " = " + Convert.ToDouble(table.Rows.Count) / sw.Elapsed.TotalSeconds
        + " rows per second loaded to datatable");

    sw.Start();
    SqlConnection sqlcon = new SqlConnection(
        "data source=lon0371xns;initial catalog=SonarBackup;integrated security=sspi");
    SqlBulkCopy bc = new SqlBulkCopy(sqlcon);
    bc.DestinationTableName = "FileList";
    bc.NotifyAfter = 5000;
    bc.SqlRowsCopied += new SqlRowsCopiedEventHandler(bc_SqlRowsCopied);
    bc.ColumnMappings.Add(new SqlBulkCopyColumnMapping("File", "File"));
    bc.ColumnMappings.Add(new SqlBulkCopyColumnMapping("IID", "IID"));
    sqlcon.Open();
    bc.BulkCopyTimeout = 500;
    bc.WriteToServer(table);
    sw.Stop();
    Debug.Write(sw.Elapsed.TotalSeconds + " seconds for " + table.Rows.Count
        + " = " + Convert.ToDouble(table.Rows.Count) / sw.Elapsed.TotalSeconds
        + " rows per second loaded to db");
}

JWColby wrote:
>>The only caveat here is if you have empty fields in your file, a single
>>space is inserted instead of a null.
>
>What is it with all "the only caveat here" stuff? I am sure that there is a
>darned good reason.
>
>In the end it is just easier to roll your own rather than work around the
>issues that the built-in stuff seems to have. I have 150 fields (in this
>data set).
>Somehow I have to do an update on all 150 fields. I suppose I
>could have my converter run 150 update queries to do each column. Or 700
>update queries to do the next data set. Or just do the stripping of the
>spaces external to SQL Server and be done with it. Either way I still have
>to use my toy.
>
>Once I move up to VB.Net I will be able to use threads to do the stripping
>and the BULK INSERT sproc in parallel.
>
>BTW, I have to do something very similar all over again once I get the data
>in. I will need to export the entire table back out, 2 million record sets
>of data to delimited files for CASS / NCOA processing, dumping 100 million
>records out into ~50 files (just the address data this time). The CASS /
>NCOA process theoretically will process all files placed into an input
>directory (input to that program), dumping the processed files into an
>output directory (output from that program). At which point I have to pull
>all of the CASS / NCOAd files BACK out of that output directory into yet
>another table. And that is just the "pre-processing".
>
>You might be getting a clue by now why I do not want to be manually doing
>all the crapola involved with the solutions that do not involve an external
>control process. Someday fairly soon I will have a completely automated
>system for doing all this. I will be back to blowing bubbles and poking at
>Charlotte with a big stick.
>
>John W. Colby
>Colby Consulting
>www.ColbyConsulting.com
>
>-----Original Message-----
>From: dba-sqlserver-bounces at databaseadvisors.com
>[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of
>MartyConnelly
>Sent: Tuesday, May 08, 2007 10:27 PM
>To: dba-sqlserver at databaseadvisors.com
>Subject: Re: [dba-SQLServer] Bulk insert
>
>Uhh, there is a one-line fix to remove trailing blanks in SQL; there are
>different defaults for SQL Server versions and for NChar and VarChar.
>
>SET ANSI_PADDING OFF
>
>When a table is created with the setting turned on (the default), spaces
>are not trimmed when data is inserted into that table. When ANSI_PADDING is
>off, the spaces are trimmed.
>
>So if you SET ANSI_PADDING OFF, create your table, then set it back on
>again, when you bcp the data into the table, the excess trailing spaces will
>be eliminated. The only caveat here is if you have empty fields in your
>file, a single space is inserted instead of a null. If this is the case with
>your data file, you will need to do an update to set columns to null when
>len(yourcolumn) = 0.
>
>See BOL
>http://msdn2.microsoft.com/en-us/library/ms188340.aspx
>http://msdn2.microsoft.com/en-us/library/ms187403.aspx
>
>_______________________________________________
>dba-SQLServer mailing list
>dba-SQLServer at databaseadvisors.com
>http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
>http://www.databaseadvisors.com

--
Marty Connelly
Victoria, B.C.
Canada

_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com
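The ANSI_PADDING recipe Marty quotes above, together with the NULL fix-up for the empty-field caveat, might look something like this in T-SQL. The table and column names here are made up for illustration; they are not from the thread.

```sql
-- Create the target table with padding off, so trailing spaces
-- are trimmed as rows are inserted into it
SET ANSI_PADDING OFF;

CREATE TABLE FileList (
    [File] varchar(255) NULL,
    IID    int          NULL
);

-- Restore the default so later tables behave normally
SET ANSI_PADDING ON;

-- After bcp / BULK INSERT: empty fields arrived as a single space,
-- which the trimming reduced to a zero-length string. Convert those
-- to real NULLs, per the caveat above.
UPDATE FileList
SET [File] = NULL
WHERE LEN([File]) = 0;
```

Note that this would be one UPDATE per column, which is exactly the 150-query chore John objects to; stripping the spaces outside SQL Server avoids it entirely.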