Jim Lawrence (AccessD)
accessd at shaw.ca
Sun Sep 5 23:00:46 CDT 2004
Hi John: I Have been following the data saga and have been quiet enjoying it. Not it a sadistic way but in a sympathetic way. I am glad it is you and not me. Here is some observations and suggestions. 1. Why not import the data is chunks. Clean up each chunk by change the 'Y' and 'N' fields to True and False (1,0). Then port the data to the finally database. This could even be on another computer. Start partially breaking the data up along the lines described in that Warehousing article I posted a couple of days ago. This process could be accomplished by a SP. The Database/tables holding the cleaned up data would be far smaller than the raw imported data. The conversion would take longer to get all the data imported but in actual fact the whole process would be shorter. (Note if you are using a couple of computers connected across a LAN, a Switch is a better choice that a Hub.) 2. Do not even think of putting keys on the tables until all the data has been imported. Nothing consumes resources and slows performance more than having to manage keys when massive amounts of data is being imported. 3. Trying to do an exact match is almost impossible. There will be a number of matches but then there is the data where the person has changed their name, their name has been misspelled, a number of times, they have moved residence etc..etc... A lot of the matching can be done by the programs but it comes down grunt work...like ten cloistered data-clerks each with a 5 year contract. These comments may have already been covered or totally redundant but I hope there is something there that will help out. Jim -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com]On Behalf Of John W. Colby Sent: Sunday, September 05, 2004 6:41 PM To: dba-sqlserver at databaseadvisors.com Subject: RE: [dba-SQLServer] Quotes in data part II BTW my BCP query looks like: BULK INSERT conduit..conduit FROM 'z:\Conduit001.txt' WITH ( DATAFILETYPE = 'char', FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' ) John W. Colby www.ColbyConsulting.com -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of John W. Colby Sent: Sunday, September 05, 2004 8:37 PM To: dba-sqlserver at databaseadvisors.com Subject: [dba-SQLServer] Quotes in data part II I assume that the quotes are a valid strategy to allow commas to be embedded in the data, i.e. if the comma is inside a pair of quotes it is data, if it is outside, it is a field delimiter? Argh. Sigh. Beats head against sharp corner of filing cabinet. John W. Colby www.ColbyConsulting.com _______________________________________________ dba-SQLServer mailing list dba-SQLServer at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/dba-sqlserver http://www.databaseadvisors.com _______________________________________________ dba-SQLServer mailing list dba-SQLServer at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/dba-sqlserver http://www.databaseadvisors.com