[dba-SQLServer] Quotes in data part II

Jim Lawrence (AccessD) accessd at shaw.ca
Sun Sep 5 23:00:46 CDT 2004


Hi John:

I Have been following the data saga and have been quiet enjoying it. Not it
a sadistic way but in a sympathetic way. I am glad it is you and not me.

Here is some observations and suggestions.

1. Why not import the data is chunks. Clean up each chunk by change the 'Y'
and 'N' fields to True and False (1,0). Then port the data to the finally
database. This could even be on another computer. Start partially breaking
the data up along the lines described in that Warehousing article I posted a
couple of days ago. This process could be accomplished by a SP. The
Database/tables holding the cleaned up data would be far smaller than the
raw imported data. The conversion would take longer to get all the data
imported but in actual fact the whole process would be shorter. (Note if you
are using a couple of computers connected across a LAN, a Switch is a better
choice that a Hub.)

2. Do not even think of putting keys on the tables until all the data has
been imported. Nothing consumes resources and slows performance more than
having to manage keys when massive amounts of data is being imported.

3. Trying to do an exact match is almost impossible. There will be a number
of matches but then there is the data where the person has changed their
name, their name has been misspelled, a number of times, they have moved
residence etc..etc... A lot of the matching can be done by the programs but
it comes down grunt work...like ten cloistered data-clerks each with a 5
year contract.

These comments may have already been covered or totally redundant but I hope
there is something there that will help out.

Jim

-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com]On Behalf Of John W.
Colby
Sent: Sunday, September 05, 2004 6:41 PM
To: dba-sqlserver at databaseadvisors.com
Subject: RE: [dba-SQLServer] Quotes in data part II


BTW my BCP query looks like:

BULK INSERT conduit..conduit FROM 'z:\Conduit001.txt'
WITH (
   DATAFILETYPE = 'char',
   FIELDTERMINATOR = ',',
   ROWTERMINATOR = '\n'
)

John W. Colby
www.ColbyConsulting.com

-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of John W.
Colby
Sent: Sunday, September 05, 2004 8:37 PM
To: dba-sqlserver at databaseadvisors.com
Subject: [dba-SQLServer] Quotes in data part II


I assume that the quotes are a valid strategy to allow commas to be embedded
in the data, i.e. if the comma is inside a pair of quotes it is data, if it
is outside, it is a field delimiter?

Argh.  Sigh.  Beats head against sharp corner of filing cabinet.

John W. Colby
www.ColbyConsulting.com


_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com




_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com




More information about the dba-SQLServer mailing list