JWColby
jwcolby at colbyconsulting.com
Thu Nov 23 21:11:16 CST 2006
Given a table of 64 million records and growing, I need to do the following: Export sets of records out into csv files. The CSV files need to have specific field names, some of them unused which will be filled in by an NCOA (change of address) process outside of SQL Server. The sets need to be ~5 million records though it may be bigger with tuning. The table which this data comes from has an incrementing integer PK but about 15% of the records (PKs) have been "deleted". Thus the count can not depend on the PK. The table will grow, thus the number of exported csv files will grow. The CSV files need to be dropped into a specific "in" directory. The CSV files placed in the "NCOA IN" directory will be sensed by a program that performs NCOA processing on them, dropping files that have been NCOAd back into an "NCOA OUT" directory. The processing of each 5 million record CSV file by the NCOA program takes approximately 3 hours or more, depending on time of day. NCOA processing will take place weekly and must be automated so that it just happens. The presence of files in the NCOA "out" directory needs to trigger an import back into SQL Server and processing inside of SQL Server, i.e. affecting other tables. The process inside of SQL Server basically consists of sensing which records have been NCOAd, generate a "match code" from the leading N characters of the Address, city, zip5 and zip4. If that match code does not exist in a Master Address table, a new address record must be created in the master "address" table, and a FK updated in a m-m Person/Address table to point to the new address. IOW track addresses as a person moves around. I would like to do this from within SQL Server. I can and will if necessary do this from Access until I can come up to speed on VB.Net and do it from there. My question to you folks is whether this is possible completely inside of SQL Server? I suspect not but thought I would ask the masters. BTW, while I cannot say I am getting even so much as proficient with SQL Server, I do have a stable server up now, up to the task of doing this processing, with almost a terabyte of space for the main database and another 800g of disk for log files and the like, all running on a very fast raid6 drive. I am using Windows 2003 Server Standard edition and SQL Server 2000. Given what I am trying to accomplish, I am looking at moving towards SQL Server 2005 and VBA.Net 2005. Can anyone doing anything remotely like this comment on whether this should be split into parts, with the NCOA process running off on another server ( I have one available, though not as powerful) which would run the VB.Net stuff, do the NCOA processing, just manipulating the data over in the SQL Server. Also can SQL Server 2005 and SQL Server 2000 coexist on the same machine? If so is there anything I need to know before attempting to do the 2005 install on the SQL Server 2000 server? John W. Colby Colby Consulting www.ColbyConsulting.com