jwcolby
jwcolby at colbyconsulting.com
Thu Jul 19 14:15:54 CDT 2007
I have a situation where I import tens of millions of records of raw data (as I call it) into SQL Server. I turn right around and send the name / address out for processing and when I get that back I import that back into SQL Server in a separate table. Is there any advantage to bringing that second table (addresses) back in to the same database that I have the raw data in? The way I use the data is that I add a PK long autoincrement to the raw data table and export that with the name / address. Thus when the data comes back in from address processing each record coming back has a PKID that matches a record in the raw data table. When I build queries I often do a join on PKID between raw and address and then select fields from both. It appears to me that having them in separate database files is actually an advantage in many ways. For one, the data in the raw data file can be archived once and never touched again as far as backup is concerned. The data in that table never changes. Also it seems that I might be able to place the two files (raw and address) on separate machines and get some parallelism going in processing where clauses etc. It may in fact be exactly the opposite and I would NOT want to set the two databases on separate machines though, I am certainly no expert on anything SQL Server related. Any thoughts on the pros and cons here? John W. Colby Colby Consulting www.ColbyConsulting.com