John W. Colby
jwcolby at colbyconsulting.com
Wed Sep 22 10:40:18 CDT 2004
In the nVLDB database I am working on, there are two sets of addresses, one that the original person supplied, another that comes from the post office when that person moves. I need to build a single table where I pull the zip (and later the rest of the fields) from the post office supplied field if there is any data in there, then where no PO supplied data, pull the data from the original address fields. I have a PK field in my big table, int, identity. In my new address table, I created a matching PK field which I set to be the PK, not identity but with a unique index. I then add a zip field and a byte Src field to hold a number that says which field the source data came from in the big table - 1=PO updated address data, 2=original data. I built a view to pull the PK and the PO supplied zip from the big table and a 1 for the SRC field, then built a query to append that to the new table I am building. That worked and I now have about 10 million records in this new table with data supplied by the Post Office saying the person moved and this is their new address. I built another query to pull the PK and the zip from the original address field. I THOUGHT I could just append that data into the new table and where there was a collision in the PK (the data was already in the new table from the Post Office) the record would drop on the floor, all others would go in the new table. This doesn't work for some reason. As soon as I try to append a record with a PK collision the whole thing aborts with a "violation of PK constraint" error message. FIRST QUESTION: Why does that happen and is there a way to cause it to continue processing so that the records where the PKs collide just drop out? Given the failure of that method, I thought I'd build a "not in" query like I would do in Access, where I would pull all the records where the PK in the Big table is not found in the new table. SQL Server didn't like my syntax on that one. So... I built a query where I joined the big table on the new table on PK with the <> operator in the join, figuring I would get a set of records from the big table where none of the PKs were in the new table. An hour later that query is still running.... It seems like such a simple thing. I'm sure you hear this a lot, but in Access I would have had it done hours ago. Well... I'd have accomplished the task hours ago. With 165 million records Access itself would NEVER have finished the task (which is why I'm here). So, how do I accomplish what I am after here. John W. Colby www.ColbyConsulting.com