jwcolby
jwcolby at colbyconsulting.com
Mon Jan 16 12:09:08 CST 2012
As I have discussed in the past, I have a set of about 8 tables (which I call AZData) containing name/address info. This table has just a PKID, FNAMe, LName, Addr, City, St, Zip5, Zip4. I have to maintain these records by sending them out every month for CASS / NCOA processing. I currently process about 350 million name/address records a month. The data always comes from bigger tables which contains demographics data, things like "owns a dog, owns a cat" ('Y'/'N' for each of these fields) or Child age 0-3, 4-7 etc (codes 1,2,or 3 indicating m/f/both). To this point I have always built a separate database to hold the original table with an autonumber PK, then I break the name/address out into the AZData table, linked back to the demographics table by an FK which is the PK in the demographics table. I then process that AZData table monthly. DogsAndCats contains about 11 million records, Kids contains 21 million records and so forth. I did it this way for a couple of reasons, an attempt to keep the physical size of the database down and to some extent simply history - that is the way I started doing this. Well, with 8 databases and 350 million records it is probably time to change. I am considering building a table with the name/addr fields and then a set of fields with an FK pointing back to each database where that name / address can be found. IOW the same name/addr may be in the kids and DogsAndCats table. So there would be a single name/addr record with an FK back to those two demographics tables. I would have a column for each demographics database's PK. This is denormalized, in the sense that as I add a new demographics table I have to add a new column to the name / address table to hold the PK linking records back to that demogr table. I normally consider any case where I have to add a field like this to be "denormalized" - kinda like Child1, Child2. The upside is that suddenly the demographics table link through a single M-M table (which happens to hold the name/addr info as well), which allows a very simple sql statement to determine whether a person with dogs also has kids etc. Any thoughts on the wisdom of denormalizing in this manner? -- John W. Colby Colby Consulting Reality is what refuses to go away when you do not believe in it