Ken Ismert
KIsmert at TexasSystems.com
Mon Mar 29 22:42:10 CST 2004
Andy: You'll be happy to hear that this is my last post on this subject. I had intended to rest after my last post, but since John has asked me to clarify my position, I will do so, as succinctly as possible. My remarks will fall under three topics: general response, how table field lookups (TFLs) can compromise relational integrity, and a summary of my failure analysis. General Response ================ I suppose my mindset is different from my opponents. In mathematics, a postulate, no matter how seductive, can be disproven by a single counter-example. In engineering, a structural design, no matter how beautiful, will have to be corrected or compensated for if a single failure mode can be demonstrated. The data we engineer is no less real than lemmas or steel. My first responsibility is to the integrity of the data entrusted to me. Thus, failure analysis matters to me. Given a choice between a path that offers risk, no matter how slight, and another which offers equivalent results with no risk, I will take the latter every time. My duty to the data demands it. Drew, you said on 3/26/2004 that "the problem is that Lookup fields do NOT cause corruption.". I have now proven two cases where they do. You have made your points. But your refusal to allow for even a caveat on this topic is intellectually dishonest. You simply cannot claim that you are 100% correct. Why is this so hard for you to admit? But seriously, with your cavalier dismissal of potential causes of failure, I'm glad you didn't choose to be an engineer. I sure as hell wouldn't want to drive across any bridge you design. Of course data errors matter, if not to you, then certainly to the owners of the data. But, rather than follow my opponent's example of loudly and longly belittling the other side, adding nothing but rehashed versions of what he's said now dozens of times before, I have two NEW items to contribute. How TFLs Compromise Relational Integrity ======================================== Take two US divisions of a company: East and West. This company had a better data designer, who defined tblStates with three fields: ID - autonumber, Code - text, and Name - text. Better still, referential integrity was established with the ID foreign key in tblAddresses. Of course, TFLs were defined for the table/lookup pair. The only small, tiny flaw is, the data was entered differently for each division: East entered theirs in Name order, while West did theirs in State Postal Code order. So now we have two tables, with identical logical content, that have different IDs for most of the records. But both sets of IDs are numbered 1 to 50. So, the fateful day comes along when East wants to do a nationwide mailing, and requests West's database. West sends their copy, and East simply links West's tblAddresses into their front-end, creating tblAddresses1. Opening up tblAddresses1, everything looks great: every record has a state code -- no missing data. East does their mass mailing using a query off of tblAddresses1, and gets back a huge pile of returns for bad addresses. What has happened, of course, is the TFL in West's tblAddresses has cross-linked with East's tblStates, with autonumber IDs that transposed the state codes. Naturally, if East decided to help West and enter new addresses, relational integrity would allow it, because it cares only that the foreign key exists in the proper lookup table, not what it represents. The large potential weakness here is that if you look at any two tables with autonumber keys, there is almost always a large overlap in autonumber ID values. Under the right conditions, if you cross-link a smaller table to a larger table via TFLs, you might never notice until you have hose your data, ruined an audit, etc. So, TFLs can defeat even referential integrity. TFL Failure Analysis ==================== In front-end/backend applications, TFLs fail when their external context assumptions are violated. Assumption 1: The external lookup table will always exist. Path of failure: You don't link the lookup table in the front-end, or rename the link. No table with the same name exists. Consequences: Intermittent errors. Queries relying on TFL don't show expected values. Assumption 2: The external lookup table will always match the internal name. Path of failure: The correct lookup table is linked when another table of the same name exists, or is not linked when a table of the same name exists Failure Mode 1: One or more fields referenced in the lookup SQL do not exist. Consequences: Errors. Lookup values not properly displayed. Failure Mode 2: All fields referenced in the lookup SQL exist. Consequences: No errors. Inconsistent/incorrect data displayed. * If no referential integrity defined, data entry is possible using values from the wrong lookup table. * If referential integrity is defined on Autonumber ID's, integrity will be defeated where the ID in the wrong lookup table has a match in the correct one.