[AccessD] Lookup Fields in Table Design

Mon Mar 29 22:42:10 CST 2004

Andy: You'll be happy to hear that this is my last post on this subject. I
had intended to rest after my last post, but since John has asked me to
clarify my position, I will do so, as succinctly as possible.

My remarks will fall under three topics: general response, how table field
lookups (TFLs) can compromise relational integrity, and a summary of my
failure analysis.

General Response
================
I suppose my mindset is different from my opponents. In mathematics, a
postulate, no matter how seductive, can be disproven by a single
counter-example. In engineering, a structural design, no matter how
beautiful, will have to be corrected or compensated for if a single failure
mode can be demonstrated.

The data we engineer is no less real than lemmas or steel. My first
responsibility is to the integrity of the data entrusted to me. Thus,
failure analysis matters to me.

Given a choice between a path that offers risk, no matter how slight, and
another which offers equivalent results with no risk, I will take the latter
every time. My duty to the data demands it.

Drew, you said on 3/26/2004 that "the problem is that Lookup fields do NOT
cause corruption.". I have now proven two cases where they do. You have made
your points. But your refusal to allow for even a caveat on this topic is
intellectually dishonest. You simply cannot claim that you are 100% correct.
Why is this so hard for you to admit?

But seriously, with your cavalier dismissal of potential causes of failure,
I'm glad you didn't choose to be an engineer. I sure as hell wouldn't want
to drive across any bridge you design. Of course data errors matter, if not
to you, then certainly to the owners of the data.

But, rather than follow my opponent's example of loudly and longly
belittling the other side, adding nothing but rehashed versions of what he's
said now dozens of times before, I have two NEW items to contribute.

How TFLs Compromise Relational Integrity
========================================
Take two US divisions of a company: East and West. This company had a better
data designer, who defined tblStates with three fields: ID - autonumber,
Code - text, and Name - text. Better still,  referential integrity was
established with the ID foreign key in tblAddresses. Of course, TFLs were
defined for the table/lookup pair.

The only small, tiny flaw is, the data was entered differently for each
division: East entered theirs in Name order, while West did theirs in State
Postal Code order. So now we have two tables, with identical logical
content, that have different IDs for most of the records. But both sets of
IDs are numbered 1 to 50.

So, the fateful day comes along when East wants to do a nationwide mailing,
and requests West's database. West sends their copy, and East simply links
West's tblAddresses into their front-end, creating tblAddresses1.

Opening up tblAddresses1, everything looks great: every record has a state
code -- no missing data. East does their mass mailing using a query off of
tblAddresses1, and gets back a huge pile of returns for bad addresses.

What has happened, of course, is the TFL in West's tblAddresses has
cross-linked with East's tblStates, with autonumber IDs that transposed the
state codes.

Naturally, if East decided to help West and enter new addresses, relational
integrity would allow it, because it cares only that the foreign key exists
in the proper lookup table, not what it represents.

The large potential weakness here is that if you look at any two tables with
autonumber keys, there is almost always a large overlap in autonumber ID
values.

Under the right conditions, if you cross-link a smaller table to a larger
table via TFLs, you might never notice until you have hose your data, ruined
an audit, etc.

So, TFLs can defeat even referential integrity.

TFL Failure Analysis
====================
In front-end/backend applications, TFLs fail when their external context
assumptions are violated.

Assumption 1: The external lookup table will always exist.
   Path of failure: You don't link the lookup table in the
       front-end, or rename the link. No table with the
       same name exists.
   Consequences: Intermittent errors. Queries relying on TFL
       don't show expected values.

Assumption 2: The external lookup table will always match the
       internal name.
   Path of failure: The correct lookup table is linked when
       another table of the same name exists, or is not
       linked when a table of the same name exists
Failure Mode 1:
   One or more fields referenced in the lookup SQL
   do not exist.
   Consequences: Errors. Lookup values not properly displayed.
Failure Mode 2:
   All fields referenced in the lookup SQL exist.
   Consequences: No errors. Inconsistent/incorrect data
       displayed.
        * If no referential integrity defined, data entry is
          possible using values from the wrong lookup table.
        * If referential integrity is defined on Autonumber
          ID's, integrity will be defeated where the ID in
          the wrong lookup table has a match in the
          correct one.