[dba-SQLServer] PK/ANPK Debate

Arthur Fuller fuller.artful at gmail.com
Mon Mar 10 07:43:16 CDT 2014


To start off the week, I thought that I would re-open the PK/ANPK debate,
limiting the discussion to a specific group of tables: lookup tables.
Consider this example:

SCENARIO 1:
Table: Volunteers
PK: VolunteerID (autonumber)
Columns: Name, Address, etc.

Table: Languages
PK: LanguageID (autonumber)
Columns: Language(varchar(20)

Associative Table: VolunteerLanguages
PK: VolunteerID, LanguageID
Columns: VolunteerID, LanguageID, and perhaps Profiency

SCENARIO 2:
Table: Volunteers
PK: VolunteerID (autonumber)
Columns: Name, Address, etc.

Table: Languages
PK: Language
Columns: Language(varchar(20)

Associative Table: VolunteerLanguages
PK: VolunteerID, Language
Columns: VolunteerID(int), Language(varchar(20), and perhaps Profiency

Both scenarios exist to allow the recording of multi-lingual Volunteers.
The knee-jerk choice is to go with the first scenario, but I am calling
this into question, on the grounds that it complicates queries, comboBoxes
etc. unnecessarily. The name of the language is sufficient as a PK in
itself. No one wants duplicate occurrences of any language name in this
table.

In the associative table, the second Scenario takes up a little more space.
But surely in this age when laptops have multi-TB hard disks, this is of no
real concern. So why complicate queries, combos, listboxes etc. with a
pointless lookup and additional Join?

The same argument could be applied to many lookup tables, for example
Countries? Why bother with an ANPK when the official ISO abbreviations
exist, are easily obtained, and are universally recognized? Perhaps the
argument is even stronger here, since the ISO codes are char(3)?

In case it isn't clear, I am not arguing for this design in the common
Customer+Orders+OrderItems situation, but only in the subset of tables
where there are a small number of rows whose text values must be unique. In
the past, I have designed this subset according to the first scenario,
Scenario 1 demands either a unique index on the text column or logic in the
front end to achieve the same thing. In short, I no longer see the point.

-- 
Arthur


More information about the dba-SQLServer mailing list