[dba-SQLServer] PK/ANPK Debate

Mon Mar 10 08:19:31 CDT 2014

Hi Arthur

I always use an ID. Postal codes do change, US states may not but who knows? And ISO codes - like those for the countries - do change.

It is just so much easier: A table, any table, has an Id. Period.

/gustav

-----Oprindelig meddelelse-----
Fra: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] På vegne af Arthur Fuller
Sendt: 10. marts 2014 13:43
Til: Discussion concerning MS SQL Server
Emne: [dba-SQLServer] PK/ANPK Debate

To start off the week, I thought that I would re-open the PK/ANPK debate, limiting the discussion to a specific group of tables: lookup tables.
Consider this example:

SCENARIO 1:
Table: Volunteers
PK: VolunteerID (autonumber)
Columns: Name, Address, etc.

Table: Languages
PK: LanguageID (autonumber)
Columns: Language(varchar(20)

Associative Table: VolunteerLanguages
PK: VolunteerID, LanguageID
Columns: VolunteerID, LanguageID, and perhaps Profiency

SCENARIO 2:
Table: Volunteers
PK: VolunteerID (autonumber)
Columns: Name, Address, etc.

Table: Languages
PK: Language
Columns: Language(varchar(20)

Associative Table: VolunteerLanguages
PK: VolunteerID, Language
Columns: VolunteerID(int), Language(varchar(20), and perhaps Profiency

Both scenarios exist to allow the recording of multi-lingual Volunteers.
The knee-jerk choice is to go with the first scenario, but I am calling this into question, on the grounds that it complicates queries, comboBoxes etc. unnecessarily. The name of the language is sufficient as a PK in itself. No one wants duplicate occurrences of any language name in this table.

In the associative table, the second Scenario takes up a little more space.
But surely in this age when laptops have multi-TB hard disks, this is of no real concern. So why complicate queries, combos, listboxes etc. with a pointless lookup and additional Join?

The same argument could be applied to many lookup tables, for example Countries? Why bother with an ANPK when the official ISO abbreviations exist, are easily obtained, and are universally recognized? Perhaps the argument is even stronger here, since the ISO codes are char(3)?

In case it isn't clear, I am not arguing for this design in the common
Customer+Orders+OrderItems situation, but only in the subset of tables
where there are a small number of rows whose text values must be unique. In the past, I have designed this subset according to the first scenario, Scenario 1 demands either a unique index on the text column or logic in the front end to achieve the same thing. In short, I no longer see the point.

--
Arthur