[AccessD] Primary Key Best Practices

Charlotte Foust cfoust at infostatsystems.com
Wed Jul 25 09:59:38 CDT 2007


Jim,

I'm going to flag your post for future reference ... and the next time
the moon is full!  <snicker>

Charlotte Foust 

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Jim Dettman
Sent: Wednesday, July 25, 2007 7:42 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Primary Key Best Practices

John,

<<1) The PK should be stable (NEVER changes).  >>

  Don' agree with "never".  You want it as stable as possible, but there
is nothing in relational theory that says it can never change.  

<<Again, the PK and the unique index are NOT the same structure, either
in real life or in theory.>>

  That is simply not true.  With relational theory, the PK is derived
from the pool of CK's, which is based on the data.  In theory you cannot
have a PK without it being a unique index.  That's part of the
definition of what a natural PK is.  Out of the pool of CK's, you would
use the CK that is the most stable, shortest in length, and the least
complex (simple vs compound).

<<What works, and what is the best choice are different matters.  The
surrogate key avoids a whole slew of real life problems, and creates
none (in my experience).>>

  Surrogates only work in computer systems because we have systems that
are reliable.  If I gave you pen and paper and had you keep track of
data using the relational model, you'd be in a pretty big mess fairly
quick if you used surrogate keys.

  I don't object to the use of surrogate keys, but what I do object to
is the apples and oranges approach you use to claim what a PK is and
surrogate keys fit in the relational model.  Surrogate keys are nothing
more then pointers.  To call them a PK in regards to the relational
model is simply incorrect.

  Surrogate keys work well in computer systems, but the relational model
can be applied to much more then computer systems and the PK as you
describe it vs what it is in the relational model are not the same
thing.

Jim.


-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Wednesday, July 25, 2007 9:18 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Primary Key Best Practices

Bruce, 

>Any table that does not have a natural primary key is not a pure 
>dataset,
...

No, Any table that does not have a natural primary key CANDIDATE is not
a pure dataset, ... 

A candidate key is exactly that, a field or set of fields that can serve
the purpose of uniquely identifying the data record, and thus becomes a
CANDIDATE to be the PK.  In fact you can have multiple candidates.
Nothing says you actually have to USE the candidate as the PK.  That is
again the difference between the unique index and the PK.  One of the
CANDIDATES is used as the unique index, but a surrogate can still be
used as the PK.

Take an employee table.  ONE candidate key could be first / last /
phone, another could be an employee number (a quasi-surrogate key by the
way), another could be the SSN.  Thus we have (at least) THREE CANDIDATE
keys.  Do you have to use all of them as the PK?  Obviously not.  Do you
have to use ANY of them as the PK?  Obviously not.  Throw in a true
surrogate key - an autoincrement - and you can use that as well.  So
which you use simply doesn't matter AS LONG AS the PK meets a few
criteria.

1) The PK should be stable (NEVER changes).  

That eliminates the first / last / phone idea because the name could
change (I got married, I got tired of my name) AND the phone number
could (and probably will) change.  The SSN fails for the same reason.
SSNs CAN (believe it or not) change.  Alien workers steal SSNs all the
time.  Now they become a citizen...  Oooops the SSN has to change.

2) Has no meaning.  From the perspective of the CHILD record, the FK is
nothing more than a pointer back to the parent record.  As such it
should be a behind the scenes, never directly viewed artificial
construct designed to do its job as efficiently as possible.

In fact, a surrogate key can (and DOES) uniquely identify a single row
of the table.  If it did not, then it could not be the PK.  As you know,
you cannot take a simple integer, make it a value of 1 for every record,
and make that the PK of the table.  It is the very fact that you make
the field autoincrement (or whatever) is what makes it possible to serve
as the PK.
However a surrogate cannot be used for the unique index, the purpose of
preventing duplicates.  OTOH it does not need to, its sole purpose in
live is to act as a pointer, placed in child records to point back to
itself.

Again, the PK and the unique index are NOT the same structure, either in
real life or in theory.  The purpose of the PK is to uniquely identify a
record so that it can be linked to a child record.  Nothing more,
nothing less.  You can do that with a natural key, and you can do that
with a surrogate.  The purpose of the unique index is to prevent
entering the same record in the database twice (data integrity).
Different function.  The fact that the fields used in the unique index
can also serve as a PK is irrelevant.  It does NOT HAVE TO BE the
primary key.

Think about normalization for a moment.  The basic concept behind
normalization is that ONLY information about a specific object is in the
table about that object.  Only bank info is in the bank table, only
customer info is in the customer table, only check info is in the check
table.  One goal of normalization should be to minimize the fluff around
lineage.  Yes a check was drawn by a specific customer, but that does
not mean that we have to have any specific piece of the customer table
in the check table, it only means that we have to be able to uniquely
identify which customer wrote the check.  So it simply does NOT MATTER
whether we use a SURROGATE KEY from the customer table, or the customer
SSN (the very WORST choice), the first /last / eye color / hair color /
phone / zip / and whatever else you might have decided was the CANDIDATE
KEY of the customer table.  Either one WORKS.  

What works, and what is the best choice are different matters.  The
surrogate key avoids a whole slew of real life problems, and creates
none (in my experience).

John W. Colby
Colby Consulting
www.ColbyConsulting.com
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Bruce Bruen
Sent: Wednesday, July 25, 2007 8:12 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] Primary Key Best Practices

On Wednesday 25 July 2007 02:52, jwcolby wrote:
> ROTFLMAO.
>
> The fit never passes.  It just subsides until the moon rises again.
>
>
> John W. Colby

Any table that does not have a natural primary key is not a pure
dataset, ...


...


...

no, I'm not talking about natural v surrogate.


bruce
--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com




More information about the AccessD mailing list