[AccessD] Primary Key Best Practices

jwcolby jwcolby at colbyconsulting.com
Wed Jul 25 10:19:00 CDT 2007


Jim,

>Surrogate keys are nothing more then pointers.  

Correct!!!!!!!!!!!!!!!!!!!!!!!!!!

<IMPORTANT>

This is a logical construct Jim.  You accept that a PK can be a surrogate,
you state that a PK is nothing but a pointer, therefore you accept (and
state) that a PK is nothing but a pointer.

<IMPORTANT/>

AND... from the perspective of the child, a FK is nothing but a pointer.
Does a table HAVE TO HAVE a PK at all?  Not if it has no children.  Ergo,
what matters from the child's perspective is what the PK is all about, and
that is simply a POINTER back to the parent.  AND... Since a surrogate key
functions as a PK, then... A PK must be nothing but a pointer...

>  Don' agree with "never".  You want it as stable as possible, but there is
nothing in relational theory that says it can never change.  

And I don't give a rat's patuty about relational theory, I care about what
works.  Relational theory says you should normalize to 16 (IIRC) different
levels.  When was the last time you did that?  Do you even UNDERSTAND 9th
(or 12th, or 16th) normal form?  OK YOU probably do but I don't and don't
care that I don't.  So day in and day out you (or at least I) ignore
relational theory and yet you attempt to flaunt that same theory in the
great PK debate for what purpose???

Having the PK change is a no-no.  Sorry, it just is.  It causes a whole host
of issues and given that it doesn't HAVE to change....

>That is simply not true.  With relational theory, the PK is derived from
the pool of CK's, which is based on the data.

And I don't give a rat's patuty about relational theory, I care about what
works.

>Surrogates only work in computer systems because we have systems that are
reliable.

But we do, so what's your point?  Natural keys only work (in large systems)
because we have systems that are reliable.  Imagine trying to manually
correct 100 million records where something got corrupted.  No easier
(without a reliable system).

I have NEVER used paper and pencil in a database so who really cares.

>To call them a PK in regards to the relational model is simply incorrect.

OK, and who cares?  I simply do not care about the relational model (in this
debate), I care about what works efficiently.  I use what works (of the
model) and I abandon what is inefficient (natural keys).  Natural keys are
NOT A REQUIREMENT, you admit that yourself.

I still have a PK, it is just a surrogate.  If you accept that surrogates do
work then why does the insistence in relational model PK being a natural key
even matter?

You are nitpicking to death stuff that simply does not matter.  If you want
to insist that the PK HAS TO BE a natural key, then how can you say in the
next breath that 

"I don't object to the use of surrogate keys"

C'mon Jim, give it up.  I will sign a piece of paper stating whatever you
want about PKs and natural keys and relational theory, I simply DO NOT CARE.
What I care about is that they are inefficient and cause problems that
surrogate keys avoid.

Happy?

John W. Colby
Colby Consulting
www.ColbyConsulting.com 
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Jim Dettman
Sent: Wednesday, July 25, 2007 10:42 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Primary Key Best Practices

John,

<<1) The PK should be stable (NEVER changes).  >>

  Don' agree with "never".  You want it as stable as possible, but there is
nothing in relational theory that says it can never change.  

<<Again, the PK and the unique index are NOT the same structure, either in
real life or in theory.>>

  That is simply not true.  With relational theory, the PK is derived from
the pool of CK's, which is based on the data.  In theory you cannot have a
PK without it being a unique index.  That's part of the definition of what a
natural PK is.  Out of the pool of CK's, you would use the CK that is the
most stable, shortest in length, and the least complex (simple vs compound).

<<What works, and what is the best choice are different matters.  The
surrogate key avoids a whole slew of real life problems, and creates none
(in my experience).>>

  Surrogates only work in computer systems because we have systems that are
reliable.  If I gave you pen and paper and had you keep track of data using
the relational model, you'd be in a pretty big mess fairly quick if you used
surrogate keys.

  I don't object to the use of surrogate keys, but what I do object to is
the apples and oranges approach you use to claim what a PK is and surrogate
keys fit in the relational model.  Surrogate keys are nothing more then
pointers.  To call them a PK in regards to the relational model is simply
incorrect.

  Surrogate keys work well in computer systems, but the relational model can
be applied to much more then computer systems and the PK as you describe it
vs what it is in the relational model are not the same thing.

Jim.


-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Wednesday, July 25, 2007 9:18 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Primary Key Best Practices

Bruce, 

>Any table that does not have a natural primary key is not a pure 
>dataset,
...

No, Any table that does not have a natural primary key CANDIDATE is not a
pure dataset, ... 

A candidate key is exactly that, a field or set of fields that can serve the
purpose of uniquely identifying the data record, and thus becomes a
CANDIDATE to be the PK.  In fact you can have multiple candidates.  Nothing
says you actually have to USE the candidate as the PK.  That is again the
difference between the unique index and the PK.  One of the CANDIDATES is
used as the unique index, but a surrogate can still be used as the PK.

Take an employee table.  ONE candidate key could be first / last / phone,
another could be an employee number (a quasi-surrogate key by the way),
another could be the SSN.  Thus we have (at least) THREE CANDIDATE keys.  Do
you have to use all of them as the PK?  Obviously not.  Do you have to use
ANY of them as the PK?  Obviously not.  Throw in a true surrogate key - an
autoincrement - and you can use that as well.  So which you use simply
doesn't matter AS LONG AS the PK meets a few criteria.

1) The PK should be stable (NEVER changes).  

That eliminates the first / last / phone idea because the name could change
(I got married, I got tired of my name) AND the phone number could (and
probably will) change.  The SSN fails for the same reason.  SSNs CAN
(believe it or not) change.  Alien workers steal SSNs all the time.  Now
they become a citizen...  Oooops the SSN has to change.

2) Has no meaning.  From the perspective of the CHILD record, the FK is
nothing more than a pointer back to the parent record.  As such it should be
a behind the scenes, never directly viewed artificial construct designed to
do its job as efficiently as possible.

In fact, a surrogate key can (and DOES) uniquely identify a single row of
the table.  If it did not, then it could not be the PK.  As you know, you
cannot take a simple integer, make it a value of 1 for every record, and
make that the PK of the table.  It is the very fact that you make the field
autoincrement (or whatever) is what makes it possible to serve as the PK.
However a surrogate cannot be used for the unique index, the purpose of
preventing duplicates.  OTOH it does not need to, its sole purpose in live
is to act as a pointer, placed in child records to point back to itself.

Again, the PK and the unique index are NOT the same structure, either in
real life or in theory.  The purpose of the PK is to uniquely identify a
record so that it can be linked to a child record.  Nothing more, nothing
less.  You can do that with a natural key, and you can do that with a
surrogate.  The purpose of the unique index is to prevent entering the same
record in the database twice (data integrity).  Different function.  The
fact that the fields used in the unique index can also serve as a PK is
irrelevant.  It does NOT HAVE TO BE the primary key.

Think about normalization for a moment.  The basic concept behind
normalization is that ONLY information about a specific object is in the
table about that object.  Only bank info is in the bank table, only customer
info is in the customer table, only check info is in the check table.  One
goal of normalization should be to minimize the fluff around lineage.  Yes a
check was drawn by a specific customer, but that does not mean that we have
to have any specific piece of the customer table in the check table, it only
means that we have to be able to uniquely identify which customer wrote the
check.  So it simply does NOT MATTER whether we use a SURROGATE KEY from the
customer table, or the customer SSN (the very WORST choice), the first /last
/ eye color / hair color / phone / zip / and whatever else you might have
decided was the CANDIDATE KEY of the customer table.  Either one WORKS.  

What works, and what is the best choice are different matters.  The
surrogate key avoids a whole slew of real life problems, and creates none
(in my experience).

John W. Colby
Colby Consulting
www.ColbyConsulting.com
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Bruce Bruen
Sent: Wednesday, July 25, 2007 8:12 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] Primary Key Best Practices

On Wednesday 25 July 2007 02:52, jwcolby wrote:
> ROTFLMAO.
>
> The fit never passes.  It just subsides until the moon rises again.
>
>
> John W. Colby

Any table that does not have a natural primary key is not a pure dataset,
...


...


...

no, I'm not talking about natural v surrogate.


bruce
--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com

--
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com




More information about the AccessD mailing list