[AccessD] [Spam]10.00 Re: 2 quick questions

Wed Jun 1 11:50:48 CDT 2011

Hurray!

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Monday, May 30, 2011 6:12 PM
To: Access Developers discussion and problem solving
Subject: [Spam]10.00 Re: [AccessD] 2 quick questions

Jim,

 >There is nothing stated in any paper on the relational model that says a
primary key cannot change.

First, I highly doubt that you have read every paper on the relational model
but I will just give 
you that one.  No contest.

Having said that, I am going to take a show of hands and ask:

Does anyone on this list actually care about this statement?  In order for
your hand to be counted 
as up, you need to respond saying something like:

Yes, papers on the relational model matter a great deal to me and I do care
that nothing in any 
paper says... etc etc.

Just for the record, my hand is not up.

Just for the record, I don't remember anyone saying that it *can't* change,
only that it causes 
enormous problems if it does change.  Not the *DATA* changing, but the
business rules defining the 
field or fields that define the PK.  And it can cause enormous problems.
Add one field to the 
required set of fields forming a natural PK and all hell breaks loose.

Jim, you need to be having this conversation (relation model) with your
academic friends.  They will 
all agree and, with nothing else to say, you can give it a rest.

I asked you for another name for the "autonumber surrogate key, better
damned well not call it the 
PK around Jim" object so that we could refer to the "autonumber surrogate
key, better damned well 
not call it the PK around Jim" object by whatever you desire so that we
could avoid this absolutely 
silly conversation yet another time.

So please, before you leave this thread, figure out what you want to call
this "autonumber surrogate 
key, better damned well not call it the PK around Jim" object and tell us.

We (*almost* everyone except yourself) call the "autonumber surrogate key,
better damned well not 
call it the PK around Jim" object the PK because Microsoft calls it that.
Click on the "autonumber 
surrogate key, better damned well not call it the PK around Jim" object and
click the PK little 
golden key tool and once Microsoft makes the "autonumber surrogate key,
better damned well not call 
it the PK around Jim" object the PK (they called it that), they then refer
to that as the PK.

We live and work in a Microsoft world.  In SQL Server the exact same thing
happens.  Click the 
little golden key tool on the "autonumber surrogate key, better damned well
not call it the PK 
around Jim" object and voila, MS calls it the PK.

*YOU* are the one with the problem.  *YOU* need to come up with another name
for the "autonumber 
surrogate key, better damned well not call it the PK around Jim" object.

Or better yet, just accept the fact that everyone except you and your
professor friends call it the 
PK, and leave it alone.

I for one am really, really *really* tired of the same old same old.

And I for one am telling you, that since Microsoft calls it the PK, so do I.

Given the nature of Wikipedia I am not a huge fan of quoting them however I
am going to.

http://en.wikipedia.org/wiki/Surrogate_key

There are at least two definitions of a surrogate:

Surrogate (1) - Hall, Owlett and Codd (1976)
     A surrogate represents an entity in the outside world. The surrogate is
internally generated by 
the system but is nevertheless visible to the user or application.
Surrogate (2) - Wieringa and De Jonge (1991)
     A surrogate represents an object in the database itself. The surrogate
is internally generated 
by the system and is invisible to the user or application.

http://datawarehouse.ittoolbox.com/documents/surrogate-key-18121

A surrogate key is a substitution for the natural primary key.

It is just a unique identifier or number for each row that can be used for
the primary key to the 
table. The only requirement for a surrogate primary key is that it is unique
for each row in the table.

http://databases.about.com/cs/administration/g/primarykey.htm
http://www.agiledata.org/essays/keys.html

1. Common Key Terminology

Let's start by describing some common terminology pertaining to keys and
then work through an 
example.  These terms are:

     *      Key. A key is one or more data attributes that uniquely identify
an entity.  In a 
physical database a key would be formed of one or more table columns whose
value(s) uniquely 
identifies a row within a relational table.
     *      Composite key.  A key that is composed of two or more
attributes.
     *      Natural key.  A key that is formed of attributes that already
exist in the real world. 
For example, U.S. citizens are issued a Social Security Number (SSN)  that
is unique to them (this 
isn't guaranteed to be true, but it's pretty darn close in practice).  SSN
could be used as a 
natural key, assuming privacy laws allow it, for a Person entity (assuming
the scope of your 
organization is limited to the U.S.).
     *      Surrogate key.  A key with no business meaning.
     *      Candidate key.  An entity type in a logical data model will have
zero or more candidate 
keys, also referred to simply as unique identifiers (note: some people don't
believe in identifying 
candidate keys in LDMs, so there's no hard and fast rules).  For example, if
we only interact with 
American citizens then SSN is one candidate key for the Person entity type
and the combination of 
name and phone number (assuming the combination is unique) is potentially a
second candidate key. 
Both of these keys are called candidate keys because they are candidates to
be chosen as the primary 
key, an alternate key  or perhaps not even a key at all within a physical
data model.
     *      Primary key.  The preferred key for an entity type.
     *      Alternate key. Also known as a secondary key, is another unique
identifier of a row 
within a table.
     *      Foreign key. One or more attributes in an entity type that
represents a key, either 
primary or secondary, in another entity type.

I have to tell you Jim, that what I learned in the real world is precisely
the above set of terms.

Since every key uniquely identifies the row...

The PK is nothing more than the most convenient Key, or the key with the
fewest potential problems, 
actually selected by the database designer to represent the record.  The
word Primary (I assume) was 
selected because it is just that.

As I understand it, the surrogate key did not exist in the (mathematical)
relational model, and in 
fact, since it is not a real world property of the object being modeled, is
verboten in the 
relational model.

As if I care.

The point Jim is that with all due respects to academia, we live in the real
world and we end up 
with a vocabulary (and tools) that often conflicts with academia.  Surrogate
keys were invented 
because natural keys were clumsy and a royal PITA and it is entirely
unnecessary to actually use a 
natural key as the PK (except in the mathematical relational model where
surrogate keys are verboten).

You may argue till the cows come home that if it weren't for reality we
wouldn't have surrogate 
keys, and I will say "reality is, and is where *I* live!".

Jim, you are tilting at windmills.  And yes I have read the book from which
that metaphor came, 
though I believe it is you that pointed out I still have to call myself
uneducated!

John W. Colby
www.ColbyConsulting.com

On 5/29/2011 8:18 PM, Jim Dettman wrote:
> <<Primary Key = Nothing a user can use for anything and is totally not
> dependant on data. For db system use only>>
>
>    Pure hogwash.  The very definition of a PK in the relational model is
> exactly opposite of what you describe.
>
>    John's problem is exactly what I tried to point out on this list a
while
> back; there is a fundamental difference between a primary key and a tag or
a
> pointer.  John's problem is related to the meaning of the data itself; how
> to determine that a row within the relation is unique.  In other words,
John
> is looking for the primary key; that combination of attributes which
> identifies a specific row and ensures that the row is not duplicated.
>
>    That is not the same thing as adding a unique tag or pointer on every
row
> (an auto number).  And despite popular belief, a auto number in relational
> terms is not a surrogate key.
>
>    John needs to come up with a key based on the data that will ensure
that a
> row is unique.  If the field(s) used to do that cannot ensure that, then
> more fields (attributes) need to be added to the table (relation).
>
>    For *performance* reasons in the DB, he will use an auto number as a
> pointer (like all of us do), but in addition to that, he will need to
create
> a unique index based on the PK.
>
>    He might take the shortcut of making that index a hash, which for
> performance reasons may be required, but it's a bad idea because it can
lead
> to an update abnormalities.  If you can live with that, great and at the
end
> of the day, it may be required to get the job done.
>
>    Since it's only one index on the main table, I wouldn't take that
> shortcut.
>
>    And as far as the primary key not being able to change (as in a name
> change or SS#), again total hogwash.  That's why relational DB's allow for
> cascading updates.  There is nothing stated in any paper on the relational
> model that says a primary key cannot change.
>
> Jim.
-- 
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com