[AccessD] 2 quick questions

Tue May 31 14:51:34 CDT 2011

Wow... this almost looks like an email from me to JWC... did you start
drinking rational juice lately or something John?

<VBG.... ducking and running>

Drew

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Monday, May 30, 2011 8:12 PM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] 2 quick questions

Jim,

 >There is nothing stated in any paper on the relational model that says
a primary key cannot change.

First, I highly doubt that you have read every paper on the relational
model but I will just give 
you that one.  No contest.

Having said that, I am going to take a show of hands and ask:

Does anyone on this list actually care about this statement?  In order
for your hand to be counted 
as up, you need to respond saying something like:

Yes, papers on the relational model matter a great deal to me and I do
care that nothing in any 
paper says... etc etc.

Just for the record, my hand is not up.

Just for the record, I don't remember anyone saying that it *can't*
change, only that it causes 
enormous problems if it does change.  Not the *DATA* changing, but the
business rules defining the 
field or fields that define the PK.  And it can cause enormous problems.
Add one field to the 
required set of fields forming a natural PK and all hell breaks loose.

Jim, you need to be having this conversation (relation model) with your
academic friends.  They will 
all agree and, with nothing else to say, you can give it a rest.

I asked you for another name for the "autonumber surrogate key, better
damned well not call it the 
PK around Jim" object so that we could refer to the "autonumber
surrogate key, better damned well 
not call it the PK around Jim" object by whatever you desire so that we
could avoid this absolutely 
silly conversation yet another time.

So please, before you leave this thread, figure out what you want to
call this "autonumber surrogate 
key, better damned well not call it the PK around Jim" object and tell
us.

We (*almost* everyone except yourself) call the "autonumber surrogate
key, better damned well not 
call it the PK around Jim" object the PK because Microsoft calls it
that.  Click on the "autonumber 
surrogate key, better damned well not call it the PK around Jim" object
and click the PK little 
golden key tool and once Microsoft makes the "autonumber surrogate key,
better damned well not call 
it the PK around Jim" object the PK (they called it that), they then
refer to that as the PK.

We live and work in a Microsoft world.  In SQL Server the exact same
thing happens.  Click the 
little golden key tool on the "autonumber surrogate key, better damned
well not call it the PK 
around Jim" object and voila, MS calls it the PK.

*YOU* are the one with the problem.  *YOU* need to come up with another
name for the "autonumber 
surrogate key, better damned well not call it the PK around Jim" object.

Or better yet, just accept the fact that everyone except you and your
professor friends call it the 
PK, and leave it alone.

I for one am really, really *really* tired of the same old same old.

And I for one am telling you, that since Microsoft calls it the PK, so
do I.

Given the nature of Wikipedia I am not a huge fan of quoting them
however I am going to.

http://en.wikipedia.org/wiki/Surrogate_key

There are at least two definitions of a surrogate:

Surrogate (1) - Hall, Owlett and Codd (1976)
     A surrogate represents an entity in the outside world. The
surrogate is internally generated by 
the system but is nevertheless visible to the user or application.
Surrogate (2) - Wieringa and De Jonge (1991)
     A surrogate represents an object in the database itself. The
surrogate is internally generated 
by the system and is invisible to the user or application.

http://datawarehouse.ittoolbox.com/documents/surrogate-key-18121

A surrogate key is a substitution for the natural primary key.

It is just a unique identifier or number for each row that can be used
for the primary key to the 
table. The only requirement for a surrogate primary key is that it is
unique for each row in the table.

http://databases.about.com/cs/administration/g/primarykey.htm
http://www.agiledata.org/essays/keys.html

1. Common Key Terminology

Let's start by describing some common terminology pertaining to keys and
then work through an 
example.  These terms are:

     *      Key. A key is one or more data attributes that uniquely
identify an entity.  In a 
physical database a key would be formed of one or more table columns
whose value(s) uniquely 
identifies a row within a relational table.
     *      Composite key.  A key that is composed of two or more
attributes.
     *      Natural key.  A key that is formed of attributes that
already exist in the real world. 
For example, U.S. citizens are issued a Social Security Number (SSN)
that is unique to them (this 
isn't guaranteed to be true, but it's pretty darn close in practice).
SSN could be used as a 
natural key, assuming privacy laws allow it, for a Person entity
(assuming the scope of your 
organization is limited to the U.S.).
     *      Surrogate key.  A key with no business meaning.
     *      Candidate key.  An entity type in a logical data model will
have zero or more candidate 
keys, also referred to simply as unique identifiers (note: some people
don't believe in identifying 
candidate keys in LDMs, so there's no hard and fast rules).  For
example, if we only interact with 
American citizens then SSN is one candidate key for the Person entity
type and the combination of 
name and phone number (assuming the combination is unique) is
potentially a second candidate key. 
Both of these keys are called candidate keys because they are candidates
to be chosen as the primary 
key, an alternate key  or perhaps not even a key at all within a
physical data model.
     *      Primary key.  The preferred key for an entity type.
     *      Alternate key. Also known as a secondary key, is another
unique identifier of a row 
within a table.
     *      Foreign key. One or more attributes in an entity type that
represents a key, either 
primary or secondary, in another entity type.

I have to tell you Jim, that what I learned in the real world is
precisely the above set of terms.

Since every key uniquely identifies the row...

The PK is nothing more than the most convenient Key, or the key with the
fewest potential problems, 
actually selected by the database designer to represent the record.  The
word Primary (I assume) was 
selected because it is just that.

As I understand it, the surrogate key did not exist in the
(mathematical) relational model, and in 
fact, since it is not a real world property of the object being modeled,
is verboten in the 
relational model.

As if I care.

The point Jim is that with all due respects to academia, we live in the
real world and we end up 
with a vocabulary (and tools) that often conflicts with academia.
Surrogate keys were invented 
because natural keys were clumsy and a royal PITA and it is entirely
unnecessary to actually use a 
natural key as the PK (except in the mathematical relational model where
surrogate keys are verboten).

You may argue till the cows come home that if it weren't for reality we
wouldn't have surrogate 
keys, and I will say "reality is, and is where *I* live!".

Jim, you are tilting at windmills.  And yes I have read the book from
which that metaphor came, 
though I believe it is you that pointed out I still have to call myself
uneducated!

John W. Colby
www.ColbyConsulting.com

On 5/29/2011 8:18 PM, Jim Dettman wrote:
> <<Primary Key = Nothing a user can use for anything and is totally not
> dependant on data. For db system use only>>
>
>    Pure hogwash.  The very definition of a PK in the relational model
is
> exactly opposite of what you describe.
>
>    John's problem is exactly what I tried to point out on this list a
while
> back; there is a fundamental difference between a primary key and a
tag or a
> pointer.  John's problem is related to the meaning of the data itself;
how
> to determine that a row within the relation is unique.  In other
words, John
> is looking for the primary key; that combination of attributes which
> identifies a specific row and ensures that the row is not duplicated.
>
>    That is not the same thing as adding a unique tag or pointer on
every row
> (an auto number).  And despite popular belief, a auto number in
relational
> terms is not a surrogate key.
>
>    John needs to come up with a key based on the data that will ensure
that a
> row is unique.  If the field(s) used to do that cannot ensure that,
then
> more fields (attributes) need to be added to the table (relation).
>
>    For *performance* reasons in the DB, he will use an auto number as
a
> pointer (like all of us do), but in addition to that, he will need to
create
> a unique index based on the PK.
>
>    He might take the shortcut of making that index a hash, which for
> performance reasons may be required, but it's a bad idea because it
can lead
> to an update abnormalities.  If you can live with that, great and at
the end
> of the day, it may be required to get the job done.
>
>    Since it's only one index on the main table, I wouldn't take that
> shortcut.
>
>    And as far as the primary key not being able to change (as in a
name
> change or SS#), again total hogwash.  That's why relational DB's allow
for
> cascading updates.  There is nothing stated in any paper on the
relational
> model that says a primary key cannot change.
>
> Jim.
-- 
AccessD mailing list
AccessD at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/accessd
Website: http://www.databaseadvisors.com
The information contained in this transmission is intended only for the person or entity 
to which it is addressed and may contain II-VI Proprietary and/or II-VI Business 
Sensitive material. If you are not the intended recipient, please contact the sender 
immediately and destroy the material in its entirety, whether electronic or hard copy. 
You are notified that any review, retransmission, copying, disclosure, dissemination, 
or other use of, or taking of any action in reliance upon this information by persons 
or entities other than the intended recipient is prohibited.