jwcolby
jwcolby at colbyconsulting.com
Tue May 31 15:35:13 CDT 2011
> In other words, I was questioning myself. LOL, nice save. ;) John W. Colby www.ColbyConsulting.com On 5/31/2011 4:06 PM, Jim Dettman wrote: > John, > > My point and my only point was that the problem John was facing exactly > illustrated the differences of a true Pk and what everyone calls a PK. > > In regards to surrogates, the dictionary definition of a "surrogate" means > "to take the place of". > > If John adds an auto number to his table, is he not still faced with the > exact same problem; how to distinguish one patient from another? Yes. > Adding an auto number does nothing to solve that problem. If that auto > number is called "PatientID" and is given to the person, is it now a > surrogate? Yes. It has meaning and can no longer be reassigned at will. > That is a surrogate key and can function as a PK. > > <<Jim, you are tilting at windmills. And yes I have read the book from > which that metaphor came, > though I believe it is you that pointed out I still have to call myself > uneducated!>> > > Boy, you missed the point of that question (it was not a statement). I > wondered "which one of us uneducated?". The implication was that: > > 1. Either you because you care less about the theory and simply do things > "because it works". > > 2. Or myself because I care about the theory even though at the end of the > day, I still live in the real world and end up doing the same things as you > do. > > > In other words, I was questioning myself. > > Jim. > > > > -----Original Message----- > From: accessd-bounces at databaseadvisors.com > [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby > Sent: Monday, May 30, 2011 09:12 PM > To: Access Developers discussion and problem solving > Subject: Re: [AccessD] 2 quick questions > > Jim, > > >There is nothing stated in any paper on the relational model that says a > primary key cannot change. > > First, I highly doubt that you have read every paper on the relational model > but I will just give > you that one. No contest. > > Having said that, I am going to take a show of hands and ask: > > Does anyone on this list actually care about this statement? In order for > your hand to be counted > as up, you need to respond saying something like: > > Yes, papers on the relational model matter a great deal to me and I do care > that nothing in any > paper says... etc etc. > > Just for the record, my hand is not up. > > Just for the record, I don't remember anyone saying that it *can't* change, > only that it causes > enormous problems if it does change. Not the *DATA* changing, but the > business rules defining the > field or fields that define the PK. And it can cause enormous problems. > Add one field to the > required set of fields forming a natural PK and all hell breaks loose. > > Jim, you need to be having this conversation (relation model) with your > academic friends. They will > all agree and, with nothing else to say, you can give it a rest. > > I asked you for another name for the "autonumber surrogate key, better > damned well not call it the > PK around Jim" object so that we could refer to the "autonumber surrogate > key, better damned well > not call it the PK around Jim" object by whatever you desire so that we > could avoid this absolutely > silly conversation yet another time. > > So please, before you leave this thread, figure out what you want to call > this "autonumber surrogate > key, better damned well not call it the PK around Jim" object and tell us. > > We (*almost* everyone except yourself) call the "autonumber surrogate key, > better damned well not > call it the PK around Jim" object the PK because Microsoft calls it that. > Click on the "autonumber > surrogate key, better damned well not call it the PK around Jim" object and > click the PK little > golden key tool and once Microsoft makes the "autonumber surrogate key, > better damned well not call > it the PK around Jim" object the PK (they called it that), they then refer > to that as the PK. > > We live and work in a Microsoft world. In SQL Server the exact same thing > happens. Click the > little golden key tool on the "autonumber surrogate key, better damned well > not call it the PK > around Jim" object and voila, MS calls it the PK. > > *YOU* are the one with the problem. *YOU* need to come up with another name > for the "autonumber > surrogate key, better damned well not call it the PK around Jim" object. > > Or better yet, just accept the fact that everyone except you and your > professor friends call it the > PK, and leave it alone. > > I for one am really, really *really* tired of the same old same old. > > And I for one am telling you, that since Microsoft calls it the PK, so do I. > > Given the nature of Wikipedia I am not a huge fan of quoting them however I > am going to. > > http://en.wikipedia.org/wiki/Surrogate_key > > There are at least two definitions of a surrogate: > > Surrogate (1) - Hall, Owlett and Codd (1976) > A surrogate represents an entity in the outside world. The surrogate is > internally generated by > the system but is nevertheless visible to the user or application. > Surrogate (2) - Wieringa and De Jonge (1991) > A surrogate represents an object in the database itself. The surrogate > is internally generated > by the system and is invisible to the user or application. > > http://datawarehouse.ittoolbox.com/documents/surrogate-key-18121 > > A surrogate key is a substitution for the natural primary key. > > It is just a unique identifier or number for each row that can be used for > the primary key to the > table. The only requirement for a surrogate primary key is that it is unique > for each row in the table. > > http://databases.about.com/cs/administration/g/primarykey.htm > http://www.agiledata.org/essays/keys.html > > 1. Common Key Terminology > > Let's start by describing some common terminology pertaining to keys and > then work through an > example. These terms are: > > * Key. A key is one or more data attributes that uniquely identify > an entity. In a > physical database a key would be formed of one or more table columns whose > value(s) uniquely > identifies a row within a relational table. > * Composite key. A key that is composed of two or more > attributes. > * Natural key. A key that is formed of attributes that already > exist in the real world. > For example, U.S. citizens are issued a Social Security Number (SSN) that > is unique to them (this > isn't guaranteed to be true, but it's pretty darn close in practice). SSN > could be used as a > natural key, assuming privacy laws allow it, for a Person entity (assuming > the scope of your > organization is limited to the U.S.). > * Surrogate key. A key with no business meaning. > * Candidate key. An entity type in a logical data model will have > zero or more candidate > keys, also referred to simply as unique identifiers (note: some people don't > believe in identifying > candidate keys in LDMs, so there's no hard and fast rules). For example, if > we only interact with > American citizens then SSN is one candidate key for the Person entity type > and the combination of > name and phone number (assuming the combination is unique) is potentially a > second candidate key. > Both of these keys are called candidate keys because they are candidates to > be chosen as the primary > key, an alternate key or perhaps not even a key at all within a physical > data model. > * Primary key. The preferred key for an entity type. > * Alternate key. Also known as a secondary key, is another unique > identifier of a row > within a table. > * Foreign key. One or more attributes in an entity type that > represents a key, either > primary or secondary, in another entity type. > > I have to tell you Jim, that what I learned in the real world is precisely > the above set of terms. > > Since every key uniquely identifies the row... > > The PK is nothing more than the most convenient Key, or the key with the > fewest potential problems, > actually selected by the database designer to represent the record. The > word Primary (I assume) was > selected because it is just that. > > As I understand it, the surrogate key did not exist in the (mathematical) > relational model, and in > fact, since it is not a real world property of the object being modeled, is > verboten in the > relational model. > > As if I care. > > The point Jim is that with all due respects to academia, we live in the real > world and we end up > with a vocabulary (and tools) that often conflicts with academia. Surrogate > keys were invented > because natural keys were clumsy and a royal PITA and it is entirely > unnecessary to actually use a > natural key as the PK (except in the mathematical relational model where > surrogate keys are verboten). > > You may argue till the cows come home that if it weren't for reality we > wouldn't have surrogate > keys, and I will say "reality is, and is where *I* live!". > > Jim, you are tilting at windmills. And yes I have read the book from which > that metaphor came, > though I believe it is you that pointed out I still have to call myself > uneducated! > > John W. Colby > www.ColbyConsulting.com > > On 5/29/2011 8:18 PM, Jim Dettman wrote: >> <<Primary Key = Nothing a user can use for anything and is totally not >> dependant on data. For db system use only>> >> >> Pure hogwash. The very definition of a PK in the relational model is >> exactly opposite of what you describe. >> >> John's problem is exactly what I tried to point out on this list a > while >> back; there is a fundamental difference between a primary key and a tag or > a >> pointer. John's problem is related to the meaning of the data itself; how >> to determine that a row within the relation is unique. In other words, > John >> is looking for the primary key; that combination of attributes which >> identifies a specific row and ensures that the row is not duplicated. >> >> That is not the same thing as adding a unique tag or pointer on every > row >> (an auto number). And despite popular belief, a auto number in relational >> terms is not a surrogate key. >> >> John needs to come up with a key based on the data that will ensure > that a >> row is unique. If the field(s) used to do that cannot ensure that, then >> more fields (attributes) need to be added to the table (relation). >> >> For *performance* reasons in the DB, he will use an auto number as a >> pointer (like all of us do), but in addition to that, he will need to > create >> a unique index based on the PK. >> >> He might take the shortcut of making that index a hash, which for >> performance reasons may be required, but it's a bad idea because it can > lead >> to an update abnormalities. If you can live with that, great and at the > end >> of the day, it may be required to get the job done. >> >> Since it's only one index on the main table, I wouldn't take that >> shortcut. >> >> And as far as the primary key not being able to change (as in a name >> change or SS#), again total hogwash. That's why relational DB's allow for >> cascading updates. There is nothing stated in any paper on the relational >> model that says a primary key cannot change. >> >> Jim.