Drew Wutka
DWUTKA at Marlow.com
Tue May 31 14:51:34 CDT 2011
Wow... this almost looks like an email from me to JWC... did you start drinking rational juice lately or something John? <VBG.... ducking and running> Drew -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby Sent: Monday, May 30, 2011 8:12 PM To: Access Developers discussion and problem solving Subject: Re: [AccessD] 2 quick questions Jim, >There is nothing stated in any paper on the relational model that says a primary key cannot change. First, I highly doubt that you have read every paper on the relational model but I will just give you that one. No contest. Having said that, I am going to take a show of hands and ask: Does anyone on this list actually care about this statement? In order for your hand to be counted as up, you need to respond saying something like: Yes, papers on the relational model matter a great deal to me and I do care that nothing in any paper says... etc etc. Just for the record, my hand is not up. Just for the record, I don't remember anyone saying that it *can't* change, only that it causes enormous problems if it does change. Not the *DATA* changing, but the business rules defining the field or fields that define the PK. And it can cause enormous problems. Add one field to the required set of fields forming a natural PK and all hell breaks loose. Jim, you need to be having this conversation (relation model) with your academic friends. They will all agree and, with nothing else to say, you can give it a rest. I asked you for another name for the "autonumber surrogate key, better damned well not call it the PK around Jim" object so that we could refer to the "autonumber surrogate key, better damned well not call it the PK around Jim" object by whatever you desire so that we could avoid this absolutely silly conversation yet another time. So please, before you leave this thread, figure out what you want to call this "autonumber surrogate key, better damned well not call it the PK around Jim" object and tell us. We (*almost* everyone except yourself) call the "autonumber surrogate key, better damned well not call it the PK around Jim" object the PK because Microsoft calls it that. Click on the "autonumber surrogate key, better damned well not call it the PK around Jim" object and click the PK little golden key tool and once Microsoft makes the "autonumber surrogate key, better damned well not call it the PK around Jim" object the PK (they called it that), they then refer to that as the PK. We live and work in a Microsoft world. In SQL Server the exact same thing happens. Click the little golden key tool on the "autonumber surrogate key, better damned well not call it the PK around Jim" object and voila, MS calls it the PK. *YOU* are the one with the problem. *YOU* need to come up with another name for the "autonumber surrogate key, better damned well not call it the PK around Jim" object. Or better yet, just accept the fact that everyone except you and your professor friends call it the PK, and leave it alone. I for one am really, really *really* tired of the same old same old. And I for one am telling you, that since Microsoft calls it the PK, so do I. Given the nature of Wikipedia I am not a huge fan of quoting them however I am going to. http://en.wikipedia.org/wiki/Surrogate_key There are at least two definitions of a surrogate: Surrogate (1) - Hall, Owlett and Codd (1976) A surrogate represents an entity in the outside world. The surrogate is internally generated by the system but is nevertheless visible to the user or application. Surrogate (2) - Wieringa and De Jonge (1991) A surrogate represents an object in the database itself. The surrogate is internally generated by the system and is invisible to the user or application. http://datawarehouse.ittoolbox.com/documents/surrogate-key-18121 A surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. http://databases.about.com/cs/administration/g/primarykey.htm http://www.agiledata.org/essays/keys.html 1. Common Key Terminology Let's start by describing some common terminology pertaining to keys and then work through an example. These terms are: * Key. A key is one or more data attributes that uniquely identify an entity. In a physical database a key would be formed of one or more table columns whose value(s) uniquely identifies a row within a relational table. * Composite key. A key that is composed of two or more attributes. * Natural key. A key that is formed of attributes that already exist in the real world. For example, U.S. citizens are issued a Social Security Number (SSN) that is unique to them (this isn't guaranteed to be true, but it's pretty darn close in practice). SSN could be used as a natural key, assuming privacy laws allow it, for a Person entity (assuming the scope of your organization is limited to the U.S.). * Surrogate key. A key with no business meaning. * Candidate key. An entity type in a logical data model will have zero or more candidate keys, also referred to simply as unique identifiers (note: some people don't believe in identifying candidate keys in LDMs, so there's no hard and fast rules). For example, if we only interact with American citizens then SSN is one candidate key for the Person entity type and the combination of name and phone number (assuming the combination is unique) is potentially a second candidate key. Both of these keys are called candidate keys because they are candidates to be chosen as the primary key, an alternate key or perhaps not even a key at all within a physical data model. * Primary key. The preferred key for an entity type. * Alternate key. Also known as a secondary key, is another unique identifier of a row within a table. * Foreign key. One or more attributes in an entity type that represents a key, either primary or secondary, in another entity type. I have to tell you Jim, that what I learned in the real world is precisely the above set of terms. Since every key uniquely identifies the row... The PK is nothing more than the most convenient Key, or the key with the fewest potential problems, actually selected by the database designer to represent the record. The word Primary (I assume) was selected because it is just that. As I understand it, the surrogate key did not exist in the (mathematical) relational model, and in fact, since it is not a real world property of the object being modeled, is verboten in the relational model. As if I care. The point Jim is that with all due respects to academia, we live in the real world and we end up with a vocabulary (and tools) that often conflicts with academia. Surrogate keys were invented because natural keys were clumsy and a royal PITA and it is entirely unnecessary to actually use a natural key as the PK (except in the mathematical relational model where surrogate keys are verboten). You may argue till the cows come home that if it weren't for reality we wouldn't have surrogate keys, and I will say "reality is, and is where *I* live!". Jim, you are tilting at windmills. And yes I have read the book from which that metaphor came, though I believe it is you that pointed out I still have to call myself uneducated! John W. Colby www.ColbyConsulting.com On 5/29/2011 8:18 PM, Jim Dettman wrote: > <<Primary Key = Nothing a user can use for anything and is totally not > dependant on data. For db system use only>> > > Pure hogwash. The very definition of a PK in the relational model is > exactly opposite of what you describe. > > John's problem is exactly what I tried to point out on this list a while > back; there is a fundamental difference between a primary key and a tag or a > pointer. John's problem is related to the meaning of the data itself; how > to determine that a row within the relation is unique. In other words, John > is looking for the primary key; that combination of attributes which > identifies a specific row and ensures that the row is not duplicated. > > That is not the same thing as adding a unique tag or pointer on every row > (an auto number). And despite popular belief, a auto number in relational > terms is not a surrogate key. > > John needs to come up with a key based on the data that will ensure that a > row is unique. If the field(s) used to do that cannot ensure that, then > more fields (attributes) need to be added to the table (relation). > > For *performance* reasons in the DB, he will use an auto number as a > pointer (like all of us do), but in addition to that, he will need to create > a unique index based on the PK. > > He might take the shortcut of making that index a hash, which for > performance reasons may be required, but it's a bad idea because it can lead > to an update abnormalities. If you can live with that, great and at the end > of the day, it may be required to get the job done. > > Since it's only one index on the main table, I wouldn't take that > shortcut. > > And as far as the primary key not being able to change (as in a name > change or SS#), again total hogwash. That's why relational DB's allow for > cascading updates. There is nothing stated in any paper on the relational > model that says a primary key cannot change. > > Jim. -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com The information contained in this transmission is intended only for the person or entity to which it is addressed and may contain II-VI Proprietary and/or II-VI Business Sensitive material. If you are not the intended recipient, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. You are notified that any review, retransmission, copying, disclosure, dissemination, or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited.