Jim Dettman
jimdettman at verizon.net
Wed Jul 25 11:44:17 CDT 2007
John, <<You accept that a PK can be a surrogate, you state that a PK is nothing but a pointer, therefore you accept (and state) that a PK is nothing but a pointer.>> No I don't. A surrogate key is not a primary key in terms of the model. It has no relation to the data what so ever. I agree it's a pointer, but nothing more then that. <<AND... from the perspective of the child, a FK is nothing but a pointer. Does a table HAVE TO HAVE a PK at all? Not if it has no children. >> Really? I don't have to have any method of ensuring that each row in a table is unique? <<And I don't give a rat's patuty about relational theory,>> Yeah, I know you don't as we have been down this road a few times before<g> (something about pig's teats comes to mind). Personally I like to fundamentally understand something as much as possible. It helps me apply things to other areas. <<You are nitpicking to death stuff that simply does not matter. If you want to insist that the PK HAS TO BE a natural key, then how can you say in the next breath that "I don't object to the use of surrogate keys">> Because a "primary key" and a "key" don't mean the same things to me. One denotes something that is connected to the data. The other could be anything. As for using a surrogate key, I don't object to them and I use them all the time. But I think it's important to understand that they are a short cut. In this case, the benefits are many and the ramifications few. Does it matter? In a day to day context no. But in understanding how data can be modeled certainly yes. Jim. -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby Sent: Wednesday, July 25, 2007 11:19 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Primary Key Best Practices Jim, >Surrogate keys are nothing more then pointers. Correct!!!!!!!!!!!!!!!!!!!!!!!!!! <IMPORTANT> This is a logical construct Jim. You accept that a PK can be a surrogate, you state that a PK is nothing but a pointer, therefore you accept (and state) that a PK is nothing but a pointer. <IMPORTANT/> AND... from the perspective of the child, a FK is nothing but a pointer. Does a table HAVE TO HAVE a PK at all? Not if it has no children. Ergo, what matters from the child's perspective is what the PK is all about, and that is simply a POINTER back to the parent. AND... Since a surrogate key functions as a PK, then... A PK must be nothing but a pointer... > Don' agree with "never". You want it as stable as possible, but there is nothing in relational theory that says it can never change. And I don't give a rat's patuty about relational theory, I care about what works. Relational theory says you should normalize to 16 (IIRC) different levels. When was the last time you did that? Do you even UNDERSTAND 9th (or 12th, or 16th) normal form? OK YOU probably do but I don't and don't care that I don't. So day in and day out you (or at least I) ignore relational theory and yet you attempt to flaunt that same theory in the great PK debate for what purpose??? Having the PK change is a no-no. Sorry, it just is. It causes a whole host of issues and given that it doesn't HAVE to change.... >That is simply not true. With relational theory, the PK is derived from the pool of CK's, which is based on the data. And I don't give a rat's patuty about relational theory, I care about what works. >Surrogates only work in computer systems because we have systems that are reliable. But we do, so what's your point? Natural keys only work (in large systems) because we have systems that are reliable. Imagine trying to manually correct 100 million records where something got corrupted. No easier (without a reliable system). I have NEVER used paper and pencil in a database so who really cares. >To call them a PK in regards to the relational model is simply incorrect. OK, and who cares? I simply do not care about the relational model (in this debate), I care about what works efficiently. I use what works (of the model) and I abandon what is inefficient (natural keys). Natural keys are NOT A REQUIREMENT, you admit that yourself. I still have a PK, it is just a surrogate. If you accept that surrogates do work then why does the insistence in relational model PK being a natural key even matter? You are nitpicking to death stuff that simply does not matter. If you want to insist that the PK HAS TO BE a natural key, then how can you say in the next breath that "I don't object to the use of surrogate keys" C'mon Jim, give it up. I will sign a piece of paper stating whatever you want about PKs and natural keys and relational theory, I simply DO NOT CARE. What I care about is that they are inefficient and cause problems that surrogate keys avoid. Happy? John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Jim Dettman Sent: Wednesday, July 25, 2007 10:42 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Primary Key Best Practices John, <<1) The PK should be stable (NEVER changes). >> Don' agree with "never". You want it as stable as possible, but there is nothing in relational theory that says it can never change. <<Again, the PK and the unique index are NOT the same structure, either in real life or in theory.>> That is simply not true. With relational theory, the PK is derived from the pool of CK's, which is based on the data. In theory you cannot have a PK without it being a unique index. That's part of the definition of what a natural PK is. Out of the pool of CK's, you would use the CK that is the most stable, shortest in length, and the least complex (simple vs compound). <<What works, and what is the best choice are different matters. The surrogate key avoids a whole slew of real life problems, and creates none (in my experience).>> Surrogates only work in computer systems because we have systems that are reliable. If I gave you pen and paper and had you keep track of data using the relational model, you'd be in a pretty big mess fairly quick if you used surrogate keys. I don't object to the use of surrogate keys, but what I do object to is the apples and oranges approach you use to claim what a PK is and surrogate keys fit in the relational model. Surrogate keys are nothing more then pointers. To call them a PK in regards to the relational model is simply incorrect. Surrogate keys work well in computer systems, but the relational model can be applied to much more then computer systems and the PK as you describe it vs what it is in the relational model are not the same thing. Jim. -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby Sent: Wednesday, July 25, 2007 9:18 AM To: 'Access Developers discussion and problem solving' Subject: Re: [AccessD] Primary Key Best Practices Bruce, >Any table that does not have a natural primary key is not a pure >dataset, ... No, Any table that does not have a natural primary key CANDIDATE is not a pure dataset, ... A candidate key is exactly that, a field or set of fields that can serve the purpose of uniquely identifying the data record, and thus becomes a CANDIDATE to be the PK. In fact you can have multiple candidates. Nothing says you actually have to USE the candidate as the PK. That is again the difference between the unique index and the PK. One of the CANDIDATES is used as the unique index, but a surrogate can still be used as the PK. Take an employee table. ONE candidate key could be first / last / phone, another could be an employee number (a quasi-surrogate key by the way), another could be the SSN. Thus we have (at least) THREE CANDIDATE keys. Do you have to use all of them as the PK? Obviously not. Do you have to use ANY of them as the PK? Obviously not. Throw in a true surrogate key - an autoincrement - and you can use that as well. So which you use simply doesn't matter AS LONG AS the PK meets a few criteria. 1) The PK should be stable (NEVER changes). That eliminates the first / last / phone idea because the name could change (I got married, I got tired of my name) AND the phone number could (and probably will) change. The SSN fails for the same reason. SSNs CAN (believe it or not) change. Alien workers steal SSNs all the time. Now they become a citizen... Oooops the SSN has to change. 2) Has no meaning. From the perspective of the CHILD record, the FK is nothing more than a pointer back to the parent record. As such it should be a behind the scenes, never directly viewed artificial construct designed to do its job as efficiently as possible. In fact, a surrogate key can (and DOES) uniquely identify a single row of the table. If it did not, then it could not be the PK. As you know, you cannot take a simple integer, make it a value of 1 for every record, and make that the PK of the table. It is the very fact that you make the field autoincrement (or whatever) is what makes it possible to serve as the PK. However a surrogate cannot be used for the unique index, the purpose of preventing duplicates. OTOH it does not need to, its sole purpose in live is to act as a pointer, placed in child records to point back to itself. Again, the PK and the unique index are NOT the same structure, either in real life or in theory. The purpose of the PK is to uniquely identify a record so that it can be linked to a child record. Nothing more, nothing less. You can do that with a natural key, and you can do that with a surrogate. The purpose of the unique index is to prevent entering the same record in the database twice (data integrity). Different function. The fact that the fields used in the unique index can also serve as a PK is irrelevant. It does NOT HAVE TO BE the primary key. Think about normalization for a moment. The basic concept behind normalization is that ONLY information about a specific object is in the table about that object. Only bank info is in the bank table, only customer info is in the customer table, only check info is in the check table. One goal of normalization should be to minimize the fluff around lineage. Yes a check was drawn by a specific customer, but that does not mean that we have to have any specific piece of the customer table in the check table, it only means that we have to be able to uniquely identify which customer wrote the check. So it simply does NOT MATTER whether we use a SURROGATE KEY from the customer table, or the customer SSN (the very WORST choice), the first /last / eye color / hair color / phone / zip / and whatever else you might have decided was the CANDIDATE KEY of the customer table. Either one WORKS. What works, and what is the best choice are different matters. The surrogate key avoids a whole slew of real life problems, and creates none (in my experience). John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Bruce Bruen Sent: Wednesday, July 25, 2007 8:12 AM To: Access Developers discussion and problem solving Subject: Re: [AccessD] Primary Key Best Practices On Wednesday 25 July 2007 02:52, jwcolby wrote: > ROTFLMAO. > > The fit never passes. It just subsides until the moon rises again. > > > John W. Colby Any table that does not have a natural primary key is not a pure dataset, ... ... ... no, I'm not talking about natural v surrogate. bruce -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com