[dba-SQLServer] Clustered Index Debate - Was: RE: Primary keys and entry points

Fri Jun 26 12:05:03 CDT 2009

>> copy from AccessD:
---

Hi Shamil

Yes, let's keep it here for a while - at least I cannot provide much more useful input as I don't have experience with SQL Server and large databases.

As you mention, one could choose to let the ADO.NET EF do the work and let it design the database and leave to it the decisions for building indexes and keys. However, I'm not sure I'm prepared to go that far until the EF is a bit more mature.

Thanks for all the links. I'll study them at a later occasion, time is too limited currently.

/gustav

>>> shamil at smsconsulting.spb.ru 26-06-2009 17:10 >>>
Hi Gustav,

Yes, SQL Server table can have no more than one clustered index (see links
below).
And "if the table has a clustered index, or the index is on an indexed view,
the row locator is the clustered index key for the row."
The latter means (AFAIU feel free to correct me if you suppose I'm wrong)
that if we introduce surrogate key additionally to clustered index created
on natural (primary key) columns and if we also create an index on surrogate
key column (to e.g. speed-up joins execution one may suppose) then clustered
index will anyway be involved in sql selection operations to lookup actual
data rows - and that will result in joins of large data tables using
surrogate keys would run (considerably) slower than joins on tables without
surrogate keys. And not even joins: if you have a surrogate
Autonumber/Identity clustered primary key/index on a table (that's what we
usually have most of the time currenlty?) plus a natural (compound) (unique)
key/index then using where expression on the columns from the natural
key/index will involve two indexes on sql execution time, and if result set
is large then execution time will be bad...

Well, the above doesn't yet sound clear for me therefore I'd propose to have
this discussion kept here for some time to find all "pros" and contra" for
this or that approach. As e.g. it's done here on "the GUIDs as primary key"
subject:

"Is it a bad idea to use GUIDs as primary keys in MS SQL?"
http://stackoverflow.com/questions/537145/is-it-a-bad-idea-to-use-guids-as-primary-keys-in-ms-sql

<<<<<<<<<<<<<<<<<<<<
GUID Pros
* Unique across every table, every database, every server
* Allows easy merging of records from different databases
* Allows easy distribution of databases across multiple servers
* You can generate IDs anywhere, instead of having to roundtrip to the database
* Most replication scenarios require GUID columns anyway 

GUID Cons

* It is a whopping 4 times larger than the traditional 4-byte index value;
this can have serious performance and storage implications if you're not careful
* Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')
* The generated GUIDs should be partially sequential for best performance...
>>>>>>>>>>>>>>>>>>>

I mean I expect this thread will finally come to similar "Pros" and "Cons"
list - in fact that would be two lists:

1)
Natural Primary Keys with Clustered Index and No Surrogate Keys - Pros

Natural Primary Keys with Clustered Index and No Surrogate Keys - Contra

2)
Surrogate Primary Keys with Clustered index - Pros

Surrogate Primary Keys with Clustered index - Contra

As usual the "truth should be somewhere in between" - what I wanted to
achieve is to stop using surrogate keys almost "religiously" in all and
every table  as I do now - I think I can stop doing that now if I will use
ADO.NET Entity Framework as my main data access/manipulation technology
because ADO.NET Entity Framework will take care for me to "write/generate
lengthy join expressions on-the-fly (under the hood)" as well as it (ADO.NET
Entity Framework) will let me to not rewrite my join expressions in the case
of the data model changes/reorganization.

Thank you.

--
Shamil

P.S. Here are some links related to the subject of this thread:

---
The ADO.NET Entity Framework Overview
http://msdn.microsoft.com/en-us/library/aa697427(VS.80).aspx 

---
Table and Index Organization
http://msdn.microsoft.com/en-us/library/ms189051.aspx 

Tables and Index Data Structures Architecture
http://msdn.microsoft.com/en-us/library/ms180978.aspx 

Logical and Physical Operators Reference
http://msdn.microsoft.com/en-us/library/ms191158.aspx 

--
Clustered Index Structures
http://msdn.microsoft.com/en-us/library/ms177443.aspx 

"If the table has a clustered index, or the index is on an indexed view, the
row locator is the clustered index key for the row. "

Nonclustered Index Structures
http://msdn.microsoft.com/en-us/library/ms177484.aspx 

Heap Structures
http://msdn.microsoft.com/en-us/library/ms188270.aspx 

SQL Statement Processing
http://msdn.microsoft.com/en-us/library/ms190623.aspx 

---
Is it a bad idea to use GUIDs as primary keys in MS SQL?
http://stackoverflow.com/questions/537145/is-it-a-bad-idea-to-use-guids-as-primary-keys-in-ms-sql
---
Should a Composite Primary Key be clustered in SQL Server?
http://stackoverflow.com/questions/389348/should-a-composite-primary-key-be-clustered-in-sql-server

What happens when I drop a clustered primary key in SQL 2005
http://stackoverflow.com/questions/705504/what-happens-when-i-drop-a-clustered-primary-key-in-sql-2005

Where to place a primary key
http://stackoverflow.com/questions/575450/where-to-place-a-primary-key 

What is the optimal indexing strategy for a relation table?
http://stackoverflow.com/questions/851908/what-is-the-optimal-indexing-strategy-for-a-relation-table
----

-----Original Message-----
From: accessd-bounces at databaseadvisors.com 
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Friday, June 26, 2009 12:40 PM
To: accessd at databaseadvisors.com 
Subject: Re: [AccessD] Clustered Index Debate - Was: RE: Primary keys and
entry points

Hi Shamil and John

Reading these chatty fellows is a bit difficult as they refer to quite a lot
before and outside the article and many versions of SQL Server including
6.5, pack it in "if"s and different scenarios, and - as I'm not a DBA -
expressions unknown by me, but to me it looks more like a discussion on
indexes rather than keys with zero conclusion.

However - if that is true - if only one clustered index can be applied to a
table then, of course, it has to be designed with great care considering the
expected purpose and usage of the table and, of course, that may very well
include a natural (compound) key. But I can't see why that should indicate a
farewell to the surrogate key.

How the ADO.NET Entity Framework handles this is really out of our hands but
I guess the old rule still applies: the groundwork sets the quality of the
house.

/gustav

<<< snip >>>