[dba-SQLServer] How does a clustered index really work

Sat Oct 31 04:48:19 CDT 2009

Hi John:

This explanation works for me:

http://weblogs.sqlteam.com/mladenp/archive/2007/09/18/Back-To-Basics-What-is
-a-Clustered-and-a-Non-Clustered.aspx

Jim

-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Friday, October 30, 2009 12:41 PM
To: Dba-Sqlserver
Subject: [dba-SQLServer] How does a clustered index really work

I need more information about how a clustered index really works.  I work
with some fairly large 
tables, in this specific case 67 million records.  The table has a PKID
(Identity) as well as a PK 
from another table (foreign key).  It then has name / address fields, plus
about 8 non name / 
address fields.

I have come to the conclusion that I need to update this table about once a
month or so (address 
validation stuff).  The updates will affect a smallish subset of the total
records, perhaps a 
million or so records, maybe less (about 2% of the total records, once a
month).

I have read that using a clustered index speeds up any operations that pull
data ONLY from fields in 
the clustered index because of the way the data is stored on disk.  My
understanding is that those 
fields are stored contiguously, whereas other indexed fields are stored in
leaf nodes in a b-tree. 
I  understand a b-tree, though the knowledge is old.

Because of some of the processes I run I am wondering whether I should take
the hit of placing the 
PK/FK fields plus the name / address fields in a clustered index.  I use
just those fields a LOT. 
There are some additional fields that I use sporadically.  Other than the
PK/FK  and NAME fields, 
all the other fields may be updated, though as I mentioned, rarely more than
about 2% of the total 
records.

My understanding is that IF the changes in the data in the clustered index
cause the ORDER of the 
data to change, then the updates to those fields would be slow, but if the
order never changes then 
clustered index field updates aren't particularly slow.

Am I right so far?

Because the first field in the clustered index would be the Identity field,
the ORDER of the records 
will NEVER change.  Thus my read is that I can safely store any of the other
fields I want without 
worrying about causing that particular performance hit on updates.

Question, because of the storage method on disk, are the updates to
clustered index fields FASTER 
than they would be otherwise, IOW getting at the data is faster, the order
doesn't change so the 
update proceeds faster?

I am a SQL Server wannabe.  I use it a LOT but at the same time I never see
myself becoming an 
expert, never mind a guru.  Is there something I can read about this subject
that addresses my 
level, that I could fairly easily get my questions about this answered?

TIA for your patient responses.  ;)

-- 
John W. Colby
www.ColbyConsulting.com
_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com