[dba-SQLServer] SPAM-LOW: Re: Do I need a cover index?

Thu Dec 3 19:13:00 CST 2009

Some points to consider:
1. A clustered index on the PK will be useful for the join operation.
2. A non clustered index on the filtering field will be useful for the
search operation.
3. In your scenario the best choice IMO is a clustered index on the PK and a
nonclustered index on the filtering field. As mentioned in my previous
posting, the nonclustered index will automatically include the PK values at
the bottom of its index tree (the "leaf level"). Perhaps your searches don't
specify the PK value (I guess not, since you use surrogate keys, which of
course are not candidates for searches). But it is nice to know anyhow: it
certainly should prevent you from creating a composite index on the PK plus
your filtering field. That kind of index would be an enormous waste of
space and performance.
4. If you don't have a clustered index, then your table is a "heap". This
means that the records can go anywhere in the physical file. A heap can be
very efficient for inserts, because SQL Server then doesn't have to bother
about where to put the records; it just drops them wherever room is
available. But it can be very inefficient for searches and joins, because
then SQL Server has to consult special internal allocation pages (the IAM,
Index Allocation Map) to find where it happened to drop your damned records.
That's why a heap is normally bad. But you could imagine situations where a
heap would work very well, for example an auditing table populated by a
trigger.
5. If you want your inserts to create huge amounts of disk fragmentation,
then just go ahead with a small db and rely on auto grow. If you don't want
that, then follow Mark's advice and size your db initially to the
anticipated size.
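
Point 3 could be sketched in T-SQL roughly as follows. This is only an
illustration: the table name dbo.tblData, the constraint and index names, and
the column types are assumptions, not taken from John's actual schema; only
PKID and Field47 come from the thread.

```sql
-- Hypothetical table; names and data types are assumptions for illustration.
CREATE TABLE dbo.tblData
(
    PKID    INT IDENTITY(1,1) NOT NULL,
    Field47 VARCHAR(50) NULL,
    -- Clustered index on the surrogate PK: the rows are stored in PKID order.
    CONSTRAINT PK_tblData PRIMARY KEY CLUSTERED (PKID)
);

-- Nonclustered index on the filtering column only. Because the table has a
-- clustered index, the leaf level of this index also carries the clustering
-- key (PKID), so there is no need for a composite index on (PKID, Field47).
CREATE NONCLUSTERED INDEX IX_tblData_Field47
    ON dbo.tblData (Field47);

-- A query that filters on Field47 and returns only PKID can be answered
-- from IX_tblData_Field47 alone.
SELECT PKID
FROM dbo.tblData
WHERE Field47 = 'Ice Cream and Jelly';
```

If the query also has to return other columns, SQL Server must look up the
remaining columns through the clustered index, or those columns can be added
to the nonclustered index as included columns.
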
Asger
-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: 4 December 2009 00:24
To: Discussion concerning MS SQL Server
Subject: Re: [dba-SQLServer] SPAM-LOW: Re: Do I need a cover index?

 >However, if you are joining on the PKID, but are filtering "Where Field47
 > = 'Ice Cream and Jelly'", then you should have an index on Field 47 also.

As it happens I do perform a where on the data field so a cover is required.

 > Anyway, in your case, it seems all irrelevant, as you do a mass import
 > and then immediately create your indexes.

Correct, after which no records are ever deleted or added.  I do (or may)
modify fields, but not the PK of course.

 > One last question, having just re-read your email, I see that you are
 > talking about / hoping that a clustered index may keep the *fields*
 > together.

Well, I am reading that by creating a clustered index, every single data
element (field) in that row is stored together, AND the rows are physically
sorted on the index key - the PKID in this case.

OTOH if you do not create a clustered index, then the data elements just go
"in the heap", with pointers to the data in the heap maintained in the
index.  Again though, what does that mean?  I know what a heap means in
memory for a program, but it is a little difficult for me to equate that to
a db file.  I have no concept of what the structure of a db file looks like,
where this "heap" might be etc.  But it is always spoken of negatively so it
must be bad.
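
One way to see which tables in a database are heaps is to query the
sys.indexes catalog view, where a heap shows up as a row with index_id = 0
and a clustered index as a row with index_id = 1. A minimal sketch (the
column aliases are my own choice):

```sql
-- One row per user table: index_id 0 means the table is a heap,
-- index_id 1 means it has a clustered index.
SELECT o.name      AS table_name,
       i.index_id,
       i.type_desc AS storage_type   -- 'HEAP' or 'CLUSTERED'
FROM sys.indexes AS i
JOIN sys.objects AS o
    ON o.object_id = i.object_id
WHERE o.type = 'U'                   -- user tables only
  AND i.index_id IN (0, 1)
ORDER BY o.name;
```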

 > BTW, one last question, when you create new databases, do you create the
 > db as 1 mb and allow it to grow...

I do.  Yes, it would be faster to create it initially as something bigger,
but it is difficult to know how big is big enough, and in the end this is
not enough of a problem to worry about.  In the end I do the month to month
processing in the same file, over and over again.  I am actually considering
(eventually) creating a temp database to use to get the data exported /
imported etc.  The nice thing about SQL Server is that you can simply
specify the db name that you want to create a table in, append data in etc.
So I could do a temp database, temp tables for the export and then just
delete the db when the export is finished.  Likewise for the import.  Temp
db, temp table(s), then append into the "real" table, or update records in
the "real" table.
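
A rough sketch of that scratch-database idea, combined with the pre-sizing
advice from earlier in the thread. Every name here (ScratchImport, RealDb,
tblStage, tblData), the file paths, and the sizes are hypothetical
placeholders, not values from the thread:

```sql
-- Create the scratch database with an explicit initial size, so the load
-- does not depend on many small auto-grow steps. Paths and sizes are
-- assumptions; adjust them to the server. GO is the batch separator used
-- by SSMS and sqlcmd, not a T-SQL statement.
CREATE DATABASE ScratchImport
ON PRIMARY
(
    NAME = ScratchImport_data,
    FILENAME = 'D:\SQLData\ScratchImport.mdf',
    SIZE = 10GB,
    FILEGROWTH = 1GB
)
LOG ON
(
    NAME = ScratchImport_log,
    FILENAME = 'D:\SQLData\ScratchImport.ldf',
    SIZE = 1GB,
    FILEGROWTH = 512MB
);
GO

-- A three-part name (database.schema.table) creates the staging table
-- in the scratch database regardless of the current database.
CREATE TABLE ScratchImport.dbo.tblStage
(
    Field47 VARCHAR(50) NULL
);
GO

-- ... load the staging table here (BULK INSERT, SSIS, etc.) ...

-- Append from the staging table into the "real" table in another database.
INSERT INTO RealDb.dbo.tblData (Field47)
SELECT Field47
FROM ScratchImport.dbo.tblStage;
GO

-- Throw the scratch database away when the import is finished.
DROP DATABASE ScratchImport;
GO
```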

But that is down the road.

 >If so, do you keep a handy, ready to go, empty 47 GB db lying around?

Well, empty or not, 47 gigs is 47 gigs and copying that is slooooowwwww.
You would lose some or all of the hoped for efficiency in the copy.

John W. Colby
www.ColbyConsulting.com

Mark Breen wrote:
> Hello John,
> 
> I would have thought the most important thing to consider is what
> columns you will join on and what columns you will filter on.
> 
> So, if you are only retrieving based on the PKID then I see no need to
> have any additional index.  However, if you are joining on the PKID, but
> are filtering "Where Field47 = 'Ice Cream and Jelly'", then you should
> have an index on Field 47 also.
> 
> Regarding Clustered vs non Clustered, I believed those to be highly
> relevant in a highly trafficked database where records are coming and
> going all the time.  In those cases, the indexes can become fragmented in
> a similar way that hard disks get fragmented.  More importantly, in a high
> volume, data entry system, I understand that it can dramatically increase
> performance if you do NOT cluster on the PK.  (The following may be ten
> years out of date.)  The reason to cluster on non-PK fields is that
> otherwise the records for Invoices 99, 100, 101, 102 would all be written
> to the same page within the db, and if that page is locked for invoice
> 103, then another operator cannot raise invoice 104 until 103 is
> completed.  This was the logic I was taught in 1997 for clustering on
> another column such as CustomerId instead of InvoiceId.  I really do not
> know if that is still relevant nowadays.
> 
> Anyway, in your case, it seems all irrelevant, as you do a mass import and
> then immediately create your indexes.  IOW, your indexes are in perfect
> condition.  You probably only use them two or three times before you
> abandon that db for the next one.
> 
> One last question, having just re-read your email, I see that you are
> talking about / hoping that a clustered index may keep the *fields*
> together.  I would have assumed 99% confidently that the fields must
> always be kept together (as you say, whatever that might mean), but the
> clustering of indexes only relates to keeping *records* together, not
> columns together.  So, if that is the case, you do not require a
> clustered index to keep columns together, i.e., they must always travel
> together.  I have no idea how to measure that.
> 
> Am I totally off beam here, is the problem that I do not know what a cover
> index is?
> 
> BTW, one last question, when you create new databases, do you create the
> db as 1 mb and allow it to grow, or do you initially create it as 47 gb,
> and then just populate it with whatever arrives each month?  Is it faster
> to do your imports to a db that is already expanded?  If so, do you keep
> a handy, ready to go, empty 47 GB db lying around?
> 
> thanks
> 
> Mark
