[dba-SQLServer] Re: Do I need a cover index?

Mark Breen marklbreen at gmail.com
Thu Dec 3 18:29:13 CST 2009


Hello John,

Thanks for the responses.

[Well, I am reading that by creating a clustered index, every single data
element (field) in that row is stored together, AND the rows are physically
sorted on the index key - the PKID in this case.]

I would not have assumed this.  Even without a clustered index, I would expect
that SQL Server would store the columns of one row as close to each other as
physically possible, given the limitations of the data sizes.  Somewhere in
the back of my mind I am even wondering whether that is what the 8000 char
limit is for, i.e., that 8k blocks are reserved for a row of data.  Again, that
may be ten years out of date now with SQL 2008.
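
As a rough back-of-envelope check on the 8k idea, here is a small Python
sketch.  It assumes the figures I recall from Microsoft's size-estimation
guidance (8192-byte pages, a 96-byte page header, a 2-byte slot entry per row,
and an 8060-byte cap on the in-row part of a row).  The function name and the
sample row sizes are made up for illustration.  If those figures hold, an 8k
page is shared by as many rows as will fit rather than being reserved for one
row, and the size limit would be there because the in-row part of a row cannot
span pages.

```python
PAGE_SIZE = 8192     # assumed: SQL Server data page size in bytes
PAGE_HEADER = 96     # assumed: fixed page header size in bytes
SLOT_ENTRY = 2       # assumed: row-offset array entry stored per row
MAX_IN_ROW = 8060    # assumed: maximum in-row size of a single row


def rows_per_page(row_size: int) -> int:
    """Estimate how many rows of row_size bytes fit on one data page."""
    if row_size <= 0:
        raise ValueError("row_size must be positive")
    if row_size > MAX_IN_ROW:
        raise ValueError("row does not fit in-row on a single page")
    usable = PAGE_SIZE - PAGE_HEADER          # bytes left for rows
    return usable // (row_size + SLOT_ENTRY)  # each row also costs a slot entry


if __name__ == "__main__":
    for size in (100, 1000, 8060):            # hypothetical row sizes
        print(size, rows_per_page(size))
```

This is only an estimate: it ignores fill factor, variable-length columns and
anything stored off-row.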


Just one final point to comment on.  I also do not know what the heap is,
but I would guess that your heap is neat and tidy as the data is effectively
created sequentially, so, it probably looks the same with or without the
clustered PK.  No idea how to check this though.
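
To illustrate the guess (not to check it against a real database), here is a
toy Python model.  It assumes a heap simply keeps rows in arrival order and a
clustered index keeps them sorted by key; the function names and sample rows
are hypothetical, and this is not how SQL Server actually lays out pages.

```python
import random


def heap_order(rows):
    """Toy heap: rows stay in the order they arrived."""
    return list(rows)


def clustered_order(rows):
    """Toy clustered index: rows kept sorted by key (first tuple element)."""
    return sorted(rows, key=lambda r: r[0])


def same_layout(rows):
    """True when the toy heap and the toy clustered index hold the same order."""
    return heap_order(rows) == clustered_order(rows)


# Rows created with an ever-increasing PKID, as in a sequential mass import.
sequential = [(pkid, f"row {pkid}") for pkid in range(1, 1001)]

# The same rows arriving in a scrambled order (fixed seed for repeatability).
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)

if __name__ == "__main__":
    print("sequential load:", same_layout(sequential))
    print("shuffled load:  ", same_layout(shuffled))
```

In this toy model the two layouts only coincide when the rows arrive already
in key order, which is the situation I am guessing at above.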


Thanks

Mark


2009/12/3 jwcolby <jwcolby at colbyconsulting.com>

>  >However, if you are joining on the PKID, but are filtering "Where Field47
> = 'Ice Cream and Jelly'", then you should have an index on Field47 also.
>
> As it happens I do perform a where on the data field so a cover is
> required.
>
>  > Anyway, in your case, it seems all irrelevant, as you do a mass import
> and then immediately create your indexes.
>
> Correct, after which no records are ever deleted or added.  I do (or may)
> modify fields but not the
> PK of course.
>
>  > One last question, having just re-read your email, I see that you are
> talking about / hoping that a clustered index may keep the *fields* together.
>
> Well, I am reading that by creating a clustered index, every single data
> element (field) in that row
> is stored together, AND the rows are physically sorted on the index key -
> the PKID in this case.
>
> OTOH if you do not create a clustered index, then the data elements just go
> "in the heap" with
> pointers to the data in the heap maintained in the index.  Again though,
> what does that mean?  I
> know what a heap means in memory for a program, but it is a little
> difficult for me to equate that
> to a db file.  I have no concept of what the structure of a db file looks
> like, where this "heap"
> might be etc.  But it is always spoken of negatively so it must be bad.
>
>  > BTW, one last question, when you create new databases, do you create the
> db as 1 mb and allow it
> to grow...
>
> I do.  Yes it would be faster to create it initially as something bigger
> but it is difficult to know
> how big is big enough and in the end this is not enough of a problem to
> worry about.  In the end I
> do the month to month processing in the same file, over and over again.  I
> am actually considering
> (eventually) creating a temp database to use to get the data exported /
> imported etc.  The nice
> thing about SQL Server is that you can simply specify the db name that you
> want to create a table
> in, append data in etc.  So I could do a temp database, temp tables for the
> export and then just
> delete the db when the export is finished.  Likewise for the import.  Temp
> db, temp table(s) then
> append into the "real" table, or update records in the "real" table.
>
> But that is down the road.
>
>  >If so, do you keep a handy, ready to go, empty 47 GB db lying around?
>
> Well, empty or not, 47 gigs is 47 gigs and copying that is slooooowwwww.
>  You would lose some or all
> of the hoped for efficiency in the copy.
>
> John W. Colby
> www.ColbyConsulting.com
>
>
> Mark Breen wrote:
> > Hello John,
> >
> > I would have thought that the most important thing to consider is what
> > columns you will join on and what columns you will filter on.
> >
> > So, if you are only retrieving based on the PKID then I see no need to have
> > any additional index.  However, if you are joining on the PKID, but are
> > filtering "Where Field47 = 'Ice Cream and Jelly'", then you should have an
> > index on Field47 also.
> >
> > Regarding Clustered vs non Clustered, I believed those to matter mostly in a
> > highly trafficked database where records are coming and going all the time.
> > In those cases, the indexes can become fragmented in a similar way that
> > hard disks get fragmented.  More importantly, in a high volume, data entry
> > system, I understand that it can dramatically increase performance if you do
> > NOT cluster on the PK.  (The following may be ten years out of date.)  The
> > reason to cluster on the non-PK fields is that, with a cluster on the PK,
> > multiple records for Invoices 99, 100, 101, 102 would all be written to the
> > same page within the db, and if that page is locked for invoice 103, then
> > another operator cannot raise invoice 104 until 103 is completed.  This was
> > the logic I was taught in 1997 for clustering on another column such as
> > CustomerId instead of InvoiceId.  I really do not know if that is still
> > relevant nowadays.
> >
> > Anyway, in your case, it seems all irrelevant, as you do a mass import and
> > then immediately create your indexes.  IOW, your indexes are in perfect
> > condition.  You probably only use them two or three times before you abandon
> > that db for the next one.
> >
> > One last question, having just re-read your email, I see that you are
> > talking about / hoping that a clustered index may keep the *fields* together.
> > I would have assumed 99% confidently that the fields must always be kept
> > together (as you say, whatever that might mean), but the clustering of
> > indexes only relates to keeping *records* together, not columns together.
> > So, if that is the case, you do not require a clustered index to keep
> > columns together, i.e., they must always travel together.  I have no idea how
> > to measure that.
> >
> > Am I totally off beam here, is the problem that I do not know what a cover
> > index is?
> >
> > BTW, one last question, when you create new databases, do you create the db
> > as 1 mb and allow it to grow, or do you initially create it as 47 gb, and
> > then just populate it with whatever arrives each month?  Is it faster to do
> > your imports to a db that is already expanded up?  If so, do you keep a
> > handy, ready to go, empty 47 GB db lying around?
> >
> > thanks
> >
> > Mark
>


