[dba-SQLServer] How are indexes built

Arthur Fuller fuller.artful at gmail.com
Thu Nov 29 08:02:26 CST 2007


JC:

First of all, you need to understand what a clustered index is, and what it
is not. Perhaps you already know, but in case not, let me try to explain, in
perhaps simplistic terms.

Creating a clustered index physically re-orders the table according to the
order of said index. That will take a while, given the size of your table.
It will also consume (temporarily) a large amount of disk space, since it
has to re-order the table in tempdb and then write it back to yourdb.

>From what I have gathered about your app, I see not a single advantage to
your creation of a clustered index. A clustered index (again typically)
would be most useful in a compound-index situation, i.e. index on Order
Number + Order Detail Number, so gathering up the Order Details would
require minimal visits to the tables. Your case is much different than this,
and AFAIS you would gain nothing by creating a clustered index, and
conversely, lose all the time it will require to physically re-order said
table.

IMO, your largest problem is the width of the table. Whenever I detect
several hundred columns, I detect a wrong approach. We have previously
discussed this, and I hold my ground. All these columns (or at least many of
them) ought to be rows in a child table, not columns in the principal table.
Then you could creatively use indexes. Imagine a hundred columns whose
contents are Y/N/null. No point in recording the nulls; only the Y/Ns are
significant. Give all the current columns a PK and a meaningful description
in the "attributes" table, and then index said PK. Yes, this will result in
many millions of rows, one for every Y/N attribute in your big fat table.
But it will make queries such as attribute A + attribute B + attribute C -
attribute D much simpler. This would translate in the new scheme to an
EXISTS and a NOT EXISTS clause, and since all this could be handled by the
index, relatively few disk hits would be required (relative to the
full-table scans your current model requires for almost every query).

I know you receive new records occasionally, and porting said new records to
this model will involve creating new rows in said child table and then
populating them with the PK from the user, the PK from the new attribute,
and the immediate value.

I would vastly prefer going down that road to the full-table-scan scenarios
you are currently locked into.

Arthur

On 11/29/07, Martin W Reid <Mwp.Reid at qub.ac.uk> wrote:
>
> John
>
> You may find this useful
>
> http://www.mssqlcity.com/Articles/Adm/index_fragmentation.htm
>
>
> Martin
>
>
> Martin WP Reid
> Information Services
> Queen's University
> Riddel Hall
> 185 Stranmillis Road
> Belfast
> BT9 5EE
> Tel : 02890974465
> Email : mwp.reid at qub.ac.uk
> ________________________________________
> From: dba-sqlserver-bounces at databaseadvisors.com [
> dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby [
> jwcolby at colbyconsulting.com]
> Sent: 29 November 2007 13:17
> To: dba-sqlserver at databaseadvisors.com
> Subject: [dba-SQLServer] How are indexes built
>
> I am in the tail end of rebuilding the "fact table" for the big database
> from hell.  Long ago Arthur managed to get me an "optimum field size" for
> each of the fields.  I stripped off a ton (100?) fields out at the
> beginning
> of the table which were very old address validation stuff, and then build
> a
> new table without those fields, and with the rest of the fields set to
> exactly the right size to fit the data - usually char(1) rather than the
> default varchar(50) which the old table used.  I am now in the "index it"
> phase.  Keeping in mind that this data is never updated, once the data is
> in
> place indexes do not have to deal with the insertion problem.
>
> One issue that I have is that in order to keep the speed up, I did not
> define a PK field or any indexes before I inserted the data.  Now the data
> is so huge that I cannot get it to build (designate) a PK field without a
> timeout, or at least it will not work from the built in table modify
> wizard.
> It starts chugging away and then comes back a couple of minutes later with
> a
> timeout message and fails to complete.  So I do not have a real PK.  I do
> have a field that contains the data (int) that is the number that
> represents
> a record (the PK field), or is just not actually considered a PK by SQL
> Server and I am not sure how to make SQL Server turn it into a PK.  I
> think
> I can do so with an actual SQL statement run as a query but I will have to
> research the syntax.  For some reason, SQL Server will run QUERIES to
> completion, even if they take hours, but it will cause timeouts for any of
> the built in wizards that attempt to modify a table, or for that matter
> will
> time out views if they take too long.  Cut the view SQL into a query and
> it
> will run to completion.  Obviously one of many things I do not understand
> about my tool.
>
> So... I am trying to build up indexes to help me with select WHERE clauses
> for this beast.  I have discussed some of the types of where clauses I
> get,
> IN(a,b,c,d,e) for a single field, Fld1=A and Fld2=C, etc.  In order to
> bring
> sanity to the table and allow me to test pieces of these select queries, I
> am starting to use a strategy of building sub queries to handle the pieces
> of a complex where clause, and then using a JOIN for the AND.  In order to
> do this I need indexes on the fields used in each sub query.  It is
> possible
> to create cover queries where I include the PKID (field) and the field
> used
> in the select.
>
> I really need to build a clustered index on the PKID field (I think) since
> that is what I use to actually reference the entire record, for those
> cases
> where it has to pull data not indexed or where an index is ignored.  I
> will
> need to do that "over the weekend" I think given the physical sort
> involved.
> Will it have to completely rebuild all of my other indexes as well?  I
> suspect so.  I really need a single SP that will drop all of my indexes
> and
> another to rebuild all of my indexes.  Then I could just "drop the names"
> in
> an SP and cause the whole lot to build.
>
> I think I am going to have to join a group specializing in SQL Server in
> order to get these kinds of questions answered.  Kind of a scary prospect.
> I am so "developer oriented" and so ignorant about the workings of the SQL
> Server innards.
>
> John W. Colby
> Colby Consulting
> www.ColbyConsulting.com
>
> _______________________________________________
> dba-SQLServer mailing list
> dba-SQLServer at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
> http://www.databaseadvisors.com
>
> _______________________________________________
> dba-SQLServer mailing list
> dba-SQLServer at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
> http://www.databaseadvisors.com
>
>



More information about the dba-SQLServer mailing list