[dba-SQLServer] How to construct queries

Fri Dec 11 18:44:11 CST 2009

John,
It will be useful for you to explore the execution plans for your queries.
You do so by highlighting the query in SSMS and pressing CTRL+L.
The Execution Plan window may be overwhelming but you just have to focus on
the rightmost ikon which is indicating either "Table scan" or "Index Scan"
or "Index seek".
Now for your question:
To understand what's going on you have to distinguish *fetch operation* and
*search operation*. You also have to distinguish a *composite index* and an
*index with included columns*.
A covering index means that SQL Server can *fetch* the records you are
selecting in your query from the leaf level (=bottom level) of your
index-tree, not having to make a lookup to the data pages.
But a covering index doesn't mean that SQL Server can use your index for its
*search* operations. SQL Server can only use the index for a *search*
operation if your where-condition references a field which is the *first*
field defined in the index.
And to make things more complicated: A covering index can be accomplished in
to ways: as a *composite index* or as an *index with included columns*.
A composite index is created this way:
CREATE INDEX MyCompositeIndex ON MyTable (FieldA, FieldB, FieldC, FieldD)
An index with included columns is created this way:
CREATE INDEX MyIncludedIndex ON MyTable (FieldA) INCLUDE (FieldB, FieldC,
FieldD)
A *composite index* records all specified columns in every level of the
index-tree, and it is very useful for multi-field *sorting* operations. But
it is not useful for multi-field *search" operations, because SQL Server
will only consider using this index for a query referencing the *first*
field (here FieldA) in the where-condition. It is however very useful for
*fetch* operations, since SQL Server then can do with an "index scan" on the
leaf level of the index not having to make a lookup the actual data-pages.
This is only true however if you are selecting fields which are present in
the index: if your query selects fields not defined in an index then SQL
Server has to use a *table scan* which is not efficient. And now you may
ask: What about a SELECT COUNT(*) - isn't this referencing all fields and
shouldn't this per definition prevent SQL Server from using the index? Don't
worry: COUNT(*) is a special case where SQL Server will use the index if it
can make the execution more efficient.
An *index with included columns* only records the included columns in the
leaf level of the index. For this reason it is neither useful for
multi-field *sorting* operations nor for *search* operations on the included
columns. But as for *composite indexes* it is very useful for *fetch*
operations.
Bottom line:
A *composite index* is good for multi-columns sorting and covering queries,
but it doesn't help *search* operations and it imposes an overhead forcing
SQL Server to record all fields in every level of the index-tree. An *index
with included column* it good for covering queries too, but it doesn't help
multi-columns sorting and it doesn't help *search* operations either -
however it imposes fare less overhead cause the included fields are only
recorded in the leaf-level of the index.

I suggest you play whith the different options using CTRL+L - pretty soon
you will notice this:

Composite index on FieldA, FieldB, FieldC
Composite index on FieldD, FieldE, FieldF
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" --> Index Seek
SELECT COUNT (*) FROM MyTable WHERE FieldB="Y" --> Index Scan
SELECT COUNT (*) FROM MyTable WHERE FieldD="Y" --> Index Seek
SELECT COUNT (*) FROM MyTable WHERE FieldE="Y" --> Index Scan
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" OR FieldB="Y" OR FieldC="Y"
--> Index Scan
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" OR FieldD="Y" --> Perhaps two
index seeks merged
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" OR FieldE="Y" --> Probably a
table scan

Index on FieldA including FieldB and FieldC
Index on FieldA including FieldD and FieldF
--> exactly same results

Another option is to use a separate index on each field. Then you will
notice:
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" --> Index Seek
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" OR FieldB="Y" --> Perhaps two
index seeks merged
SELECT COUNT (*) FROM MyTable WHERE FieldA="Y" OR FieldB="Y" OR FieldC="Y"
--> Probably a table scan

Mind that the results will depend on the actual values present in your
table. SQL Server maintains statistics of the indexed columns and if there
is a low selectivity for you query (= a high rate of matching values in the
indexed search-field) then chance is that SQL Server don't want to use the
index and will prefer using a table scan instead.

 Asger

-----Oprindelig meddelelse-----
Fra: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] På vegne af jwcolby
Sendt: 11. december 2009 19:29
Til: Dba-Sqlserver
Emne: [dba-SQLServer] How to construct queries

I often have to do things like "supply counts of records where..." and the
list will be something like:

FieldA = "Y" or FieldB = "Y" or FieldC = "Y" ...

Often times the counts must be for sets of fields, for example FieldA
through FieldG or FieldX 
through FieldZ

These fields are usually a small minority of the total 50 million records,
sometimes a million, 
sometimes 5 million etc.

So I have routinely created cover indexes for the groups, one index for
FieldA..FieldG, another 
index for FieldX .. FieldZ.

If a query requests data from fields in both groups, will the query use both
cover indexes to grab 
the data for the records?

Is there a more efficient way to do these kinds of counts?

-- 
John W. Colby
www.ColbyConsulting.com
_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com