JWColby
jwcolby at colbyconsulting.com
Fri Dec 22 09:25:42 CST 2006
Arthur, This would tend to suggest that in the system you and I were discussing, placing all of the data values in one big table would be inefficient until you got the number of "From Table" data up above 20. You would have to do a full table scan on hundreds of millions of values. Using the alternate method of breaking HSID down into individual tables representing the data "groups" and only filling the tables with records where that data was present would mean you would have many different source tables, but each would have smallish numbers of records, anywhere from hundreds of thousands to low millions (5 million). Remember that many fields have essentially "binary" values such as 'True' / 'False'. Even these small tables a table scan will be required on fields like these (if they are even used in a filter) but at least you are scanning a table with a small number of records. John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of Stuart McLachlan Sent: Friday, December 15, 2006 12:47 AM To: dba-sqlserver at databaseadvisors.com Subject: Re: [dba-SQLServer] Indexing on a small number of values in a column On 14 Dec 2006 at 21:19, artful at rogers.com wrote: > I have heard or read that creating an index on a column containing a > small number of values is pointless. > > a) Given a column Sex with a constraint "M/F", is there any point > indexing this column? b) Given the same column defined as a bit, you > can't index it (afaik). c) Assuming that it is pointless to index a > column containng < n distinct values, what is the value of n? Assume > 100M rows in the table of interest? See: http://msdn2.microsoft.com/en-us/library/aa224773(sql.80).aspx "My tests show that a table scan often starts to perform better than a nonclustered index access when at least 10 percent of the rows are selected. I also found that the optimizer switches from nonclustered index access to table scan prematurely (i.e., when nonclustered index access still shows better response time than the corresponding table scan). In many cases, the optimizer forces a table scan for queries with result sets of approximately 5 percent, although the table scan becomes more efficient than index access at selectivities of 8 to 10 percent." also http://www.sql-server-performance.com/am_indexing_low_sel_cols.asp for some real life tests and ideas on how to measure the effectiveness of an index. -- Stuart _______________________________________________ dba-SQLServer mailing list dba-SQLServer at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/dba-sqlserver http://www.databaseadvisors.com