[AccessD] Raid for reliability on SSDs

Sat Jun 13 21:47:27 CDT 2009

William,

 > ...mind if I ask wtf is a "cover index"?

Well... I am not anywhere near an expert... but apparently in SQL Server if you build an index that 
"covers" all of the fields used in a where clause, joins etc for a given query, then the data is 
pulled directly out of the index itself rather than having to go read each row of the table that the 
index covers.  I am guessing that the same might be true for Access as well though I have never 
researched it.

Let's take an example...

I have a real table that is 50 million rows and 640 fields.  This is the center of the client's 
universe.

I might write a query (that joins to another query on PK) such as...

SELECT PK from tblHSID
WHERE (Age = X) AND (Sex = Y) AND  (income in (C,D,E,F,G)) AND
	((HasDogs = Y) OR (HasCats = Y)

You get the idea...

So, if I have a single index on Age, Sex, Income, HasDogs and HasCats then the data is already 
stored in the index and only that index will be used for the where clause.  If any field is missing 
(and for sure if it is not in another index) then SQL Server will have to "Scan" through all 50 
million records looking for the data in the actual fields of the table to get the data to perform 
the where.

As you might imagine, scanning through 50 million records with 640 columns is "not only slow, but 
DAMNED SLOW".

So I set up "cover indexes" for all of the major fields that are in every order.  For example the 
age, sex, income kind of stuff plus the PKID might be in one "cover index".  Then I set up other 
cover indexes over "sets" of fields.  For example the client often does orders for arthritis, and 
there are about 10 fields that ORed together indicate someone that might have arthritis.  So I 
"cover" all of those fields plus the PKID.

Thus an order like that might use TWO "cover" indexes, one for the age/sex/income and a second for 
the arthritis fields, joining on the PKID.  The intent is to allow SQL Server to find all of the 
data in two indexes instead of scanning 50 million records.

I am addressing this response to the SQL Server group as well so that anyone over there who might 
know more, or be able to explain it better, or even tell me I am out of luck that it doesn't work 
that way can speak up.

I can tell you that before I discovered this my queries were taking a half hour to run.  After 
discovering the cover indexes and using them, my times dropped to a minute or two or three or five 
so it does in fact appear to work that way.

So the point NOW... is to place just those indexes in separate files out on a raid of SSDs.  MAYBE I 
will get results down to a 10/20/30 seconds?  When you are dealing with the table sizes that I am, 
and the number of steps to do the stuff that I am, every minute counts.  I can't tell you the number 
of times I start a query and just twiddle my thumbs for 2 or 3 or 5 minutes waiting for the results. 
  Not enough time to switch gears to something else but a huge waste of time none the less.

I am going to try splitting the indexes out and placing them on a RAID 1 array on a pair of normal 
500g hard drives.  This will likely slow me down a tad, or even a lot, since a RAID 6 with 6 drives 
(where they are now) streams data much faster than a raid 1 will.  However I will have figured out 
how to split the indexes out, and have them in separate files when I finally get the SSDs. 
Furthermore I do have extra 500g drives hanging around I can add to the array if I need to get the 
streaming speed up temporarily until I get the SSDs.

I am willing to bet a fairly large chunk of money that the indexes on a raid SSD will be smokin' 
fast.  In fact I will likely drop them on their own controller to give them their own coprocessor 
and their own path to the CPU.

If this works my speeds will be in the big leagues.  Or at least bigger leagues. ;)

John W. Colby
www.ColbyConsulting.com

William Hindman wrote:
> ...mind if I ask wtf is a "cover index"?
> 
> William