[AccessD] SQL Saturday

jwcolby jwcolby at colbyconsulting.com
Sat Mar 6 21:10:20 CST 2010


I went to SQL Saturday 33 down in Charlotte today.  As a non-DBA I found most of it over my head, but 
two sessions were very useful to me.  One was on "the state of (solid state) disks", which covered 
where SSDs are today and what they can and cannot do for the database.  The consensus is pretty much 
that they are not yet ready for prime time.  That said, the enterprise versions (read: BIG BUCKS) are 
being used today!

And yet for read-only applications they shine.  Mine is a read-only application, so I think I will 
proceed, carefully, on that front.

The other session that was very interesting to me was on SQL Server 2008 compression.  This is currently 
only in Enterprise and (lucky for me) Developer Edition (which is feature-equivalent to Enterprise).  I 
knew about backup compression and started using it a couple of months ago, but was not previously aware 
of the data compression side.

So when I got home I started testing data compression on my databases.  Interesting, but probably not 
overly useful for me.  The main table compressed about 50% IF I rearranged my CLUSTERED index to get 
all of the rows in zip5/4 order.  The indexes, however, ended up more like 18% or so.  I had previously 
used just my int32 PKID as the key on the CLUSTERED index, but the compression was very slight with 
that.  Reindexing on zip5/4/PKID bumped that up to 50%... for the main table.  Sadly the indexes 
stayed well below 20%.
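
For anyone who wants to try the same experiment, something along these lines should do it.  The table, 
index, and column names below are placeholders for my own, and sp_estimate_data_compression_savings is 
the built-in procedure that predicts the savings before you commit to a rebuild:

    -- Estimate what PAGE compression would save across all indexes
    -- on the table (NULL index_id means all of them).
    EXEC sp_estimate_data_compression_savings
        @schema_name      = 'dbo',
        @object_name      = 'MainTable',   -- placeholder name
        @index_id         = NULL,
        @partition_number = NULL,
        @data_compression = 'PAGE';

    -- Rebuild the CLUSTERED index on zip5/zip4/PKID so that similar
    -- rows land on the same pages, which is what page compression feeds on.
    CREATE UNIQUE CLUSTERED INDEX CIX_MainTable   -- placeholder name
        ON dbo.MainTable (Zip5, Zip4, PKID)
        WITH (DROP_EXISTING = ON, DATA_COMPRESSION = PAGE);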

SQL Server can compress at the row or page level, and in order to get anything truly useful I had to 
go to the page level.  Even then the indexes just didn't compress spectacularly well.  The problem is 
that the pages stay compressed even in memory.  While that sounds good, and would be if you got a great 
compression ratio, it means decompression overhead every single time a value is used.  50% would 
probably be worthwhile.  18%... I am guessing not.
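
The syntax for the two levels is simple enough (the table name below is again a placeholder).  ROW 
compression stores fixed-width types in a variable-width format, while PAGE adds prefix and dictionary 
compression on top of that, per page:

    -- Row-level: cheap, but modest savings.
    ALTER TABLE dbo.MainTable REBUILD WITH (DATA_COMPRESSION = ROW);

    -- Page-level: a superset of ROW; this is what it took to get near 50%.
    ALTER TABLE dbo.MainTable REBUILD WITH (DATA_COMPRESSION = PAGE);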

I am not really knowledgeable enough to do good testing, but my gut tells me that 18% on indexes 
will be about the break-even point, or even below it, in terms of performance gain.  The idea of course 
is that compression allows more data to move on/off the disk in a given unit of time, and likewise 
allows more data to stay in memory at any given time.  However, that comes at the expense of CPU 
overhead to decompress EACH DATA ITEM as it is used.  Every time it is used.  Indexes in particular 
would take a hit exactly because they are used so intensively for joins and WHERE clauses.

I had high hopes, but I now doubt the usefulness in my application.  If anyone out there is using 
this data compression and can chime in from your experience, please do.  One thing that MIGHT save 
the day is that the granularity of the compression is apparently fairly fine, i.e. you can specify 
that any given index is or is not compressed.  It really sounds like a tuning nightmare to me.
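
The per-index syntax would look something like this (the index names are invented for illustration). 
In principle a heavily joined index could stay uncompressed while a wide, rarely scanned one gets PAGE 
compression:

    -- Compress one index...
    ALTER INDEX IX_MainTable_Zip ON dbo.MainTable
        REBUILD WITH (DATA_COMPRESSION = PAGE);

    -- ...and leave another alone.
    ALTER INDEX IX_MainTable_Lookup ON dbo.MainTable
        REBUILD WITH (DATA_COMPRESSION = NONE);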

OTOH I do use the compression for backups, and am loving it.  Very significant file size reduction 
AND speed improvements.
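
For anyone who hasn't tried it, the per-backup syntax is just this (database name and path are 
placeholders); there is also a server-wide 'backup compression default' setting under sp_configure:

    BACKUP DATABASE MyDatabase
        TO DISK = 'D:\Backups\MyDatabase.bak'
        WITH COMPRESSION, STATS = 10;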

Any opinions out there?

-- 
John W. Colby
www.ColbyConsulting.com
