jwcolby
jwcolby at colbyconsulting.com
Thu Jan 5 16:43:23 CST 2012
Guys, My client has a web page hosted on a RackSpace (the name of the hosting provider) server. It runs Windows 2008 Standard, appears to be a physical (as opposed to virtual) server, has 12 gigs and 4 cores. It runs SQL Server 2008 Standard. the C: drive is 300 gigs of which about 109 gigs are used. The client does meail ad campaigns for things like Aleve pain reliever etc. There is a SQL Server database which stores information about what happens on the web page, hits, clicks moves between pages etc. The specific database that stores all this info says that it has a size of 60.33 gigs. Of that the data file is about 5.56 gigs with 67% free and the log file is 54.77 gigs with 16% free. So... I just got involved with this stuff. I didn't build the database nor the web site. I was asked to go find out why a specific statistic (time on site) was "out of whack". My question really is that the statistics are coming from stored procedures, mostly out of one specific table EVNT_Events. That table has no indexes on it. It does have an autoincrement field but has no PK per se and of course is a heap structure. What I know so far is that events are logged on a session id, that session id is apparently a unique identifier for a "session" or a specific browser logged into that web site. An event record is logged every 10 seconds, but perhaps more because there are "event types" such as clicks, backbutton, etc. The following is a count of events for this campaign between 11/25 and 1/5, roughly 2.4 million events over 40 days, which is an average of about an event every 100 seconds. Not exactly a high traffic campaign. Which is not to say the next one won't be but let's be real, click throughs from an email campaign isn't a high volume business. 2372576 2011-11-25 00:01:50.770 2012-01-05 14:10:08.037 The statistics report that runs takes awhile to run. I would bet that some judicious indexing would speed things up. At the very least making a clustered index on the autoincrement EVNT_ID. Of course reporting is not the primary concern here. Still... So assuming I get permission to do anything, what do I do? I don't want to make any changes without first backing up the database. I would highly doubt that anything has ever been backed up though I will be asking these questions. 1) This appears to be a standard SQL Server setup, i.e max memory is insane, all cores assigned to SQL Server, compression is not turned on and so forty. Is there a reason not to compress the backup? 2) The log file contains a lot of "stuff". 3) Performing a backup to the same physical drive is going to take awhile and use a lot of disk space, even if compressed. 4) Once backed up what happens to that log file? Does it automatically "empty"? 5) Would creating a clustered index on that "PK" help make things faster? 6) Many of the statistics group by the session ID. Obviously indexing that and a few other key fields would speed up reporting but slow down creating these log records but would it slow it down enough to make any difference? It seems unlikely to me but I don't have much experience in a situation like this. What I do know is that once created these event records are apparently never modified. I am just trying to get some "feel" for what will work and what to recommend. I am to a large extent the tech support for the client. The web site and the campaigns are created by a third party and that will continue. It just looks to me like they are not spending much time on the database side of things. -- John W. Colby Colby Consulting Reality is what refuses to go away when you do not believe in it