jwcolby
jwcolby at colbyconsulting.com
Thu Mar 27 11:50:58 CDT 2008
David, It is just an encapsulation technique to me. In the case of count orders, the entire database is only a meg or so file size. The business is just complex enough to need to completely encapsulate the entire order and save it for later. I actually have about 5 or 6 different "list" databases from which I pull names and addresses. Most come from just one but occasionally the client references another. By encapsulating the database I get exact copies of any slight modifications I have to make to the views that filter down the result sets. Likewise the client occasionally changes his count requirements, adding new counts etc. By having an exact copy of what I used in a past order I can go back to that database and be working in exactly the environment that I was originally working in. It would perhaps make less sense if these were gigabyte databases but USUALLY they are not. Very occasionally, if they ask for millions of names in the result set, then the db gets up to a fairly hefty size but in the end I still need to be able to see exactly what I did to get my results without any "hmmm did the changes I made three months ago change what I am getting on this order". If I tell the client that he has 2.45 million records, I darned well better be able to show HOW I arrived at those records, and pull exactly those names if he comes back later. As I mentioned, I disconnect the old databases so that I don't see them from inside of SQL Server. As they get really old I RAR them to get them out of the database directory. I can always get back to them though. My client has told me that another fulfillment house he used could never come back later and give him consistent results, could never say how or why they came up with the names that they came up with. I can do that using this technique. John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of David Lewis Sent: Thursday, March 27, 2008 12:15 PM To: dba-sqlserver at databaseadvisors.com Subject: Re: [dba-SQLServer] dba-SQLServer Digest, Vol 61, Issue 44 OK, I think I understand this a little better, but I get the feeling that copying the result set to a new DATABASE is overkill -- why not just to a new table within the current db? Creating databases is a fairly drastic solution, in my opinion. There is a lot of overhead to them, all sorts of views, procedures, roles, etc associated, even when they are not explicitly used. I think it will unnecessarily bog down your server. Tables are much easier to manage. Just my opinion. D. Lewis ------------------------------ Message: 5 Date: Thu, 27 Mar 2008 11:54:19 -0400 From: "jwcolby" <jwcolby at colbyconsulting.com> Subject: Re: [dba-SQLServer] dba-SQLServer Digest, Vol 61, Issue 38 To: "'Discussion concerning MS SQL Server'" <dba-sqlserver at databaseadvisors.com> Message-ID: <000601c89022$d1f2d7a0$0201a8c0 at M90> Content-Type: text/plain; charset="us-ascii" David, There are several reasons for saving the results. First, I take the addresses and run them through Address validation to find anyone who has moved. Once I do that, there are now people in the original list that are no longer in the zips specified. So I have to get rid of anyone not in the zips specified. Second, the lists potentially contain "duplicate" records. Often times the same person answered questions in several different questionnaires and thus got their name in the db multiple times but with completely different data fields filled in. Also, there may be several members of the same household. The client wants ONE piece of mail per address, so I have to DEDUPE the result set. And finally the client asks for lets say 100K records out of 2.4 million. He wants me to MARK those records as shipped so that he can come back next month and ask for another 100K records. His clients do "test runs" to see what the response is. If the response is good, they come back for more, but each time they come back they want new records. In fact, if the time between orders is long enough, I have to go do the Address Validation, remove moved people thing all over again. I could in fact store just the PKs of the records shipped to handle that part, but I have found a date field to be a good flag of shipped and also tells me WHEN those records were shipped. It is for these reasons that I copy a template each time I fill a NEW order. I store the WHERE views that do the data selection, I do the deduping, and I store flags of records shipped, all inside of the new copy of the template. I name the database with the order name and I can always get back to it. I take the order databases offline after a few weeks (to keep the database list less busy), but can always find them again if the client comes back asking for more addresses. I am in fact trying to build a "business" database that stores all of MY business stuff having to so with this. The clients the order goes to, the order specifications, the directories and databases that the stuff is stored in, how long it took to do stuff (for billing) and all that. However that is the "little database" that I mentioned as an aside, not the one that I get the data from to fill the orders. John W. Colby Colby Consulting www.ColbyConsulting.com -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of David Lewis Sent: Thursday, March 27, 2008 11:02 AM To: dba-sqlserver at databaseadvisors.com Subject: Re: [dba-SQLServer] dba-SQLServer Digest, Vol 61, Issue 38 Hi John: A few thoughts unrelated to the copy database issue: I'm not sure I follow everything you do, but it sounds like you receive raw data, scrub it, then use it as a basis of reports (run queries, send the results to the client...). It also sounds like you physically save the recordset results of those queries, and that the underlying base table does not change. If that is the case, why not save only the queries you run, rather than the rows they return? Create the queries as either views, or better yet build views and call them from stored procedures that can accept where clauses, topN parameters, etc.? That approach would mean you don't have to manage tables, or entire databases, and instead only manage views and stored procedures. That, by the way, is a use of sql server that is more in line with how it was designed. One approach might be to have some base views, and a table that might look like: tblOrders Customer RequestDate BaseView WhereClause TopClause Etc. Some sample rows might be: Abc Company 3/1/08 Column1, Column2 FROM dbo.vwSomeBaseView WHERE BLAH = 'Blah' AND BLAHBLAH= ' BlahB%' Top 3000 Abc Company 3/12/08 Column1, Column2, Column3 FROM dbo.vwSomeOtherBaseView WHERE Humbug = 'Blah' AND Yada= ' BlahB%' Top 10000 Using this approach, you only need to build a smaller number of views, you keep a record of who asked for what, when, and you can recreate the actual query very simply. A stored procedure to do that might look like: CREATE PROCEDURE usp_GetResultSet @Customer VARCHAR(50) , at Date DATETIME AS DECLARE @strSQL VARCHAR(500) DECLARE @Top VARCHAR(50) SELECT @strSQL= 'SELECT ' + ISNULL(TopClause,'') + BaseView + WhereClause FROM tblOrders WHERE Customer=@Customer AND RequestDate=@Date PRINT @strSQL --just for debugging... EXEC(@strSQL) Calling this procedure as usp_GetResultSet 'Abc Company','3/12/08' would give you: SELECT Top 10000 Column1, Column2, Column3 FROM dbo.vwSomeOtherBaseView WHERE Humbug = 'Blah' AND Yada= ' BlahB%' Voila, no messy storing of new tables, etc. I dunno, maybe that helps keep things tidier? The contents of this e-mail message and its attachments are covered by the Electronic Communications Privacy Act (18 U.S.C. 2510-2521) and are intended solely for the addressee(s) hereof. If you are not the named recipient, or the employee or agent responsible for delivering the message to the intended recipient, or if this message has been addressed to you in error, you are directed not to read, disclose, reproduce, distribute, disseminate or otherwise use this transmission. If you have received this communication in error, please notify us immediately by return e-mail or by telephone, 530-893-3520, and delete and/or destroy all copies of the message immediately. _______________________________________________ dba-SQLServer mailing list dba-SQLServer at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/dba-sqlserver http://www.databaseadvisors.com