jwcolby
jwcolby at colbyconsulting.com
Tue Nov 3 21:12:23 CST 2009
I have processes that I run which use sets of stored procedures. For example two of the processes export / import records from large tables for address validation. The addresses need to be sorted in Zip5/Zip4/Address order in order to make the external validation process as fast as possible so I create a table that pulls millions of records, often tens of millions of records, into a table (created on-the-fly) in sorted order, then pulls out 2 million records at a time and BCPs them out for processing in an external program. The inverse occurs when the records finish processing, 2 million record chunks are imported back into a temp table, then each 2 million record chunk is appended to a main table. Once back in SQL Server I have processes that build clustered and non-clustered indexes, run sha-1 hash functions to generate hash keys, update address valid codes etc. I will run this entire project at least once a month (to keep the addresses valid as people move), and I do so for many tables - a table of about 65 million addresses, another of 21 million, another of 11 million etc. I store all of my stored procedures in a dedicated database that pretty much just contains the SPs and a few UDFs as well as a couple of logging type tables. At the moment I use an Access database to run the individual stored procedures in specific order. I am just embarking on a project to move that functioning but not very efficient Access database to C#. I want to be able to have better monitoring of SP completion, records processed (counts), time to complete each SP etc. I found a pretty nice set of C# classes that allow creation of "stored procedure" objects, with parameter objects, then stored into a collection as the SPs (class instances) are assembled and finally the whole collection of SPs executed. I created a test database of a million names / addresses and today I started testing my existing SPs on this small data set. In the next few days I expect to build out a preliminary "port" of the existing Access VBA code. I am hoping to eventually make the entire process "table driven" where I can store (in a table) the name of the sp, the parameters in a child table etc. and then have a supervisor pull out the records and execute the SPs based on what is in the table. This would allow me to create the SPs and then use a C# application (data entry forms) to organize the SPs, add / delete SPs from the process and so forth. That is down the road of course, for now I will have hard coded sequences - as I already do in supervisor SPs as well as in hard coded Access VBA modules. As a programmer, doing this supervisor level in C# feels more natural to me (than doing it inside of SQL Server), and gives me the flexibility to do what I want in code external to the SQL Server database being manipulated. I want to be able to do things external to the database processes, things like monitor the directories that the BCP processes export to / from, move files popping out of a bcp process off to a VM on a different server, monitor those VMs for files coming out of their processes, moving the files back over to the SQL Server directories, and trigger BCP processes that import the data back in to SQL server. I need to be able to FTP order and data files to clients, and get (FTP) files from process bureaus. So it FEELS to me like using a language such as C# to build a system outside of SQL Server is the way to go. BTW I stumbled across something called SMO (SQL Server Management Objects) which makes it really easy to get at the SQL Server object model from C#. I can apparently see the entire tree structure that is a server/database/tables/etc. and do I know not what (yet) with those objects. Looks pretty cool though. The database is a large part of what I do but it is not the whole picture and I have been handcuffed by my lack of ability to effectively do these external parts. But I am a programmer at heart. I want to know how you do this kind of stuff. I know that running such "systems" of stored procedures tied to external processes must be quite common and I am wondering what you guys do for supervising your processes. If anyone doing anything remotely similar would take the time to explain how you do it, I would be most appreciative. Thanks, -- John W. Colby www.ColbyConsulting.com