jwcolby
jwcolby at colbyconsulting.com
Wed May 19 22:25:26 CDT 2010
Francisco,

Thanks for the response. I keep my large "source data" databases segregated for a few reasons:

1) The tables are largish - tens of millions of records. I have tables with as few as 11 million records clear up to 70 million records. The database files themselves run from 5 gigabytes up to hundreds of gigabytes.

2) The hardware I can afford makes dealing with the separated data doable, whereas a merged database scenario would quickly get out of hand.

3) My client (the owner of the data) tends to use just one or two of the databases for any given order.

However, it is often the case that I will select data by joining the tables of two (or occasionally more) source databases together. IOW, women with children ages 0-3 years or 4-7 years AND who have a dog. The Children database (20 million records) comes from one source, the cat_dog table (11 million records) from another source. So I have about a dozen of these "source tables," and the client can ask me to "join" any given set of them.

I actually do the join on a hash value in each table, where I have hashed the name and address fields and stored the hashes. Thus the hash values are identical between two tables wherever the name / address info is identical. The hash values are just a shortcut; I could accomplish the same thing using multi-field joins, but that is a PITA.

Anyway... I drive all of this from C#. I am developing a largish application in C# which executes the stored procedures for any given process. I have an address validation export process, an address validation import process, an order fixed-width file export process, and some others. Because the address validation export / import process is rigorously standardized (by me), I can validate ALL of my "source data" tables using the exact same code. Additionally, I can address validate the order data tables with that same process. Basically I just use the same set of export / import stored procedures for each and every database I touch, all driven by my custom C# program.

The stored procedures allow me to define the database / table (or view) name to export / import, and each SP then returns an error code / error message / result code / number of records affected (where applicable). Thus my C# app can monitor for errors on the SQL Server side, watch the number of records affected, and perform error handling / logging up at the application level. Likewise I can log results, the dates / times taken to perform each stored procedure, and so forth. It is slooooowwwly becoming "point and click".

The question about dynamic SQL really comes from the fact that I am about to embark on a rewrite of "my" SQL book:

http://www.amazon.com/Beginning-SQL-Programmer-Paul-Wilton/dp/0764577328/ref=sr_1_2?ie=UTF8&s=books&qid=1274325300&sr=1-2

I use dynamic SQL a lot and want to do a chapter on it, but at the same time I want to capture the reality of best practice, which apparently is "use it judiciously and with safeguards". I certainly do NOT use it judiciously; I use it everywhere, and I have NO safeguards. My specific case allows me to get away with that, but I want to do the right thing in the book. Basically I don't know what the right thing is or how to do the safeguards. So this thread is about collecting opinions and methods for doing stuff that is complex enough that dynamic SQL might make sense, but doing it the right way (see the sketch below for the kind of thing I mean).
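To make the question concrete, here is a minimal sketch of the pattern that seems to be the accepted safeguard: object names whitelisted and wrapped in QUOTENAME, data values passed as real parameters through sp_executesql. Every name in it (usp_ExportRecords, AddressHash, and so on) is invented for illustration, and it assumes SQL Server 2005 or later for TRY / CATCH:

CREATE PROCEDURE dbo.usp_ExportRecords
    @DBName      sysname,              -- database holding the source table
    @TableName   sysname,              -- table or view to export from
    @MinHash     char(40),             -- an ordinary data parameter
    @RecordCount int OUTPUT,           -- records affected, for the C# app
    @ErrMessage  nvarchar(2048) OUTPUT -- error message, for the C# app
AS
BEGIN
    SET NOCOUNT ON;
    SET @RecordCount = 0;
    SET @ErrMessage  = NULL;

    -- Safeguard 1: whitelist check. Refuse a database name that does
    -- not actually exist on this server.
    IF DB_ID(@DBName) IS NULL
    BEGIN
        SET @ErrMessage = N'Unknown database: ' + @DBName;
        RETURN 1;
    END

    -- Safeguard 2: identifiers cannot be passed as parameters, so they
    -- are spliced in only after QUOTENAME, which brackets the name and
    -- escapes any embedded ] characters.
    DECLARE @sql nvarchar(max);
    SET @sql = N'SELECT * FROM ' + QUOTENAME(@DBName) + N'.dbo.'
             + QUOTENAME(@TableName)
             + N' WHERE AddressHash >= @pMinHash;'
             + N' SET @pRows = @@ROWCOUNT;';

    -- Safeguard 3: data values travel as true parameters, never by
    -- string concatenation, so they cannot inject syntax.
    BEGIN TRY
        EXEC sp_executesql @sql,
             N'@pMinHash char(40), @pRows int OUTPUT',
             @pMinHash = @MinHash,
             @pRows    = @RecordCount OUTPUT;
    END TRY
    BEGIN CATCH
        SET @ErrMessage = ERROR_MESSAGE();
        RETURN 1;
    END CATCH

    RETURN 0;
END

The output parameters map straight onto the error code / record count monitoring the C# side already does.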
John W. Colby
www.ColbyConsulting.com

Francisco Tapia wrote:
> Excellent description John. When I get a bit of time I'll write up
> some pseudocode on how to do what you do, but a bit more standardized.
> As it is, having all your sprocs (stored procedures) in a central
> master database certainly helps with maintenance.
>
> I saw Shamil's reply and think that partitioned views could be a
> useful tool, but really you are already segregating each dataset in
> each database.
>
> Your situation, as I mentioned in the past, might be a passable
> reason why dynamic SQL works. It's not always the case, as most
> systems have user interfaces and any one of those guys could wreck
> your data.
>
> Some user-defined functions to ensure parameters are scrubbed might
> prove useful to you to help strip harmful dynamic syntax. You
> mentioned that you have standardized fields and tables, so you quite
> possibly have all of this worked out already. As I said, I'll comment
> with more details when I get more time.
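Along the lines Francisco suggests, a minimal sketch of such a scrubbing function might be a user-defined function that rejects anything but a plain identifier before it gets near dynamic SQL. The function name and the rules here are invented for illustration, and a real one would be tightened to taste:

CREATE FUNCTION dbo.fnIsCleanIdentifier (@Name sysname)
RETURNS bit
AS
BEGIN
    -- Accept only plain identifiers: letters, digits, and underscores,
    -- 1 to 128 characters. Anything else (quotes, brackets, semicolons,
    -- comment markers) is rejected outright.
    DECLARE @clean bit;
    SET @clean = 0;
    IF @Name IS NOT NULL
       AND LEN(@Name) BETWEEN 1 AND 128
       AND @Name NOT LIKE '%[^A-Za-z0-9_]%'
        SET @clean = 1;
    RETURN @clean;
END

An export procedure would then return an error code whenever dbo.fnIsCleanIdentifier(@TableName) = 0, in addition to wrapping the name in QUOTENAME as in the earlier sketch - belt and suspenders.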