Francisco Tapia
fhtapia at gmail.com
Thu May 20 10:17:06 CDT 2010
I can see what you're getting at, and I'll tell you, there are various sources on the 'net that will all state that dynamic SQL is close to being evil... not quite, but close enough ;). I think in general the problem is sanitizing your database inputs ( http://imgs.xkcd.com/comics/exploits_of_a_mom.png ). I'll explain a bit further: because we are programmers we like to break things down into re-usable code, and there is nothing re-usable about hardcoding a new view for each table. BUT there are other things to consider when you are programming in SQL Server that lend themselves more towards performance than re-usable code. Obviously I'm not stating that you shouldn't aim for re-usable code, but simply that there are situations where you need to consider cached performance, creating separate views and stored procedures that return data faster and give your users the illusion of uber-fast system performance.

It is certainly possible to re-use a dynamic SQL query plan instead of having to generate a new one from scratch every time. To do this you'd change your EXEC (@sql) statement to the syntax of sp_executesql, which allows you to re-use your dynamic SQL with parameters; using EXEC will force the engine to generate a new query plan every time, causing a performance hit.

A major caveat that you need to inform your users of: "GENERALLY" you'd program access to your database via views, stored procedures, and functions, limiting any direct access to tables. This may be something that depends on the project itself, as it is in no way a hard-and-fast rule, but it is one that I like to follow as closely as I can. In many instances it's not necessary for your users to have direct access to a table...
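To make the sp_executesql point concrete, here is a minimal sketch; the table, column, and parameter names (dbo.Customers, @State) are hypothetical and only for illustration:

```sql
-- Plan-per-string: splicing the value into the text means the engine sees a
-- different statement (and compiles a new plan) for every distinct value.
-- It is also the classic injection hole from the xkcd strip above.
DECLARE @State char(2) = 'CA';
DECLARE @sql nvarchar(max) =
    N'SELECT Name, City FROM dbo.Customers WHERE State = ''' + @State + N'''';
EXEC (@sql);

-- Parameterized: the statement text never changes, so SQL Server can re-use
-- one cached plan, and the value is passed separately, never concatenated.
DECLARE @sql2 nvarchar(max) =
    N'SELECT Name, City FROM dbo.Customers WHERE State = @State';
EXEC sp_executesql @sql2, N'@State char(2)', @State = @State;
```

And the "views / stored procedures / functions, not tables" pattern could be sketched like this (object and role names again hypothetical):

```sql
REVOKE SELECT  ON dbo.Customers       FROM AppUsers;  -- no direct table reads
GRANT  SELECT  ON dbo.vwCustomerList  TO   AppUsers;  -- read through a view
GRANT  EXECUTE ON dbo.usp_GetCustomer TO   AppUsers;  -- or through a procedure
```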
That is not to say that giving SELECT access to a table because of dynamic SQL is not OK; it most certainly is. But depending on the interface you could be exposing more than you bargained for in these situations. For instance, you've granted select rights on a table but did not set up the exclusions in your security rights; then your users could accidentally look at data they are not privy to.

I had to do something similar not too long ago when we built encrypted storage for credit card #'s. Because of the sensitivity, I designed the security to not allow display access to the credit card field for any of the users who were perusing the database table. Instead, access was limited via stored procedures and views that prevented users from gaining any direct access. I also kept my assistant DBA out of the database by simply giving him access to back up the databases on that specific server, preventing an accidental leak of data. I guess I went a bit overboard on the security for that system, but it contains a lot of sensitive information, and I would not trust dynamic SQL to provide all the security that I wanted, especially if for some reason we were hacked internally.

I do have more points and I'll add them on as I get more time to discuss...

-Francisco
http://sqlthis.blogspot.com | Tsql and More...

On Wed, May 19, 2010 at 8:25 PM, jwcolby <jwcolby at colbyconsulting.com> wrote:
> Francisco,
>
> Thanks for the response.
>
> I keep my large "source data" databases segregated for a couple of reasons.
>
> 1) The tables are largish, tens of millions of records. I have tables with as few as 11 million
> records clear up to 70 million records. The database files themselves run from 5 gigs up to
> hundreds of gigabytes.
> 2) The hardware I can afford makes dealing with the separated data doable, whereas a merged
> database scenario would quickly get out of hand.
> 3) My client (the owner of the data) tends to use just one or two of the databases for any given
> order. However, it is often the case that I will select data where I join the tables in two or
> occasionally more databases (source tables) together.
>
> IOW: women with children ages 0-3 years or 4-7 years AND has a dog. The Children database (20
> million records) comes from one source, the cat_dog table (11 million records) from another
> source. So I have about a dozen of these "source tables" and the client can ask me to "join"
> any given set of them.
>
> I actually do the join on a hash value in each table, where I have hashed the name and address
> fields and stored the hashes. Thus the hash values are identical between the two tables where
> the name / address info is identical. The hash values are just a shortcut; I could accomplish
> the same thing using multi-field joins, but that is a PITA.
>
> Anyway...
>
> I drive all of this from C#. I am developing a largish application in C# which executes the
> stored procedures for any given process. I have an address validation export process, address
> validation input process, order fixed-width file export process, and some others. Because the
> address validation export / import process is rigorously standardized (by me), I can validate
> ALL of my "source data" tables using the exact same code. Additionally I can address validate
> the order data tables with that same process. Basically I just use the same set of out / in
> stored procedures for each and every database I touch. All driven by my custom C# program.
>
> The stored procedures allow me to define the database / table (or view) name to export /
> import, and each SP then has an error code / error message / result code / number of records
> affected (where applicable).
> Thus my C# app can monitor for errors on the SQL Server side, watch for the number of records
> affected, and so forth, and perform error handling / logging up at the application level.
> Likewise I can log results, dates / times to perform the stored procedure, and so forth.
>
> It is slooooowwwly becoming "point and click".
>
> The question about dynamic SQL really comes from the fact that I am about to embark on a
> rewrite of "my" SQL book:
>
> http://www.amazon.com/Beginning-SQL-Programmer-Paul-Wilton/dp/0764577328/ref=sr_1_2?ie=UTF8&s=books&qid=1274325300&sr=1-2
>
> I use dynamic SQL a lot, and want to do a chapter on it, but at the same time I want to capture
> the reality of best practice, which apparently is "use it judiciously and with safeguards". I
> certainly do NOT use it judiciously; I use it everywhere and I have NO safeguards. My specific
> case allows me to get away with that, but I want to do the right thing in the book.
>
> Basically I don't know what the right thing is, how to do the safeguards. So this thread is
> about collecting opinions and methods for doing stuff that is complex enough that dynamic SQL
> might make sense, but doing it the right way.
>
> John W. Colby
> www.ColbyConsulting.com
>
> Francisco Tapia wrote:
> > Excellent description John, when I get a bit of time I'll write up
> > some pseudo code on how to do what you do but a bit more standardized.
> > As it is, having all your sprocs (stored procedures) in a central
> > master database certainly helps with maintenance.
> >
> > I saw Shamil's reply and think that partitioned views could be a
> > useful tool, but really you are already segregating each dataset in
> > each database.
> >
> > Your situation, as I mentioned in the past, might be a good passable
> > reason why dynamic sql works. It's not always the case, as most systems
> > have user interfaces and any one of those guys could wreck your data.
> >
> > Some user defined functions to ensure parameters are scrubbed might
> > prove useful to you to help strip harmful dynamic syntax. You
> > mentioned that you have standardized fields and tables and you quite
> > possibly have all of this worked out already. As I said, I'll comment
> > with more details when I get more time.
>
> _______________________________________________
> dba-SQLServer mailing list
> dba-SQLServer at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
> http://www.databaseadvisors.com