jwcolby
jwcolby at colbyconsulting.com
Thu May 20 11:59:08 CDT 2010
David, > If I understand your first example correctly, you have database per client All of this stuff is for MY client (one client). My client gives me huge "lists" (as he calls them) of name / address info, tens of millions of names. I extract each "list" into its own database. So I have a "Children's" database, a "Dog owners" database, a "smokers" database, a ... Each list pretty much comes from a different source, they are purchased from list providers on the open market. I "normalize" them to have the same fields, with the same field names etc, plus a PK autonumber. I then address validate them and start to use them. >You extract subsets of records repeatedly from that large table. To do this you build a query (the view), which you use to populate a table (SubTable1). From that table you select a subset (SubTable1_a), and later you may select another subset (SubTable1_b). Each of these subsets you persist as physical tables. No. I build a view which may be a join of one or many of those "lists" as my client calls them. Once I have selected some records, I persist them, ALONG WITH the fields that are the selection criteria. THOSE SELECTION CRITERIA FIELDS change from order to order. Even the parent "list tables" change from order to order. > In this example, why build the physical table (SubTable1)) when the virtual table (the view) will do the same thing? Several reasons. 1) The real table is a subset of these list tables and is exactly the set of records which match some criteria FOR THAT ORDER. The original view can take minutes or even hours (OK a slight exaggeration but you get the point) to process and get the result set for. 2) The "View" is a join of several different tables, even of several different "lists" (which is why it can take a loooooonnnnng time to select the records that go in the real order table). 3) I have to mark the REAL order table with additional selection criteria. For example the client says "Within the selected records, I want to select X records where XYZ is true, Y records where ABC is true and Z records where JKL is true". In order to do that, I add three fields which I use just for this purpose. Additionally I have a "key" field which I end up putting a text string "key" in that is provided by the client for each of these orders. Obviously if I do not have a "real table" I cannot do that. I need to process subsequent orders against previous orders and not send the same names / addresses out again (sometimes). All of which brings up this point. What I am doing is extremely complex and cannot be adequately described in a single email. I do what I do because I am doing some pretty complicated processing for the client. I am not looking for advice on how to modify that process, if for no other reason than without a multi-day class on the subject, it is not possible for you to understand what these processes are never mind how to make them more efficient. I would LOVE to make them more efficient, but that (as they say) is "beyond the scope of this thread". In the end everyone will use dynamic SQL for their own reasons and their own processes. What is important is not a process redesign, but a "best practices" discussion on HOW to use dynamic SQL when it is required. John W. Colby www.ColbyConsulting.com David Lewis wrote: > Hi John: A few questions about how you do things: > > If I understand your first example correctly, you have database per client (is this one large table, or is it a full-blown relational db with many tables, etc.?). You extract subsets of records repeatedly from that large table. To do this you build a query (the view), which you use to populate a table (SubTable1). From that table you select a subset (SubTable1_a), and later you may select another subset (SubTable1_b). Each of these subsets you persist as physical tables. > In this example, why build the physical table (SubTable1)) when the virtual table (the view) will do the same thing? The subsets of the table (SubTable1_a and _b) that are actually sent on to the client can be thought of as 'WHERE xyz=123' clauses to the view that was originally used to populate SubTable1, no? If you kept a record of the where clauses, with other pertinent information, such as date, client, etc., you could reproduce the recordsets at will, and would not need to physically store the data, neither as SubTable1 or as its children _a or _b. > > I guess the essence of my question is why is it necessary to build a physical table, when you have a virtual table? > > d lewis