Gustav Brock
Gustav at cactus.dk
Tue Mar 8 03:18:56 CST 2005
Hi John Thank you for bringing this. It's a good reminder that sometimes a temp table _is_ the choice and not just a quick work-around. /gustav >>> jwcolby at colbyconsulting.com 07-03-2005 19:34:47 >>> Well, I learned something new today. I have designed a data import system that pulls data from csv files into a raw data table. This data has a pair of IDs, a state ID and a Discipline ID which are FKs for a pair of tables. My live data table has these same fields. One of the objectives of this system is to compare personal data coming from state licensing boards to the personal data in our system. In order to do this I need to "narrow down" the live data and somehow pull a set of data from the "live data table" (our data) that corresponds one to one the "Raw data table" (the imported data). Originally I started by simply joining the STID, DIID and License Number from live to raw. It took FOREVER (as in 15 or 20 minutes) to return a data set when the live data has ~150K records and the raw data has ~5-10K records. While this result set was slow, it would allow changes made to the live data to be seen in the result set immediately. In search of something faster I discovered that if I ran a distinct query on the raw data pulling just the two ID fields and saved that as a query, I could then join THAT ID pair to the Live data table and pull a set of "potential matches" in less than 10 seconds. IOW, I would get all of the live data records that had the same state and discipline ID. I could then join the raw data to that data set on the license number and have a much faster result set that displayed the fields in live and the matching fields in raw. The end result set is fed to a form with a small subset of the fields displayed side by side (last name raw/live, first name raw/live etc) and then a class would decide if the two controls displayed different data and change the background color of the pair of controls if they were different to highlight that the data differed for those fields. All of which worked great. The problem is that if the data in live is changed to match raw (data coming from the state - address change for example) the change no longer displays in the form. Sigh. What I discovered today is that the "non update" is caused by the DISTINCT query, or a similar GroupedBy query. If I use either one of those two methods to select the set of live "potential matches" then the ultimate query will NOT display the changes made in the live data. However if I save that DISTINCT data (the state/disc IDs) in a table, then join the new table to live and raw, the resulting data set will display the changes in the live data table. In both cases (using the query directly or the saved data from the query) the resulting data set of live/raw is non-updateable, but in one case (using the saved ID data) changes in the underlying tables will be displayed in the result set, whereas if I just use the DSTINCT (or GROUPED BY) query the result set will never display changes in the underlying tables. So something about using a distinct or groupby causes the dataset to not display changes. I always knew the result sets were non-updateable but I was never aware that they wouldn't show updates either. John W. Colby www.ColbyConsulting.com