[AccessD] Optimizing subqueries

Drew Wutka DWUTKA at Marlow.com
Mon Jun 2 18:56:08 CDT 2008


Just thought I'd share something I ran into today.
 
I use subqueries a lot.  I am putting the finishing touches on that
'online form' system I mentioned a while back.  Last week, I finished
the 'search' system.  To give a little background, this form system
allows the user to create their own forms (creating their own fields for
the form, etc.).  These fields are actually defined within the
tables....not like fields in a table, but data in the tables.  ie, I
have a tblFormFields table, which defines the forms ID, Name, Type
(which implies data type), ExtraInfo, etc.  I then have a handful of
data tables, which hold the data. (Each table is for a specific data
type, so the field type determines which table the data is stored in).
The data is stored with a DataEntryID (which represents another table,
recording when and who is entering the data (also allows for roll back
capability)).  On top of that, I have a CurrentData table, which holds
the DataEntryID and FieldID of the most current data for a Form Entry.
Ya, ok, that all may sound complex.  It is, and so building a search
routine that queries that structure was a lot of fun, though not as
difficult as I initially thought.  Initial development testing was only
done against a handful of records though..and today I am 'load testing'
the system.  I dumped ~16,000 form entries into the system.  Everything
was working great, EXCEPT the search feature.  Running a search on that
many records was taking FOREVER (I was killing the process after about
60 seconds).
 
The query that was I began troubleshooting this with should return 19
records.  There are two criteria (one that returns about 280 records,
and one that returns about 1500 records, the mix of the two should
return 19 records).  The query displays a handful of fields from one
table (tblFormEntries) which shows the basic info on a form entry (who
created the form, when it was created, what form it is form, etc.),
along with a few joins to show extra information (like linking the
UserID for the form creator to the tblUser table to get their name), and
one join for criteria (tying to the tblRootForms table (because this
system allows for multiple form revisions)).  So that initial part of
the query would return ~16k records.  I then had two subqueries in the
WHERE statement (the one returning 280 records, and the one returning
~1500 records), which tied to the main portion through the FormEntryID.
Running the query like that, as I said, was taking forever, longer then
I was willing to wait.
 
However, running the subqueries on their own were pretty much
instantaneous.  So how do I get the Primary query to 'run them first',
so that it isn't trying each subquery 16,000 times for each
subquery........
 
A join statement!
 
I'm sure I've read about doing this somewhere, at sometime, but I had
never actually tried it.
 
Sure enough, by taking the subqueries OUT of the WHERE statement, and
putting them in the FROM clause, whalla, the data I needed was pulled
instantenously (which shows promise, since I plan on putting about 160k
form entries in for the full load testing!).
 
So, to give an example, let's say we have this table:
 
tblPrimaryData
PrimaryKey
MorePrimaryData
 
and this table:
 
tblExtraData
PrimaryKey
SomeOtherKey
SomeCriteriaField
 
Now let's say that tblPrimaryData has a million records, and
tblExtraData has a few hundred thousand.  Let's say that we want a
specific SomeOtherKey and specific SomeCriteriaField, and that produces
20 records (and due to proper indexes is instantaneous).
 
We could write our SQL Statement as such:
 
SELECT tblPrimaryData.PrimaryKey, MorePrimaryData
FROM (SELECT PrimaryKey FROM tblExtraData WHERE SomeOtherKey=1 AND
SomeCriteriaField="Hello World") AS C1 INNER JOIN tblPrimaryData ON
C1.PrimaryKey=tblPrimaryData.PrimaryKey
 
And Whalla, Jet is going to run that subquery first, producing the 20
records based on the subqueries criteria, and then it is ONLY going to
pull up those 20 records from tblPrimaryData.
 
Obviously the example above could be done with a regular join.  But the
principle is the same (My subqueries link 2 or 3 tables together to get
their results, so they had to be subqueries).  You can even then use
criteria from the subqueries (WHERE C1.AnotherField=3 (that field would
have to be in the subquery), and something else I thought was neat was
that Access lets you see a query like that in the Query Builder (it
lists the fields that are in the subquery, and would display it's name
as C1).
 
Anywho, just thought I'd share!
 
Drew
 
 

The information contained in this transmission is intended only for the person or entity to which it is addressed and may contain II-VI Proprietary and/or II-VI Business Sensitive material. If you are not the intended recipient, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. You are notified that any review, retransmission, copying, disclosure, dissemination, or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited.



More information about the AccessD mailing list