[AccessD] Optimizing subqueries

Gustav Brock Gustav at cactus.dk
Tue Jun 3 01:28:09 CDT 2008


Hi Drew

Good example.
At least two variations exist.

One variation on this in Access is to move the subquery "tables" to separate queries. In most cases - also yours - you already have created and tested the subquery on its own to verify that the output is as expected. Then leave that as a saved query, here it could by qdyExtraData:

  SELECT 
    PrimaryKey 
  FROM 
    tblExtraData 
  WHERE 
    SomeOtherKey=1 
    AND
    SomeCriteriaField="Hello World"

and the main query will now read:

  SELECT 
    tblPrimaryData.PrimaryKey, 
    MorePrimaryData
  FROM 
    qdyExtraData
  INNER JOIN 
    tblPrimaryData ON
      qdyExtraData.PrimaryKey=tblPrimaryData.PrimaryKey

The advantage is that it is much easier to read and debug and that the "subquery" can be reused in other queries.

The other variation is that from the "Old SQL School": Temporary tables. In some cases - indeed for update and delete queries - this is so much faster that it is hard to believe but many Access developers have no tradition for this technique.
In your case you would write the output of the helper query to the temp table, tblExtraData, and then use this as a source:

  SELECT 
    tblPrimaryData.PrimaryKey, 
    MorePrimaryData
  FROM 
    tblExtraData
  INNER JOIN 
    tblPrimaryData ON
      tblExtraData.PrimaryKey=tblPrimaryData.PrimaryKey

/gustav


>>> DWUTKA at marlow.com 03-06-2008 01:56 >>>
Just thought I'd share something I ran into today.
 
I use subqueries a lot.  I am putting the finishing touches on that
'online form' system I mentioned a while back.  Last week, I finished
the 'search' system.  To give a little background, this form system
allows the user to create their own forms (creating their own fields for
the form, etc.).  These fields are actually defined within the
tables....not like fields in a table, but data in the tables.  ie, I
have a tblFormFields table, which defines the forms ID, Name, Type
(which implies data type), ExtraInfo, etc.  I then have a handful of
data tables, which hold the data. (Each table is for a specific data
type, so the field type determines which table the data is stored in).
The data is stored with a DataEntryID (which represents another table,
recording when and who is entering the data (also allows for roll back
capability)).  On top of that, I have a CurrentData table, which holds
the DataEntryID and FieldID of the most current data for a Form Entry.
Ya, ok, that all may sound complex.  It is, and so building a search
routine that queries that structure was a lot of fun, though not as
difficult as I initially thought.  Initial development testing was only
done against a handful of records though..and today I am 'load testing'
the system.  I dumped ~16,000 form entries into the system.  Everything
was working great, EXCEPT the search feature.  Running a search on that
many records was taking FOREVER (I was killing the process after about
60 seconds).
 
The query that was I began troubleshooting this with should return 19
records.  There are two criteria (one that returns about 280 records,
and one that returns about 1500 records, the mix of the two should
return 19 records).  The query displays a handful of fields from one
table (tblFormEntries) which shows the basic info on a form entry (who
created the form, when it was created, what form it is form, etc.),
along with a few joins to show extra information (like linking the
UserID for the form creator to the tblUser table to get their name), and
one join for criteria (tying to the tblRootForms table (because this
system allows for multiple form revisions)).  So that initial part of
the query would return ~16k records.  I then had two subqueries in the
WHERE statement (the one returning 280 records, and the one returning
~1500 records), which tied to the main portion through the FormEntryID.
Running the query like that, as I said, was taking forever, longer then
I was willing to wait.
 
However, running the subqueries on their own were pretty much
instantaneous.  So how do I get the Primary query to 'run them first',
so that it isn't trying each subquery 16,000 times for each
subquery........
 
A join statement!
 
I'm sure I've read about doing this somewhere, at sometime, but I had
never actually tried it.
 
Sure enough, by taking the subqueries OUT of the WHERE statement, and
putting them in the FROM clause, whalla, the data I needed was pulled
instantenously (which shows promise, since I plan on putting about 160k
form entries in for the full load testing!).
 
So, to give an example, let's say we have this table:
 
tblPrimaryData
PrimaryKey
MorePrimaryData
 
and this table:
 
tblExtraData
PrimaryKey
SomeOtherKey
SomeCriteriaField
 
Now let's say that tblPrimaryData has a million records, and
tblExtraData has a few hundred thousand.  Let's say that we want a
specific SomeOtherKey and specific SomeCriteriaField, and that produces
20 records (and due to proper indexes is instantaneous).
 
We could write our SQL Statement as such:
 
SELECT tblPrimaryData.PrimaryKey, MorePrimaryData
FROM (SELECT PrimaryKey FROM tblExtraData WHERE SomeOtherKey=1 AND
SomeCriteriaField="Hello World") AS C1 INNER JOIN tblPrimaryData ON
C1.PrimaryKey=tblPrimaryData.PrimaryKey
 
And Whalla, Jet is going to run that subquery first, producing the 20
records based on the subqueries criteria, and then it is ONLY going to
pull up those 20 records from tblPrimaryData.
 
Obviously the example above could be done with a regular join.  But the
principle is the same (My subqueries link 2 or 3 tables together to get
their results, so they had to be subqueries).  You can even then use
criteria from the subqueries (WHERE C1.AnotherField=3 (that field would
have to be in the subquery), and something else I thought was neat was
that Access lets you see a query like that in the Query Builder (it
lists the fields that are in the subquery, and would display it's name
as C1).
 
Anywho, just thought I'd share!
 
Drew




More information about the AccessD mailing list