[AccessD] fast checksum

Sat Feb 23 10:55:59 CST 2008

I do some caching of records in class structures so that I can get much
higher speed decision making that if I had to go get specific records out of
a new recordset every time I need it.  Let me give an example.

I have a lookup table called tlkpProductType.  The products are a dynamic
list, i.e. products are added all the time. 
I have a second lookup table called tlkpProcess.  The processes are a
dynamic list, i.e. processes can be added all the time.
I then have a many to many table tmmProductUsesProcess.  Basically this is
just a table that looks like:

PUP_ID	PUP_IDPRD	PUP_IDProc	Active
1			23		12	True
2			23		5	True
3			13		12	True

And so forth.  I actually fill in this table with an append where I pull the
Product table (id) in and the process type table (id) in and append to the
ProductUsesProcess table.  Because there is no join of any kind between the
two tables I get a record for each product for each process.  The user then
goes in and turns on the Active flag for specifically what product uses what
process by setting that PUP record active. 

I use this to dynamically turn on and off the Processes that can be run by a
specific product, and in fact my client can make these decisions on his own.
The processes are mine, things like "Print report XYZ" or "Display tab ABC",
the user can see the processes and a description of them and can then say
"yea, product XYZ uses Displays this tab in the main form".

It works well.

However because of the fact that there are (at this instant for this
example) 43 products and 53 processes, there are 43 x 53 records in
tmmProductUsesProcess.  Obviously these records (decisions) are used
EVERYWHERE in the program so I cache them in a pair of classes.  One
clsPUPSupervisor class pulls the PUP records and stores each record in a
clsPUP class, and stores the clsPUP instances in a collection.
clsPUPSupervisor then has methods to answer the question PUP(lngProductID,
lngProcessID) - is lngProductID used in the process represented by
lngProcessID by iterating through the clsPUP collection and examining the
data.  

That works swimmingly, I can get the answer in a VERY fast manner, typically
(on my specific system at the client) between .06 and .08 msec which is WAY
faster than pulling a query record and looking at that.

So, I am caching things, and this is just one example, I have a few other
similar types of structures where the data does not change often but it is
used a lot and would take a much longer time to pull out of a query every
time it is used.

Now the rub, the data can change.  In the example above it only changes when
I create new processes, and I can tell the users to get out and back in to
refresh the cache.  However I have always wanted a method for the cache
supervisor to be able to sense a change in the data that is cached and
rebuild the cache when needed.  I am looking at building a method of
chksumming the values in each ID field and storing the checksums, then be
able to go back later and checksum again and compare against the previous
checksums.  The cache supervisor would have to decide how often and when to
check the checksums.  Obviously if the checksum calcs take 30 ms and the
cache access takes only .06 ms you do not want to do the check too often!!

So I am looking for ideas on how to get a "checksum" without taking too much
time.  I am currently using a sum() of each ID which does what it does
pretty fact but a sum of the values is not a particularly rigorous check.  I
also just do a brute force count() and if the count of the records changes
then the cache needs to be refreshed.  Both of these are "OK" but I really
want something FAST but better than Sum().  If anyone has ideas I am
definitely listening.

John W. Colby
Colby Consulting
www.ColbyConsulting.com