[AccessD] Find First in an Array?

jwcolby jwcolby at colbyconsulting.com
Fri Feb 20 13:02:48 CST 2009


Jim,

 >  So if your not comfortable with it then just ignore it? Humm.

I was making a general observation, which I believe is true.  People tend to use the tools they know 
and are expert with, and only go outside of their comfort zone when the tool fails them.  I freely 
acknowledge that this is true for me, and I am not accusing you of anything.

 >   No, that is not correct.  A seek on an open table won't cache the entire recordset in of 
itself.  What might be sitting in JET's cache is a few buffers worth of index pages and a couple of 
buffers worth of data pages, but I'd highly doubt you'd have the entire thing unless you just went 
through every single record or used them a lot in comparison to everything else (JET's cache is LRU 
based, so the pages could add up in the cache over time).

By definition you have to have all of the data that is used for doing any particular form.

 >Since a language table would be local to the FE, this would not be a pull over the wire and it 
would be quite fast.

It might be in your systems but it would not be in mine.  I use a BE because that is where data 
goes.  I personally do not have ANY local tables in the FE except for temp tables for processes.  I 
have many cases where there are different FEs for different purposes.  I am not going to get into 
the "this is a local table so it goes in the FE... oh damn, now I gotta go update the data in 5 
different FEs".  BEs are for data (in my world).

 >It's quite possible to that the disk drive would still have the data in its own cache, so you 
might not even end up doing a hard hit to the disk at all.

Unlikely.  The disk is also pulling forms, reports, running the browser as the user goofs off, 
working on buffering music that the user is listening to, and any of the other million things that 
go on in a Windows environment.  The probability that the data for a form opened sometime previously 
would still be in the disk buffer is pretty low I would guess.

 >   But with the seek approach, you would not be tying up a chunk of memory to be used for only one 
purpose.

I have to say that with Windows XP requiring 256 megs just for it to load, and Vista pushing twice 
that just for it to load, I am not worrying about caching my "seldom changing / frequently used" 
data.  If I take even a few hundred K to cache the language strings for every form in the system, I 
will not lose any sleep at all for doing that.  I would in fact just cache the form's opened...

 > It is when considering that point that I would not choose to do an array based approach either, 
which would be really simple to write as I could use GetRows() to dump the set into an array. Since 
the array would already be sorted, I could even do a B-tree search on it rather then a sequential 
one. That might even beat out a class (hard to say with VBA) in terms of speed.

Even if I used a Getrows() to dump into an array I would promptly move the data to a collection and 
get rid of the array.

I think that a lot of stuff goes into determining when to use what tools.  I do not have a b-tree 
search algorithm for a generic array but even if I did (and assuming that it was faster, doubtful), 
a collection is USUALLY fast enough that it wouldn't be worth the complexity of using such a tool, 
at least not on a routine basis.

I use collections because:

1) They are DEAD simple.  The syntax is simple, the usage is simple.
2) It is easy code to read and maintain.
3) They can hold a ton of stuff easily.
4) They can hold just about anything, from simple variables to objects and class instances (which 
are objects).
5) They are very fast.

If you think about it, Access uses collections EVERYWHERE, and I want to emphasize the EVERYWHERE. 
Every single thing that you do with Access from forms (a collection of) to controls on forms (a 
collection of), to properties of controls and forms (a collection of), to fields (a collection of), 
to querydefs (a collection of) etc etc ad nasium, are all stored internally to Access (and when 
loaded into memory) as collections.  Notice that Microsoft doesn't store all these items in tables. 
  If tables are so blazing fast, if all the things that you think are true (jet cache / disk cache 
etc) really are true, why does Access store everything in  collections instead of just leaving it in 
records on a disk?  Can you say programming / overhead nightmare?

Microsoft has spent a lot of time and energy optimizing collections in order to make Access itself 
fast.  IMHO, why would I NOT use what Microsoft uses and has spent so much effort to make fast and 
easy to use?


John W. Colby
www.ColbyConsulting.com


Jim Dettman wrote:
> John,
> 
> <<The bottom line is that to a large extent we all do what is comfortable to
> us as individual 
> programmers and to heck with ....>>
> 
>   So if your not comfortable with it then just ignore it? Humm.
> 
> <<Well...  Hmmm... your way caches the entire recordset and all its
> associated junk - indexes etc.  My 
> way caches exactly and only the strings for one specific language.>>
> 
>   No, that is not correct.  A seek on an open table won't cache the entire
> recordset in of itself.  What might be sitting in JET's cache is a few
> buffers worth of index pages and a couple of buffers worth of data pages,
> but I'd highly doubt you'd have the entire thing unless you just went
> through every single record or used them a lot in comparison to everything
> else (JET's cache is LRU based, so the pages could add up in the cache over
> time). 
> 
>   But even if nothing is in the cache, a seek is based on an index and since
> JET indexes are B-tree based, you'd have your record within 2-3 disk hits at
> most, especially on a 2500 record table.  Since a language table would be
> local to the FE, this would not be a pull over the wire and it would be
> quite fast.  It's quite possible to that the disk drive would still have the
> data in its own cache, so you might not even end up doing a hard hit to the
> disk at all.
> 
>   But if the data does happen to be in the JET cache, then I doubt there
> would be much of a difference between the two methods. I think the class
> approach would still be a tad faster, but not by much.  And let me be clear;
> it's not the class approach that I don't agree with but the point that as an
> application developer do I cache things in memory on my own or not?
> 
>   It is when considering that point that I would not choose to do an array
> based approach either, which would be really simple to write as I could use
> GetRows() to dump the set into an array. Since the array would already be
> sorted, I could even do a B-tree search on it rather then a sequential one.
> That might even beat out a class (hard to say with VBA) in terms of speed.
> 
>   But with the seek approach, you would not be tying up a chunk of memory to
> be used for only one purpose.
> 
>   From my viewpoint the seek approach is the best of both worlds, stuff gets
> cached if it is used a lot and if it is not, then it is not cached.  In that
> case, the memory gets used for something productive rather then sitting
> there.  And I would be really surprised if performance wasn't acceptable.
> 
> Jim.
>  
> 
> -----Original Message-----
> From: accessd-bounces at databaseadvisors.com
> [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
> Sent: Friday, February 20, 2009 9:44 AM
> To: Access Developers discussion and problem solving
> Subject: Re: [AccessD] Find First in an Array?
> 
> The bottom line is that to a large extent we all do what is comfortable to
> us as individual 
> programmers and to heck with ....
> 
>  >   You mentioned that JET always goes across the wire and that's not true.
> If a page is in the 
> local JET cache, then it won't.
> 
>  >   But if every programmer were to approach application design with the
> "load it into memory" 
> approach because it's fast, then very quickly you can find that the OS will
> start paging out to 
> disk.  So your really not in memory any more.
> 
> Well...  Hmmm... your way caches the entire recordset and all its associated
> junk - indexes etc.  My 
> way caches exactly and only the strings for one specific language.
> 
> However your way (done on a form by form basis on demand) sounds like it
> could be flushing the jet 
> cache as other processes run so that it may very well have to hit the disk
> again for the next form 
> to load.  At best it kind of sounds like "six of one, half a dozen of the
> other".
> 
>  > If I have an application that is a couple of hundred forms and user Jim D
> uses only one form, 
> then that's a big waste.
> 
> True, but if you cache the data as the form loads the first time then that
> goes away.  I was 
> proposing exactly that.  Now you have the best of both worlds, you don't use
> time and resources 
> until a form is actually loaded, but once it is loaded it can be loaded
> twice or a million times and 
> always be fast.  You should know from my JIT subforms I am all about doing
> stuff as / when needed.
> 
> Notice that I am not caching constantly changing data, nor data that doesn't
> change but is rarely 
> used, only data that rarely changes and is used constantly in the program.
> That "rarely changes / 
> constantly used" is not a rare occasion in complex systems.  To go to the
> disk dozens or hundreds or 
> thousands of times a day to get the same data over and over IMHO is just
> silly, when I can just 
> cache it and be done.  It is even sillier when caching it is a trivial
> programming exercise.  It is 
> sillier yet when caching the data significantly speeds up the program.
> 
> I learned a long time ago to organize my program into classes, each class
> performs a systemic task. 
>   I learned that when a set of data is in constant use in the program (and
> form translation fits 
> that bill) then I would build a class system to cache it and the program
> always works faster.  The 
> "cache class system" looks very similar from data set to data set.  It
> doesn't take me long to set 
> it up because I have done it so often.
> 
> So that is what I do.  I do understand that it is only fast / efficient for
> me because I use classes 
> all day every day, but so can any Access programmer as my lectures are
> intended to show.
> 
> 
> John W. Colby
> www.ColbyConsulting.com
> 
> 
> Jim Dettman wrote:
>> John,
>>
>>   I posted a couple of comments yesterday to you and Drew, which still
>> haven't shown up on the list, but I'll answer the question you just asked
>> with "It depends on how big the collection is".
>>
>>   What I wrote yesterday is that I would use a global recordset variable,
>> thus avoiding opening/closing the recordset repeatedly and I'd use seek.
>>
>>   You mentioned that JET always goes across the wire and that's not true.
>> If a page is in the local JET cache, then it won't.
>>
>>   But here that is really not an issue because a language translation
> table
>> is certainly going to be part of the front end, so the table will be
> local.
>> It's not going to change unless the app changes.
>>
>>   Rocky has been talking about pulling 2500 records; if all that is pulled
>> into memory, that is a fair sized chunk.  I'd much rather let the system
> use
>> that memory as it sees fit rather then tying it up with one specific task.
>> If I have an application that is a couple of hundred forms and user Jim D
>> uses only one form, then that's a big waste.  Of course you could mitigate
>> that somewhat by only loading the translation when the form loads and for
>> the specific language as you mentioned.
>>
>>   But if every programmer were to approach application design with the
> "load
>> it into memory" approach because it's fast, then very quickly you can find
>> that the OS will start paging out to disk.  So your really not in memory
> any
>> more.
>>
>>   That's why I think pulling something into memory like this is a waste.
>>
>> FWIW,
>> Jim.




More information about the AccessD mailing list