John B.
john at winhaven.net
Wed Oct 15 22:39:55 CDT 2003
LOL! Actually, I really appreciate these adventures in theory - although I would imagine by now the original poster of this thread no longer cares! Although most of these issues have no right or wrong, every now and again I do change my outlook on an issue because of these threads. (I did it with regards to RVBA a few a years ago.) This thread has me re-thinking my guideline on how to type data. I've appended the following exception: Type data field by the data's purpose. Numerical Type for numbers that are numerical values; Text Type for numbers that are item identifiers; Exception: if performance is an extremely important issue, then consider storing numerical strings in a numerically typed field. Thanks all! Its been another fun learning experience. John B. > -----Original Message----- > From: accessd-bounces at databaseadvisors.com > [mailto:accessd-bounces at databaseadvisors.com]On Behalf Of William > Hindman > Sent: Wednesday, October 15, 2003 3:05 PM > To: Access Developers discussion and problem solving > Subject: Re: [AccessD] Number vs text data type > > > "Why sacrifice any speed, to accomodate unknown future variables?" Drew > > ...said the code guru as he inserted two digit year dates into his new > program back in '83 :)))))) > > William Hindman > "The future will not belong to those who are cynical or > those who stand on the sidelines. The future will belong to those who > have passion and are willing to work hard to make our country better." > ...Paul Wellstone "The Conscience of a Liberal" > > > ----- Original Message ----- > From: "Drew Wutka" <DWUTKA at marlow.com> > To: "'Access Developers discussion and problem solving'" > <accessd at databaseadvisors.com> > Sent: Wednesday, October 15, 2003 3:11 PM > Subject: RE: [AccessD] Number vs text data type > > > > You are right, Zip Codes could change into anything. But that > applies to > > everything we know. Should we make EVERY field a string? You can only > code > > for the present, and predicted future. The post office isn't going to > > change the Zip Code system (at least not the basic 5 digits) without a > > decent amount of lead/warning time. They have systems they'd have to > modify > > also. > > > > Also, you are right, with today's processors, speed is not as critical. > But > > the question is, when faced with a factor of 1.5x to 2x faster, > do you go > > with the slower process, or the faster process? Isn't it > better to stick > > with the faster process? Why sacrifice any speed, to accomodate unknown > > future variables? > > > > Drew > > > > -----Original Message----- > > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com] > > Sent: Wednesday, October 15, 2003 1:47 PM > > To: 'Access Developers discussion and problem solving' > > Subject: RE: [AccessD] Number vs text data type > > > > > > Ah but... > > > > one day in the future the design of zip codes (which is only ONE example > of > > a "number" that is not a number) could well be changed to something > utterly > > different that looks nothing like a number. You can't just keep adding > > fields and complex put-it-all-back-together routines to reconstruct the > > original string from a combination of strings and numbers. Well you can, > but > > it might make life miserable as "the management" make more wacky > decisions. > > > > As for searching for string vs numbers. Here are some results > on a test Db > I > > threw together. > > > > Using a table with 581,672 records: each record has a longint > field and a > > string field. The long field is populated with random long > integer values, > > and the string field has the string representation of its corresponding > > integer field. Both fields are indexed with duplicates allowed. > > > > The table is in an MDB file (44.9 MBytes) which is on a network > drive, and > > the network is running at just 10-mbit. So then I ran the following code > to > > search for random 'numbers' both as numbers and as strings... > > > > Sub testSearch() > > Dim n As Long > > Dim s As String > > > > Dim TextSearchStart As Single, TextSearchEnd As Single > > Dim NumericSearchStart As Single, NumericSearchEnd As Single > > Dim TextSearchTime As Single > > Dim NumericSearchTime As Single > > Dim x As Long > > Dim strHits As Long > > Dim intHits As Long > > Dim db As Database > > Dim rs As Recordset > > Dim sCriteria As String > > Randomize > > Set rs = OpenDynaset(db, "tblHuge") > > For x = 1 To 1000 > > With rs > > n = CLng(Rnd() * 2 ^ 31) > > s = CStr(n) > > > > sCriteria = "nNumber=" & n > > NumericSearchStart = Timer > > .FindFirst sCriteria > > intHits = intHits + Abs(CLng(Not .NoMatch)) > > NumericSearchEnd = Timer > > NumericSearchTime = NumericSearchTime + (NumericSearchEnd - > > NumericSearchStart) > > sCriteria = "sText='" & s & "'" > > TextSearchStart = Timer > > .FindFirst sCriteria > > strHits = strHits + Abs(CLng(Not .NoMatch)) > > TextSearchEnd = Timer > > TextSearchTime = TextSearchTime + (TextSearchEnd - > > TextSearchStart) > > End With > > Next x > > rs.Close > > Debug.Print "Average find times" > > Debug.Print "Numbers:" & Format(NumericSearchTime / 1000, > > "0.0000000000") & " Hits:" & intHits > > Debug.Print "Strings:" & Format(TextSearchTime / 1000, > "0.0000000000") > & > > " Hits:" & strHits > > Set db = Nothing > > Set rs = Nothing > > End Sub > > > > > > And here are the results of a few runs, last to first... > > > > Average find times > > Numbers:0.0020859370 Hits:0 > > Strings:0.0025468750 Hits:0 > > Average find times > > Numbers:0.0023203120 Hits:0 > > Strings:0.0030234380 Hits:0 > > Average find times > > Numbers:0.0023984380 Hits:0 > > Strings:0.0040195310 Hits:0 > > Average find times > > Numbers:0.0029257810 Hits:0 > > Strings:0.0046445310 Hits:0 > > > > As you can see the random search values never generated any > hits - not too > > surprising considering the number of possible values in 32bit long. > > > > The numeric search times were fairly consistent, and the string search > times > > steadily dropped. I assume that there is some sort of caching going on > here. > > But the main point is that at worst it took just about twice as long to > > search for the string as it did to find the long. Not *such* as huge > > performance hit considering we are talking about thousandths of > a second. > > There is also a fair bit or arithmetic going on between each search. > > > > I goes without say (so why am I saying it?) that whne yo take out the > > indexes on the fields the search times go though the roof (for the Long > > field as well as the string one). > > > > Lambert > > > > > -----Original Message----- > > > From: Drew Wutka [SMTP:DWUTKA at marlow.com] > > > Sent: Wednesday, October 15, 2003 12:26 PM > > > To: 'Access Developers discussion and problem solving' > > > Subject: RE: [AccessD] Number vs text data type > > > > > > Ah, but you can always add fields to handle new anomalies! > With the zip > > > code, just add an 'extension' field. Original Number field is left > intact > > > (and all 'search' routines) would still work. > > > > > > Drew > > > > > > -----Original Message----- > > > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com] > > > Sent: Wednesday, October 15, 2003 8:38 AM > > > To: 'Access Developers discussion and problem solving' > > > Subject: RE: [AccessD] Number vs text data type > > > > > > > > > This thread has been going on quite some time now, so why should I let > it > > > stop :-) > > > > > > One key issue that which has not yet been mentioned but should be > > > considered > > > is is there any possibility that the design / structure of the > "numerical" > > > identifier might change? > > > > > > To take the example of US Zip codes. Once upon a time there > were no Zip > > > codes, then along came 5 digit zips, and a little later the > hyphen plus > > > four > > > digit extensions. This kind of change could happen at any > point in time > > > because at the end of the day a postal code is an *arbitrary* > identifier. > > > In > > > other words there is no arithmetic meaning in a zip code, so it should > not > > > be stored as a number. You CAN store a 5 digit zip code as a Long Int, > but > > > you'll regret it if "the powers that be" decide to change things later > on. > > > > > > As for searching for strings in a sorted list, while it is slower to > > > compare > > > a whole string with another whole string, vs. comparing > numeric values, > to > > > search a sorted list really does not take that long as a > standard binary > > > chop approach is going to zero in very quickly on the target > string. (If > > > my > > > mental arithmetic is up to it, searching for a string in a list of one > > > billion (10^9) strings would require 30 comparisons at most.) In > addition, > > > the initial stages of the binary chop only need to compare the first > > > character (or first few) in the string with the first character (or > first > > > few) of the target, not much slower than integer data comparisons. And > > > it's > > > just as simple to figure out if a target is within some range. > > > > > > > > > > > > Lambert > > > > > > > -----Original Message----- > > > > From: Drew Wutka [SMTP:DWUTKA at marlow.com] > > > > Sent: Tuesday, October 14, 2003 5:18 PM > > > > To: 'Access Developers discussion and problem solving' > > > > Subject: RE: [AccessD] Number vs text data type > > > > > > > > Yes, but if something can be stored JUST as a number, it is > not only a > > > > storage space issue, but a speed issue. If you took just > the 5 digit > > > Zip > > > > code, (ignoring any extensions), and you wanted to search for > addresses > > > in > > > > the range of 70000 to 80000, if the numbers are stored as text, the > > > > processor is going to have to compare each digit of the number, > because > > > > it's > > > > looking at ASCII (or unicode) and NOT a number. However, if it's > stored > > > > as > > > > a number, then it's a VERY simple check to determine if it falls in > that > > > > range! > > > > > > > > Drew > > > > > > > > -----Original Message----- > > > > From: Charlotte Foust [mailto:cfoust at infostatsystems.com] > > > > Sent: Monday, October 13, 2003 10:48 AM > > > > To: Access Developers discussion and problem solving > > > > Subject: RE: [AccessD] Number vs text data type > > > > > > > > > > > > Same holds true for social security numbers in the US and for > telephone > > > > numbers everywhere. There are good reasons NOT to use numbers for > some > > > > kinds of numeric data, which is probably where the argument > comes from > > > > in the first place, that and the old approach of squeezing > everything > > > > into the smallest possible datatype to shave storage bytes. > > > > > > > > Charlotte Foust > > > > > > > > -----Original Message----- > > > > From: Susan Harkins [mailto:ssharkins at bellsouth.net] > > > > Sent: Monday, October 13, 2003 7:32 AM > > > > To: Access Developers discussion and problem solving > > > > Subject: Re: [AccessD] Number vs text data type > > > > > > > > > > > > What about Zip Codes, etc... what possible purpose would > you have for > > > > treating such an entry as a number? > > > > > > > > I'm afraid this isn't old school -- it's still very alive > and with us. > > > > > > > > Susan H. > > > > > > > > > > > > > That sounds like one of my co-workers who comes from the > > > > > old school early-days C and COBOL programming. He makes everything > > > > > text unless an actual calculation must be performed, even > when that > > > > > doesn't necessarily make sense in context. I believe that if its a > > > > > number it should be treated as a number unless there is a very > > > > > compelling reason not to, for all of the reasons given so far and > > > > > probably others I haven't thought of. To me, logic tells > > > > > you that if the data is numeric you should use a number > > > > > data type, using text just muddies things up. As far as a > > > > > number field needing "additional resources", especially > > > > > for calculations, that makes no sense at all to me. Some > > > > > number types may take additional storage space as compared > > > > > to holding the same digit characters in a text type, and > > > > > that can become an issue for DBAs when they are working > > > > > under short drive space conditions. Other than, I can't > > > > > think of a resource reason. > > > > > > > > > > Ron > > > > > > > > > > On Mon, 13 Oct 2003 10:42:15 -0400 > > > > > CYNTHIA SPELL <CSPELL at jhuccp.org> wrote: > > > > > >I will, that's a good idea. What I've heard so far is > > > > > >that logic tells you that if you don't need to calculate, you > should > > > > > >use text. And that a number data type requires additional > resources > > > > > >in order for the field to have the ability to do the calculating. > > > > > _______________________________________________ > > > > > AccessD mailing list > > > > > AccessD at databaseadvisors.com > > > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > > > Website: http://www.databaseadvisors.com > > > > > > > > > > > > > _______________________________________________ > > > > AccessD mailing list > > > > AccessD at databaseadvisors.com > > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > > Website: http://www.databaseadvisors.com > > > > _______________________________________________ > > > > AccessD mailing list > > > > AccessD at databaseadvisors.com > > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > > Website: http://www.databaseadvisors.com > > > > _______________________________________________ > > > > AccessD mailing list > > > > AccessD at databaseadvisors.com > > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > > Website: http://www.databaseadvisors.com > > > _______________________________________________ > > > AccessD mailing list > > > AccessD at databaseadvisors.com > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > Website: http://www.databaseadvisors.com > > > _______________________________________________ > > > AccessD mailing list > > > AccessD at databaseadvisors.com > > > http://databaseadvisors.com/mailman/listinfo/accessd > > > Website: http://www.databaseadvisors.com > > _______________________________________________ > > AccessD mailing list > > AccessD at databaseadvisors.com > > http://databaseadvisors.com/mailman/listinfo/accessd > > Website: http://www.databaseadvisors.com > > _______________________________________________ > > AccessD mailing list > > AccessD at databaseadvisors.com > > http://databaseadvisors.com/mailman/listinfo/accessd > > Website: http://www.databaseadvisors.com > > > > > _______________________________________________ > AccessD mailing list > AccessD at databaseadvisors.com > http://databaseadvisors.com/mailman/listinfo/accessd > Website: http://www.databaseadvisors.com > >