[AccessD] Number vs text data type

CYNTHIA SPELL CSPELL at jhuccp.org
Thu Oct 16 07:32:00 CDT 2003


>From the original poster...  

Wow, that was a little (lot??!) more than I expected!    

The Budget Number field we use is really an identifier, not a value.  (Perhaps that "old school" notion of using text unless an actual calculation must be performed was not so far off.)  In any event, while speed and storage space should always be considered, the field is an organizational format and it's quite possible that it could change.  So you've given me something to think about.

Thank you all for your responses.  They were most interesting and informative, as usual...

- cindy

>>> john at winhaven.net 10/15/03 11:39PM >>>
LOL!

Actually, I really appreciate these adventures in theory - although I would
imagine by now the original poster of this thread no longer cares! Although
most of these issues have no right or wrong, every now and again I do change
my outlook on an issue because of these threads. (I did it with regards to
RVBA a few a years ago.)

This thread has me re-thinking my guideline on how to type data. I've
appended the following exception:
Type data field by the data's purpose.
Numerical Type for numbers that are numerical values;
Text Type for numbers that are item identifiers;
Exception: if performance is an extremely important issue, then consider
storing numerical strings in a numerically typed field.

Thanks all! Its been another fun learning experience.

John B.

> -----Original Message-----
> From: accessd-bounces at databaseadvisors.com 
> [mailto:accessd-bounces at databaseadvisors.com]On Behalf Of William
> Hindman
> Sent: Wednesday, October 15, 2003 3:05 PM
> To: Access Developers discussion and problem solving
> Subject: Re: [AccessD] Number vs text data type
>
>
> "Why sacrifice any speed, to accomodate unknown future variables?" Drew
>
> ...said the code guru as he inserted two digit year dates into his new
> program back in '83 :))))))
>
> William Hindman
> "The future will not belong to those who are cynical or
> those who stand on the sidelines. The future will belong to those who
> have passion and are willing to work hard to make our country better."
> ...Paul Wellstone "The Conscience of a Liberal"
>
>
> ----- Original Message -----
> From: "Drew Wutka" <DWUTKA at marlow.com>
> To: "'Access Developers discussion and problem solving'"
> <accessd at databaseadvisors.com>
> Sent: Wednesday, October 15, 2003 3:11 PM
> Subject: RE: [AccessD] Number vs text data type
>
>
> > You are right, Zip Codes could change into anything.  But that
> applies to
> > everything we know.  Should we make EVERY field a string?  You can only
> code
> > for the present, and predicted future.  The post office isn't going to
> > change the Zip Code system (at least not the basic 5 digits) without a
> > decent amount of lead/warning time.  They have systems they'd have to
> modify
> > also.
> >
> > Also, you are right, with today's processors, speed is not as critical.
> But
> > the question is, when faced with a factor of 1.5x to 2x faster,
> do you go
> > with the slower process, or the faster process?  Isn't it
> better to stick
> > with the faster process?  Why sacrifice any speed, to accomodate unknown
> > future variables?
> >
> > Drew
> >
> > -----Original Message-----
> > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com] 
> > Sent: Wednesday, October 15, 2003 1:47 PM
> > To: 'Access Developers discussion and problem solving'
> > Subject: RE: [AccessD] Number vs text data type
> >
> >
> > Ah but...
> >
> > one day in the future the design of zip codes (which is only ONE example
> of
> > a "number" that is not a number) could well be changed to something
> utterly
> > different that looks nothing like a number. You can't just keep adding
> > fields and complex put-it-all-back-together routines to reconstruct the
> > original string from a combination of strings and numbers. Well you can,
> but
> > it might make life miserable as "the management" make more wacky
> decisions.
> >
> > As for searching for string vs numbers. Here are some results
> on a test Db
> I
> > threw together.
> >
> > Using a table with 581,672 records: each record has a longint
> field and a
> > string field. The long field is populated with random long
> integer values,
> > and the string field has the string representation of its corresponding
> > integer field. Both fields are indexed with duplicates allowed.
> >
> > The table is in an MDB file (44.9 MBytes) which is on a network
> drive, and
> > the network is running at just 10-mbit. So then I ran the following code
> to
> > search for random 'numbers' both as numbers and as strings...
> >
> > Sub testSearch()
> > Dim n As Long
> > Dim s As String
> >
> > Dim TextSearchStart As Single, TextSearchEnd As Single
> > Dim NumericSearchStart As Single, NumericSearchEnd As Single
> > Dim TextSearchTime As Single
> > Dim NumericSearchTime As Single
> > Dim x As Long
> > Dim strHits As Long
> > Dim intHits As Long
> > Dim db As Database
> > Dim rs As Recordset
> > Dim sCriteria As String
> >     Randomize
> >     Set rs = OpenDynaset(db, "tblHuge")
> >     For x = 1 To 1000
> >         With rs
> >             n = CLng(Rnd() * 2 ^ 31)
> >             s = CStr(n)
> >
> >             sCriteria = "nNumber=" & n
> >             NumericSearchStart = Timer
> >             .FindFirst sCriteria
> >             intHits = intHits + Abs(CLng(Not .NoMatch))
> >             NumericSearchEnd = Timer
> >             NumericSearchTime = NumericSearchTime + (NumericSearchEnd -
> > NumericSearchStart)
> >             sCriteria = "sText='" & s & "'"
> >             TextSearchStart = Timer
> >             .FindFirst sCriteria
> >             strHits = strHits + Abs(CLng(Not .NoMatch))
> >             TextSearchEnd = Timer
> >             TextSearchTime = TextSearchTime + (TextSearchEnd -
> > TextSearchStart)
> >         End With
> >     Next x
> >     rs.Close
> >     Debug.Print "Average find times"
> >     Debug.Print "Numbers:" & Format(NumericSearchTime / 1000,
> > "0.0000000000") & " Hits:" & intHits
> >     Debug.Print "Strings:" & Format(TextSearchTime / 1000,
> "0.0000000000")
> &
> > " Hits:" & strHits
> >     Set db = Nothing
> >     Set rs = Nothing
> > End Sub
> >
> >
> > And here are the results of a few runs, last to first...
> >
> > Average find times
> > Numbers:0.0020859370 Hits:0
> > Strings:0.0025468750 Hits:0
> > Average find times
> > Numbers:0.0023203120 Hits:0
> > Strings:0.0030234380 Hits:0
> > Average find times
> > Numbers:0.0023984380 Hits:0
> > Strings:0.0040195310 Hits:0
> > Average find times
> > Numbers:0.0029257810 Hits:0
> > Strings:0.0046445310 Hits:0
> >
> > As you can see the random search values never generated any
> hits - not too
> > surprising considering the number of possible values in 32bit long.
> >
> > The numeric search times were fairly consistent, and the string search
> times
> > steadily dropped. I assume that there is some sort of caching going on
> here.
> > But the main point is that at worst it took just about twice as long to
> > search for the string as it did to find the long. Not *such* as huge
> > performance hit considering we are talking about thousandths of
> a second.
> > There is also a fair bit or arithmetic going on between each search.
> >
> > I goes without say (so why am I saying it?) that whne yo take out the
> > indexes on the fields the search times go though the roof (for the Long
> > field as well as the string one).
> >
> > Lambert
> >
> > > -----Original Message-----
> > > From: Drew Wutka [SMTP:DWUTKA at marlow.com] 
> > > Sent: Wednesday, October 15, 2003 12:26 PM
> > > To: 'Access Developers discussion and problem solving'
> > > Subject: RE: [AccessD] Number vs text data type
> > >
> > > Ah, but you can always add fields to handle new anomalies!
> With the zip
> > > code, just add an 'extension' field.  Original Number field is left
> intact
> > > (and all 'search' routines) would still work.
> > >
> > > Drew
> > >
> > > -----Original Message-----
> > > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com] 
> > > Sent: Wednesday, October 15, 2003 8:38 AM
> > > To: 'Access Developers discussion and problem solving'
> > > Subject: RE: [AccessD] Number vs text data type
> > >
> > >
> > > This thread has been going on quite some time now, so why should I let
> it
> > > stop :-)
> > >
> > > One key issue that which has not yet been mentioned but should be
> > > considered
> > > is is there any possibility that the design / structure of the
> "numerical"
> > > identifier might change?
> > >
> > > To take the example of US Zip codes. Once upon a time there
> were no Zip
> > > codes, then along came 5 digit zips, and a little later the
> hyphen plus
> > > four
> > > digit extensions. This kind of change could happen at any
> point in time
> > > because at the end of the day a postal code is an *arbitrary*
> identifier.
> > > In
> > > other words there is no arithmetic meaning in a zip code, so it should
> not
> > > be stored as a number. You CAN store a 5 digit zip code as a Long Int,
> but
> > > you'll regret it if "the powers that be" decide to change things later
> on.
> > >
> > > As for searching for strings in a sorted list, while it is slower to
> > > compare
> > > a whole string with another whole string, vs. comparing
> numeric values,
> to
> > > search a sorted list really does not take that long as a
> standard binary
> > > chop approach is going to zero in very quickly on the target
> string. (If
> > > my
> > > mental arithmetic is up to it, searching for a string in a list of one
> > > billion (10^9) strings would require 30 comparisons at most.) In
> addition,
> > > the initial stages of the binary chop only need to compare the first
> > > character (or first few) in the string with the first character (or
> first
> > > few) of the target, not much slower than integer data comparisons. And
> > > it's
> > > just as simple to figure out if a target is within some range.
> > >
> > >
> > >
> > > Lambert
> > >
> > > > -----Original Message-----
> > > > From: Drew Wutka [SMTP:DWUTKA at marlow.com] 
> > > > Sent: Tuesday, October 14, 2003 5:18 PM
> > > > To: 'Access Developers discussion and problem solving'
> > > > Subject: RE: [AccessD] Number vs text data type
> > > >
> > > > Yes, but if something can be stored JUST as a number, it is
> not only a
> > > > storage space issue, but a speed issue.  If you took just
> the 5 digit
> > > Zip
> > > > code, (ignoring any extensions), and you wanted to search for
> addresses
> > > in
> > > > the range of 70000 to 80000, if the numbers are stored as text, the
> > > > processor is going to have to compare each digit of the number,
> because
> > > > it's
> > > > looking at ASCII (or unicode) and NOT a number.  However, if it's
> stored
> > > > as
> > > > a number, then it's a VERY simple check to determine if it falls in
> that
> > > > range!
> > > >
> > > > Drew
> > > >
> > > > -----Original Message-----
> > > > From: Charlotte Foust [mailto:cfoust at infostatsystems.com] 
> > > > Sent: Monday, October 13, 2003 10:48 AM
> > > > To: Access Developers discussion and problem solving
> > > > Subject: RE: [AccessD] Number vs text data type
> > > >
> > > >
> > > > Same holds true for social security numbers in the US and for
> telephone
> > > > numbers everywhere.  There are good reasons NOT to use numbers for
> some
> > > > kinds of numeric data, which is probably where the argument
> comes from
> > > > in the first place, that and the old approach of squeezing
> everything
> > > > into the smallest possible datatype to shave storage bytes.
> > > >
> > > > Charlotte Foust
> > > >
> > > > -----Original Message-----
> > > > From: Susan Harkins [mailto:ssharkins at bellsouth.net] 
> > > > Sent: Monday, October 13, 2003 7:32 AM
> > > > To: Access Developers discussion and problem solving
> > > > Subject: Re: [AccessD] Number vs text data type
> > > >
> > > >
> > > > What about Zip Codes, etc... what possible purpose would
> you have for
> > > > treating such an entry as a number?
> > > >
> > > > I'm afraid this isn't old school -- it's still very alive
> and with us.
> > > >
> > > > Susan H.
> > > >
> > > >
> > > > > That sounds like one of my co-workers who comes from the
> > > > > old school early-days C and COBOL programming. He makes everything
> > > > > text unless an actual calculation must be performed, even
> when that
> > > > > doesn't necessarily make sense in context. I believe that if its a
> > > > > number it should be treated as a number unless there is a very
> > > > > compelling reason not to, for all of the reasons given so far and
> > > > > probably others I haven't thought of. To me, logic tells
> > > > > you that if the data is numeric you should use a number
> > > > > data type, using text just muddies things up. As far as a
> > > > > number field needing "additional resources", especially
> > > > > for calculations, that makes no sense at all to me. Some
> > > > > number types may take additional storage space as compared
> > > > > to holding the same digit characters in a text type, and
> > > > > that can become an issue for DBAs when they are working
> > > > > under short drive space conditions. Other than, I can't
> > > > > think of a resource reason.
> > > > >
> > > > > Ron
> > > > >
> > > > > On Mon, 13 Oct 2003 10:42:15 -0400
> > > > >   CYNTHIA SPELL <CSPELL at jhuccp.org> wrote:
> > > > > >I will, that's a good idea.  What I've heard so far is
> > > > > >that logic tells you that if you don't need to calculate, you
> should
> > > > > >use text.  And that a number data type requires additional
> resources
> > > > > >in order for the field to have the ability to do the calculating.
> > > > > _______________________________________________
> > > > > AccessD mailing list
> > > > > AccessD at databaseadvisors.com 
> > > > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > > > Website: http://www.databaseadvisors.com 
> > > > >
> > > >
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com 
> > > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > > Website: http://www.databaseadvisors.com 
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com 
> > > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > > Website: http://www.databaseadvisors.com 
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com 
> > > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > > Website: http://www.databaseadvisors.com 
> > > _______________________________________________
> > > AccessD mailing list
> > > AccessD at databaseadvisors.com 
> > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > Website: http://www.databaseadvisors.com 
> > > _______________________________________________
> > > AccessD mailing list
> > > AccessD at databaseadvisors.com 
> > > http://databaseadvisors.com/mailman/listinfo/accessd 
> > > Website: http://www.databaseadvisors.com 
> > _______________________________________________
> > AccessD mailing list
> > AccessD at databaseadvisors.com 
> > http://databaseadvisors.com/mailman/listinfo/accessd 
> > Website: http://www.databaseadvisors.com 
> > _______________________________________________
> > AccessD mailing list
> > AccessD at databaseadvisors.com 
> > http://databaseadvisors.com/mailman/listinfo/accessd 
> > Website: http://www.databaseadvisors.com 
> >
>
>
> _______________________________________________
> AccessD mailing list
> AccessD at databaseadvisors.com 
> http://databaseadvisors.com/mailman/listinfo/accessd 
> Website: http://www.databaseadvisors.com 
>
>


_______________________________________________
AccessD mailing list
AccessD at databaseadvisors.com 
http://databaseadvisors.com/mailman/listinfo/accessd 
Website: http://www.databaseadvisors.com


More information about the AccessD mailing list