[AccessD] Number vs text data type

John B. john at winhaven.net
Wed Oct 15 22:39:55 CDT 2003


LOL!

Actually, I really appreciate these adventures in theory - although I would
imagine by now the original poster of this thread no longer cares! Although
most of these issues have no right or wrong, every now and again I do change
my outlook on an issue because of these threads. (I did it with regards to
RVBA a few a years ago.)

This thread has me re-thinking my guideline on how to type data. I've
appended the following exception:
Type data field by the data's purpose.
Numerical Type for numbers that are numerical values;
Text Type for numbers that are item identifiers;
Exception: if performance is an extremely important issue, then consider
storing numerical strings in a numerically typed field.

Thanks all! Its been another fun learning experience.

John B.

> -----Original Message-----
> From: accessd-bounces at databaseadvisors.com
> [mailto:accessd-bounces at databaseadvisors.com]On Behalf Of William
> Hindman
> Sent: Wednesday, October 15, 2003 3:05 PM
> To: Access Developers discussion and problem solving
> Subject: Re: [AccessD] Number vs text data type
>
>
> "Why sacrifice any speed, to accomodate unknown future variables?" Drew
>
> ...said the code guru as he inserted two digit year dates into his new
> program back in '83 :))))))
>
> William Hindman
> "The future will not belong to those who are cynical or
> those who stand on the sidelines. The future will belong to those who
> have passion and are willing to work hard to make our country better."
> ...Paul Wellstone "The Conscience of a Liberal"
>
>
> ----- Original Message -----
> From: "Drew Wutka" <DWUTKA at marlow.com>
> To: "'Access Developers discussion and problem solving'"
> <accessd at databaseadvisors.com>
> Sent: Wednesday, October 15, 2003 3:11 PM
> Subject: RE: [AccessD] Number vs text data type
>
>
> > You are right, Zip Codes could change into anything.  But that
> applies to
> > everything we know.  Should we make EVERY field a string?  You can only
> code
> > for the present, and predicted future.  The post office isn't going to
> > change the Zip Code system (at least not the basic 5 digits) without a
> > decent amount of lead/warning time.  They have systems they'd have to
> modify
> > also.
> >
> > Also, you are right, with today's processors, speed is not as critical.
> But
> > the question is, when faced with a factor of 1.5x to 2x faster,
> do you go
> > with the slower process, or the faster process?  Isn't it
> better to stick
> > with the faster process?  Why sacrifice any speed, to accomodate unknown
> > future variables?
> >
> > Drew
> >
> > -----Original Message-----
> > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com]
> > Sent: Wednesday, October 15, 2003 1:47 PM
> > To: 'Access Developers discussion and problem solving'
> > Subject: RE: [AccessD] Number vs text data type
> >
> >
> > Ah but...
> >
> > one day in the future the design of zip codes (which is only ONE example
> of
> > a "number" that is not a number) could well be changed to something
> utterly
> > different that looks nothing like a number. You can't just keep adding
> > fields and complex put-it-all-back-together routines to reconstruct the
> > original string from a combination of strings and numbers. Well you can,
> but
> > it might make life miserable as "the management" make more wacky
> decisions.
> >
> > As for searching for string vs numbers. Here are some results
> on a test Db
> I
> > threw together.
> >
> > Using a table with 581,672 records: each record has a longint
> field and a
> > string field. The long field is populated with random long
> integer values,
> > and the string field has the string representation of its corresponding
> > integer field. Both fields are indexed with duplicates allowed.
> >
> > The table is in an MDB file (44.9 MBytes) which is on a network
> drive, and
> > the network is running at just 10-mbit. So then I ran the following code
> to
> > search for random 'numbers' both as numbers and as strings...
> >
> > Sub testSearch()
> > Dim n As Long
> > Dim s As String
> >
> > Dim TextSearchStart As Single, TextSearchEnd As Single
> > Dim NumericSearchStart As Single, NumericSearchEnd As Single
> > Dim TextSearchTime As Single
> > Dim NumericSearchTime As Single
> > Dim x As Long
> > Dim strHits As Long
> > Dim intHits As Long
> > Dim db As Database
> > Dim rs As Recordset
> > Dim sCriteria As String
> >     Randomize
> >     Set rs = OpenDynaset(db, "tblHuge")
> >     For x = 1 To 1000
> >         With rs
> >             n = CLng(Rnd() * 2 ^ 31)
> >             s = CStr(n)
> >
> >             sCriteria = "nNumber=" & n
> >             NumericSearchStart = Timer
> >             .FindFirst sCriteria
> >             intHits = intHits + Abs(CLng(Not .NoMatch))
> >             NumericSearchEnd = Timer
> >             NumericSearchTime = NumericSearchTime + (NumericSearchEnd -
> > NumericSearchStart)
> >             sCriteria = "sText='" & s & "'"
> >             TextSearchStart = Timer
> >             .FindFirst sCriteria
> >             strHits = strHits + Abs(CLng(Not .NoMatch))
> >             TextSearchEnd = Timer
> >             TextSearchTime = TextSearchTime + (TextSearchEnd -
> > TextSearchStart)
> >         End With
> >     Next x
> >     rs.Close
> >     Debug.Print "Average find times"
> >     Debug.Print "Numbers:" & Format(NumericSearchTime / 1000,
> > "0.0000000000") & " Hits:" & intHits
> >     Debug.Print "Strings:" & Format(TextSearchTime / 1000,
> "0.0000000000")
> &
> > " Hits:" & strHits
> >     Set db = Nothing
> >     Set rs = Nothing
> > End Sub
> >
> >
> > And here are the results of a few runs, last to first...
> >
> > Average find times
> > Numbers:0.0020859370 Hits:0
> > Strings:0.0025468750 Hits:0
> > Average find times
> > Numbers:0.0023203120 Hits:0
> > Strings:0.0030234380 Hits:0
> > Average find times
> > Numbers:0.0023984380 Hits:0
> > Strings:0.0040195310 Hits:0
> > Average find times
> > Numbers:0.0029257810 Hits:0
> > Strings:0.0046445310 Hits:0
> >
> > As you can see the random search values never generated any
> hits - not too
> > surprising considering the number of possible values in 32bit long.
> >
> > The numeric search times were fairly consistent, and the string search
> times
> > steadily dropped. I assume that there is some sort of caching going on
> here.
> > But the main point is that at worst it took just about twice as long to
> > search for the string as it did to find the long. Not *such* as huge
> > performance hit considering we are talking about thousandths of
> a second.
> > There is also a fair bit or arithmetic going on between each search.
> >
> > I goes without say (so why am I saying it?) that whne yo take out the
> > indexes on the fields the search times go though the roof (for the Long
> > field as well as the string one).
> >
> > Lambert
> >
> > > -----Original Message-----
> > > From: Drew Wutka [SMTP:DWUTKA at marlow.com]
> > > Sent: Wednesday, October 15, 2003 12:26 PM
> > > To: 'Access Developers discussion and problem solving'
> > > Subject: RE: [AccessD] Number vs text data type
> > >
> > > Ah, but you can always add fields to handle new anomalies!
> With the zip
> > > code, just add an 'extension' field.  Original Number field is left
> intact
> > > (and all 'search' routines) would still work.
> > >
> > > Drew
> > >
> > > -----Original Message-----
> > > From: Heenan, Lambert [mailto:Lambert.Heenan at aig.com]
> > > Sent: Wednesday, October 15, 2003 8:38 AM
> > > To: 'Access Developers discussion and problem solving'
> > > Subject: RE: [AccessD] Number vs text data type
> > >
> > >
> > > This thread has been going on quite some time now, so why should I let
> it
> > > stop :-)
> > >
> > > One key issue that which has not yet been mentioned but should be
> > > considered
> > > is is there any possibility that the design / structure of the
> "numerical"
> > > identifier might change?
> > >
> > > To take the example of US Zip codes. Once upon a time there
> were no Zip
> > > codes, then along came 5 digit zips, and a little later the
> hyphen plus
> > > four
> > > digit extensions. This kind of change could happen at any
> point in time
> > > because at the end of the day a postal code is an *arbitrary*
> identifier.
> > > In
> > > other words there is no arithmetic meaning in a zip code, so it should
> not
> > > be stored as a number. You CAN store a 5 digit zip code as a Long Int,
> but
> > > you'll regret it if "the powers that be" decide to change things later
> on.
> > >
> > > As for searching for strings in a sorted list, while it is slower to
> > > compare
> > > a whole string with another whole string, vs. comparing
> numeric values,
> to
> > > search a sorted list really does not take that long as a
> standard binary
> > > chop approach is going to zero in very quickly on the target
> string. (If
> > > my
> > > mental arithmetic is up to it, searching for a string in a list of one
> > > billion (10^9) strings would require 30 comparisons at most.) In
> addition,
> > > the initial stages of the binary chop only need to compare the first
> > > character (or first few) in the string with the first character (or
> first
> > > few) of the target, not much slower than integer data comparisons. And
> > > it's
> > > just as simple to figure out if a target is within some range.
> > >
> > >
> > >
> > > Lambert
> > >
> > > > -----Original Message-----
> > > > From: Drew Wutka [SMTP:DWUTKA at marlow.com]
> > > > Sent: Tuesday, October 14, 2003 5:18 PM
> > > > To: 'Access Developers discussion and problem solving'
> > > > Subject: RE: [AccessD] Number vs text data type
> > > >
> > > > Yes, but if something can be stored JUST as a number, it is
> not only a
> > > > storage space issue, but a speed issue.  If you took just
> the 5 digit
> > > Zip
> > > > code, (ignoring any extensions), and you wanted to search for
> addresses
> > > in
> > > > the range of 70000 to 80000, if the numbers are stored as text, the
> > > > processor is going to have to compare each digit of the number,
> because
> > > > it's
> > > > looking at ASCII (or unicode) and NOT a number.  However, if it's
> stored
> > > > as
> > > > a number, then it's a VERY simple check to determine if it falls in
> that
> > > > range!
> > > >
> > > > Drew
> > > >
> > > > -----Original Message-----
> > > > From: Charlotte Foust [mailto:cfoust at infostatsystems.com]
> > > > Sent: Monday, October 13, 2003 10:48 AM
> > > > To: Access Developers discussion and problem solving
> > > > Subject: RE: [AccessD] Number vs text data type
> > > >
> > > >
> > > > Same holds true for social security numbers in the US and for
> telephone
> > > > numbers everywhere.  There are good reasons NOT to use numbers for
> some
> > > > kinds of numeric data, which is probably where the argument
> comes from
> > > > in the first place, that and the old approach of squeezing
> everything
> > > > into the smallest possible datatype to shave storage bytes.
> > > >
> > > > Charlotte Foust
> > > >
> > > > -----Original Message-----
> > > > From: Susan Harkins [mailto:ssharkins at bellsouth.net]
> > > > Sent: Monday, October 13, 2003 7:32 AM
> > > > To: Access Developers discussion and problem solving
> > > > Subject: Re: [AccessD] Number vs text data type
> > > >
> > > >
> > > > What about Zip Codes, etc... what possible purpose would
> you have for
> > > > treating such an entry as a number?
> > > >
> > > > I'm afraid this isn't old school -- it's still very alive
> and with us.
> > > >
> > > > Susan H.
> > > >
> > > >
> > > > > That sounds like one of my co-workers who comes from the
> > > > > old school early-days C and COBOL programming. He makes everything
> > > > > text unless an actual calculation must be performed, even
> when that
> > > > > doesn't necessarily make sense in context. I believe that if its a
> > > > > number it should be treated as a number unless there is a very
> > > > > compelling reason not to, for all of the reasons given so far and
> > > > > probably others I haven't thought of. To me, logic tells
> > > > > you that if the data is numeric you should use a number
> > > > > data type, using text just muddies things up. As far as a
> > > > > number field needing "additional resources", especially
> > > > > for calculations, that makes no sense at all to me. Some
> > > > > number types may take additional storage space as compared
> > > > > to holding the same digit characters in a text type, and
> > > > > that can become an issue for DBAs when they are working
> > > > > under short drive space conditions. Other than, I can't
> > > > > think of a resource reason.
> > > > >
> > > > > Ron
> > > > >
> > > > > On Mon, 13 Oct 2003 10:42:15 -0400
> > > > >   CYNTHIA SPELL <CSPELL at jhuccp.org> wrote:
> > > > > >I will, that's a good idea.  What I've heard so far is
> > > > > >that logic tells you that if you don't need to calculate, you
> should
> > > > > >use text.  And that a number data type requires additional
> resources
> > > > > >in order for the field to have the ability to do the calculating.
> > > > > _______________________________________________
> > > > > AccessD mailing list
> > > > > AccessD at databaseadvisors.com
> > > > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > > > Website: http://www.databaseadvisors.com
> > > > >
> > > >
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com
> > > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > > Website: http://www.databaseadvisors.com
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com
> > > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > > Website: http://www.databaseadvisors.com
> > > > _______________________________________________
> > > > AccessD mailing list
> > > > AccessD at databaseadvisors.com
> > > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > > Website: http://www.databaseadvisors.com
> > > _______________________________________________
> > > AccessD mailing list
> > > AccessD at databaseadvisors.com
> > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > Website: http://www.databaseadvisors.com
> > > _______________________________________________
> > > AccessD mailing list
> > > AccessD at databaseadvisors.com
> > > http://databaseadvisors.com/mailman/listinfo/accessd
> > > Website: http://www.databaseadvisors.com
> > _______________________________________________
> > AccessD mailing list
> > AccessD at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/accessd
> > Website: http://www.databaseadvisors.com
> > _______________________________________________
> > AccessD mailing list
> > AccessD at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/accessd
> > Website: http://www.databaseadvisors.com
> >
>
>
> _______________________________________________
> AccessD mailing list
> AccessD at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/accessd
> Website: http://www.databaseadvisors.com
>
>




More information about the AccessD mailing list