[AccessD] Regular Expressions: was Address Recognition

Michael Bahr jedi at charm.net
Mon Jun 18 19:44:42 CDT 2007


I was working on an electronic parts inventory database last year.  Most of
the parts have the value mixed in with the part number; you just need to
know where to find this information and how to interpolate the values.  I
created a function to run and look at each row pulling the part number and
description from respective columns and passing these two items to
functions.  The functions either return a null or value.  If the function
returns a value then stop processing, use the value elsewhere and then move
to next row.  FYI, this works in A2000 and A2003.

'test samples from partnumber column
'pn = M22759/34-12-0
'pn = M22759/16-18-9
'pn = M22759/11-8-0
'desc = Wire from description column
'
Function getWireValue(pn As Variant, desc As Variant) As Variant
   Dim myValue As Variant, thisColor As String
   Dim color As Variant

   color = Array("Black", "Brown", "Red", "Orange", "Yellow", "Green",
"Blue", "Purple", "Gray", "White")
   myValue = Null

   ' try to extract value from part number
   If (InStr(1, desc, "Wire") > 0) Then

      Set regex = CreateObject("VBScript.RegExp")
      regexMatch = True   ' initialize var, optional
      regex.Global = True ' initialize property, optional
      regex.ignorecase = True  ' most cases you want this

      ' start with generic part numbers
      regex.pattern = "(/\d+-(\d+)-(\d))$"  ' define regex
      regexMatch = regex.test(pn)  ' now test pattern against string

      If (regexMatch) Then
         ' match{es) from part number
         Set matches = regex.Execute(pn)
         ' get sub-matches
         myValue = matches(0).submatches(1)
         thisColor = matches(0).submatches(2)
         myValue = Replace(myValue, "-", "")

         myValue = myValue & " ga " & color(thisColor)
      End If

      Set regex = Nothing
   End If

   getWireValue = myValue

End Function


Here is one of many functions I wrote using regular expressions (regex) to
extract information from a string.  This function was the easiest and
smallest from my modules to demonstrate.  Some of the code should be obvious
but the regex part I will discuss in more detail.

The first and biggest thing that must be done (in VB & perhaps .Net) is to
create the regex object.  Once the object has been created a bunch of
properties are available (they can be viewed when stepping through the code
manually).  Some of the properties that should be set right away are the
Global and ignorecase.  The Global property is set to True IF you want
multiple matches or want a count otherwise matching will stop at the first
match.  The ignorecase property does just what is sez, it ignore the case of
the string and/or the pattern.  Setting regexMatch to True is optional, I do
not remember if it is all that important or not.

Now the fun starts.  The pattern has to be defined.  It is really not the
hard.  Just look at your string and decide which part needs to be evaluated.
The string can be a word, sentence, paragraph, etc.  Once a pattern is
established as in regex.pattern = "..." then do the test against the string
as in regexMatch = regex.test(pn).

There are three ways to look for a pattern; at the beginning of string,
anywhere within string, or at the end of string.  The ^ means look at the
beginning and $ means look at the end of string.

Parentheses are important.  They allow for grouping your patterns for
further evaluation.  For VB and .Net the matched groups are saved in the
matches(0) properties.  The group numbering starts from the outside of the
regex and increments inwards.  For example, above I am interested in the 2nd
and 3rd matches or parenthesis groups which are placed in
matches(0).submatches(1) and matches(0).submatches(2).

Here is another pattern I used in another function to pull the fuse value
from the part number;

      regex.pattern = "((\d+[A-Z]+([/\d]+)[A-Z]+)|(-\d\d(\d\d)))$"

This actually has 2 parts that have been OR'd together.  The OR symbol is
the pipe or | character.  So if you look closely I am combining 2 different
patterns into one expression:
(\d+[A-Z]+([/\d]+)[A-Z]+) and
(-\d\d(\d\d))
You can not AND but only OR patterns.

All this looks very daunting so please do not be afraid to explore using
regex's.  You can do a lot in one line (pattern matching) then in other
ways.  The more you play with it and use it the easier it will become.
Remember, your best friend is the book, Mastering Regular Expressions from
O'Reilly.  This book is in its 3rd edition released late last year.


Enjoy, Mike...



> Thanks for the reference Marty.  Reuben, sorry for not supplying a
> complete example I am at home.  I know that if you open your db and open
> the code part you can search for "regular expression" and get some
> results.  I got my information on how to use it here
>
> http://www.regular-expressions.info/vbscript.html
>
> Also I highly recommend the book, Mastering Regular Expressions
>
> http://www.oreilly.com/catalog/regex3/
>
> it is the bible for regular expressions.  You do not want to leave home
> without it!  :-O
>
> I will look at work and find some examples to pass on to you.
>
> Mike...
>





More information about the AccessD mailing list