[AccessD] "fuzzy logic" search

Stuart McLachlan stuart at lexacorp.com.pg
Tue Oct 21 20:34:24 CDT 2003


On 21 Oct 2003 at 20:32, John Colby wrote:

> Hmmm soundex would be good though.  Do you have this function?  Is it a
> class, a single function?  How does it work?
> 

I posted this to the SQLServer forum on 14 Aug 02:

The Soundex system is the means established by the National Archives to
index the U.S. censuses (beginning with 1880). 

Soundex codes begin with the first letter of the surname followed by a
three-digit code that represents the (first three) remaining consonants
(after duplicate consonants have been combined). Where there are less  
than three remaining consonants, the soundex is padded with 0s

Here's a VB/VBA function which shows the details:

Function SoundEx(ByVal incoming As String) As String

   Dim lLoopcount As Long
   Dim sResult As String
   Dim sInChar As String
   Dim sOutChar As String

'- Clean up the string
'
'  1.  Remove all non-alphabetic characters (e.g., hyphens, apostrophes). 
'  2.  Convert all lower-case characters to upper-case characters. 
'  3.  Move the first letter in the source to the SOUNDEX output buffer. 
'  4.  Remove the vowels (A,E,I,O,U and Y) and the consonents H and W. 
'  5.  Make the following substitutions: 
'       Labials (B,F,P,V)                      ==> 1 
'       Gutterals,  sibilants (C,G,J,K,Q,S,X,Z) ==> 2 
'       Dentals (D,T)   ==> 3 
'       Long liquid (L)                        ==> 4 
'       Nasals (M,N)                           ==> 5 
'       Short liquid (R)                       ==> 6 
'  6.  Combine any adjacent identical digits (i.e., eliminate contiguous 
'      matching digits:  for example, 44 becomes just 4). 

 incoming = UCase$(incoming) 
sResult  = "" 
For lLoopcount = 1 To Len(incoming)

      '- Stop when we get 4 characters
      If Len(sResult) >= 4 Then Exit For

      sInChar = Mid$(incoming, lLoopcount, 1)
      Select Case sInChar
           Case "B", "F", "P", "V"
                  sOutChar = "1"
           Case "C", "G", "J", "K", "Q", "S", "X", "Z"
                  sOutChar = "2"
           Case "D", "T"
                  sOutChar = "3"
           Case "L"
                  sOutChar = "4"
           Case "M", "N"
                  sOutChar = "5"
           Case "R"
                  sOutChar = "6"
           Case Else
                  sOutChar = ""
      End Select

      '- If it's the first digit, then just add it.
      If sResult = "" Then
             sResult = sResult + sInChar
             '- Make sure not to add a repeating digit.
      ElseIf Right$(sResult, 1) <> sOutChar Then
                  sResult = sResult + sOutChar
      End If

   Next

'  7.  Concatenate the first three resulting digits to the SOUNDEX 
' output buffer (if there are fewer than 3 resulting digits, pad with 0). 
 SoundEx = Left$(sResult + "0000", 4)

End Function



-- 
Stuart McLachlan
Lexacorp Ltd
Application Development,  IT Consultancy
http://www.lexacorp.com.pg



More information about the AccessD mailing list