Stuart McLachlan
stuart at lexacorp.com.pg
Tue Oct 21 20:34:24 CDT 2003
On 21 Oct 2003 at 20:32, John Colby wrote: > Hmmm soundex would be good though. Do you have this function? Is it a > class, a single function? How does it work? > I posted this to the SQLServer forum on 14 Aug 02: The Soundex system is the means established by the National Archives to index the U.S. censuses (beginning with 1880). Soundex codes begin with the first letter of the surname followed by a three-digit code that represents the (first three) remaining consonants (after duplicate consonants have been combined). Where there are less than three remaining consonants, the soundex is padded with 0s Here's a VB/VBA function which shows the details: Function SoundEx(ByVal incoming As String) As String Dim lLoopcount As Long Dim sResult As String Dim sInChar As String Dim sOutChar As String '- Clean up the string ' ' 1. Remove all non-alphabetic characters (e.g., hyphens, apostrophes). ' 2. Convert all lower-case characters to upper-case characters. ' 3. Move the first letter in the source to the SOUNDEX output buffer. ' 4. Remove the vowels (A,E,I,O,U and Y) and the consonents H and W. ' 5. Make the following substitutions: ' Labials (B,F,P,V) ==> 1 ' Gutterals, sibilants (C,G,J,K,Q,S,X,Z) ==> 2 ' Dentals (D,T) ==> 3 ' Long liquid (L) ==> 4 ' Nasals (M,N) ==> 5 ' Short liquid (R) ==> 6 ' 6. Combine any adjacent identical digits (i.e., eliminate contiguous ' matching digits: for example, 44 becomes just 4). incoming = UCase$(incoming) sResult = "" For lLoopcount = 1 To Len(incoming) '- Stop when we get 4 characters If Len(sResult) >= 4 Then Exit For sInChar = Mid$(incoming, lLoopcount, 1) Select Case sInChar Case "B", "F", "P", "V" sOutChar = "1" Case "C", "G", "J", "K", "Q", "S", "X", "Z" sOutChar = "2" Case "D", "T" sOutChar = "3" Case "L" sOutChar = "4" Case "M", "N" sOutChar = "5" Case "R" sOutChar = "6" Case Else sOutChar = "" End Select '- If it's the first digit, then just add it. If sResult = "" Then sResult = sResult + sInChar '- Make sure not to add a repeating digit. ElseIf Right$(sResult, 1) <> sOutChar Then sResult = sResult + sOutChar End If Next ' 7. Concatenate the first three resulting digits to the SOUNDEX ' output buffer (if there are fewer than 3 resulting digits, pad with 0). SoundEx = Left$(sResult + "0000", 4) End Function -- Stuart McLachlan Lexacorp Ltd Application Development, IT Consultancy http://www.lexacorp.com.pg