Stuart McLachlan
stuart at lexacorp.com.pg
Tue Oct 21 20:34:24 CDT 2003
On 21 Oct 2003 at 20:32, John Colby wrote:
> Hmmm soundex would be good though. Do you have this function? Is it a
> class, a single function? How does it work?
>
I posted this to the SQLServer forum on 14 Aug 02:
The Soundex system is the means established by the National Archives to
index the U.S. censuses (beginning with 1880).
Soundex codes begin with the first letter of the surname followed by a
three-digit code that represents the (first three) remaining consonants
(after duplicate consonants have been combined). Where there are less
than three remaining consonants, the soundex is padded with 0s
Here's a VB/VBA function which shows the details:
Function SoundEx(ByVal incoming As String) As String
Dim lLoopcount As Long
Dim sResult As String
Dim sInChar As String
Dim sOutChar As String
'- Clean up the string
'
' 1. Remove all non-alphabetic characters (e.g., hyphens, apostrophes).
' 2. Convert all lower-case characters to upper-case characters.
' 3. Move the first letter in the source to the SOUNDEX output buffer.
' 4. Remove the vowels (A,E,I,O,U and Y) and the consonents H and W.
' 5. Make the following substitutions:
' Labials (B,F,P,V) ==> 1
' Gutterals, sibilants (C,G,J,K,Q,S,X,Z) ==> 2
' Dentals (D,T) ==> 3
' Long liquid (L) ==> 4
' Nasals (M,N) ==> 5
' Short liquid (R) ==> 6
' 6. Combine any adjacent identical digits (i.e., eliminate contiguous
' matching digits: for example, 44 becomes just 4).
incoming = UCase$(incoming)
sResult = ""
For lLoopcount = 1 To Len(incoming)
'- Stop when we get 4 characters
If Len(sResult) >= 4 Then Exit For
sInChar = Mid$(incoming, lLoopcount, 1)
Select Case sInChar
Case "B", "F", "P", "V"
sOutChar = "1"
Case "C", "G", "J", "K", "Q", "S", "X", "Z"
sOutChar = "2"
Case "D", "T"
sOutChar = "3"
Case "L"
sOutChar = "4"
Case "M", "N"
sOutChar = "5"
Case "R"
sOutChar = "6"
Case Else
sOutChar = ""
End Select
'- If it's the first digit, then just add it.
If sResult = "" Then
sResult = sResult + sInChar
'- Make sure not to add a repeating digit.
ElseIf Right$(sResult, 1) <> sOutChar Then
sResult = sResult + sOutChar
End If
Next
' 7. Concatenate the first three resulting digits to the SOUNDEX
' output buffer (if there are fewer than 3 resulting digits, pad with 0).
SoundEx = Left$(sResult + "0000", 4)
End Function
--
Stuart McLachlan
Lexacorp Ltd
Application Development, IT Consultancy
http://www.lexacorp.com.pg