[AccessD] Fuzzy Matching (Like Soundex) or other ideas?
Stuart McLachlan
stuart at lexacorp.com.pg
Sat Apr 22 00:40:55 CDT 2023
On 22 Apr 2023 at 15:25, Stuart McLachlan wrote:
> Oops, it's a long time since I did it.
>
> If the strings are different lengths, it's the levenshtein distance :)
>
>
Or you could go a step further with Damerau-Levenshtein distance
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
(The Damerau-Levenshtein distance differs from the classical Levenshtein distance
by including transpositions of two adjacent characters among its allowable operations,
in addition to the three classical single-character edit operations: insertions,
deletions and substitutions.)
But for plain Levenshtein, here's some very old PB (PowerBASIC) code; it should be trivial to adapt to VBA:
FUNCTION fn_LevenshteinDistance( BYVAL strText1 AS STRING, _
                                 BYVAL strText2 AS STRING, _
                                 OPT lngCaseMatters AS LONG ) AS LONG
  LOCAL lngText1Idx AS LONG
  LOCAL lngText1Len AS LONG
  LOCAL lngText2Idx AS LONG
  LOCAL lngText2Len AS LONG
  LOCAL pbytText1Char AS BYTE PTR
  LOCAL pbytText2Char AS BYTE PTR

  lngText1Len = LEN( strText1 )
  lngText2Len = LEN( strText2 )

  ' If either string is empty, the distance is the length of the other one
  IF ( lngText1Len = 0 ) OR ( lngText2Len = 0 ) THEN
    FUNCTION = MAX%( lngText1Len, lngText2Len )
    EXIT FUNCTION
  END IF

  ' Compare case-insensitively only when lngCaseMatters is passed and is false
  IF ISFALSE( ISMISSING( lngCaseMatters )) AND ISFALSE( lngCaseMatters ) THEN
    strText1 = UCASE$( strText1 )
    strText2 = UCASE$( strText2 )
  END IF

  DIM lngMatrix( lngText1Len, lngText2Len ) AS LONG

  ' First column and first row: distance from/to an empty prefix
  FOR lngText1Idx = 0 TO lngText1Len
    lngMatrix( lngText1Idx, 0 ) = lngText1Idx
  NEXT lngText1Idx
  FOR lngText2Idx = 0 TO lngText2Len
    lngMatrix( 0, lngText2Idx ) = lngText2Idx
  NEXT lngText2Idx

  pbytText1Char = STRPTR( strText1 )
  pbytText2Char = STRPTR( strText2 )

  ' Each cell is the cheapest of deletion, insertion or substitution.
  ' ABS( a <> b ) is 1 when the characters differ, 0 when they match.
  FOR lngText1Idx = 1 TO lngText1Len
    FOR lngText2Idx = 1 TO lngText2Len
      lngMatrix( lngText1Idx, lngText2Idx ) = _
        MIN%( lngMatrix( lngText1Idx - 1, lngText2Idx ) + 1, _
              lngMatrix( lngText1Idx, lngText2Idx - 1 ) + 1, _
              lngMatrix( lngText1Idx - 1, lngText2Idx - 1 ) + _
              ABS( @pbytText1Char[ lngText1Idx - 1 ] <> @pbytText2Char[ lngText2Idx - 1 ] ))
    NEXT lngText2Idx
  NEXT lngText1Idx

  FUNCTION = lngMatrix( lngText1Len, lngText2Len )
END FUNCTION
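
For the transposition variant mentioned above, here is a rough sketch in Python rather than PB. It is only an illustration, not tested production code. Note that it implements the simpler "optimal string alignment" form (sometimes called restricted Damerau-Levenshtein), which assumes no substring is edited more than once; the unrestricted algorithm needs extra bookkeeping. The name osa_distance and the case_matters parameter are made up for this example; case_matters defaults to True to mirror the PB function when its optional argument is omitted.

```python
def osa_distance(text1: str, text2: str, case_matters: bool = True) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    if not case_matters:
        text1, text2 = text1.upper(), text2.upper()

    len1, len2 = len(text1), len(text2)

    # matrix[i][j] = distance between the first i chars of text1
    # and the first j chars of text2
    matrix = [[0] * (len2 + 1) for _ in range(len1 + 1)]
    for i in range(len1 + 1):
        matrix[i][0] = i
    for j in range(len2 + 1):
        matrix[0][j] = j

    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            cost = 0 if text1[i - 1] == text2[j - 1] else 1
            best = min(
                matrix[i - 1][j] + 1,         # deletion
                matrix[i][j - 1] + 1,         # insertion
                matrix[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # The extra step over plain Levenshtein: two adjacent
            # characters swapped counts as a single edit.
            if (i > 1 and j > 1
                    and text1[i - 1] == text2[j - 2]
                    and text1[i - 2] == text2[j - 1]):
                best = min(best, matrix[i - 2][j - 2] + 1)
            matrix[i][j] = best

    return matrix[len1][len2]
```

With this, a typo like "Simth" for "Smith" scores 1, where plain Levenshtein would score it as 2 (two substitutions).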