[AccessD] Fuzzy Matching (Like Soundex) or other ideas?

Stuart McLachlan stuart at lexacorp.com.pg
Sat Apr 22 00:40:55 CDT 2023


On 22 Apr 2023 at 15:25, Stuart McLachlan wrote:

> Oops, it's a long time since I did it.
> 
> If the strings are different lengths, it's the levenshtein distance :)
> 
> 
Or you could go a step further with Damerau-Levenshtein distance
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

(The Damerau-Levenshtein distance differs from the classical Levenshtein distance
by allowing transpositions of two adjacent characters, in addition to the three
classical single-character edit operations: insertions, deletions and substitutions.)
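
To make the transposition idea concrete, here is a rough sketch in Python (used only
because it is easy to run and check; it is not from the old PB code). It implements the
simpler "optimal string alignment" variant, sometimes called restricted
Damerau-Levenshtein, not the unrestricted algorithm described on the Wikipedia page.
The function name osa_distance is my own.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment (restricted Damerau-Levenshtein) distance.

    Counts insertions, deletions, substitutions and swaps of two adjacent
    characters, where no substring is edited more than once.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]

    # Distance from/to the empty string is just the prefix length.
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Extra step versus plain Levenshtein: adjacent transposition.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)

    return d[rows - 1][cols - 1]
```

With this variant a simple typo such as "ab" versus "ba" counts as one edit, where
plain Levenshtein counts two.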


But for plain Levenshtein, here's some very old PB (PowerBASIC) code; it should be trivial to adapt to VBA.

FUNCTION fn_LevenshteinDistance( BYVAL strText1 AS STRING, BYVAL strText2 AS STRING, OPT lngCaseMatters AS LONG ) AS LONG
  LOCAL lngText1Idx   AS LONG
  LOCAL lngText1Len   AS LONG
  LOCAL lngText2Idx   AS LONG
  LOCAL lngText2Len   AS LONG
  LOCAL pbytText1Char AS BYTE PTR
  LOCAL pbytText2Char AS BYTE PTR
  lngText1Len = LEN( strText1 )
  lngText2Len = LEN( strText2 )
  IF ( lngText1Len = 0 ) OR ( lngText2Len = 0 ) THEN
    FUNCTION = MAX%( lngText1Len, lngText2Len )
    EXIT FUNCTION
  END IF
  IF ISFALSE( ISMISSING( lngCaseMatters )) AND ISFALSE( lngCaseMatters ) THEN
    strText1 = UCASE$( strText1 )
    strText2 = UCASE$( strText2 )
  END IF
  DIM lngMatrix( lngText1Len, lngText2Len ) AS LONG
  FOR lngText1Idx = 0 TO lngText1Len
    lngMatrix( lngText1Idx, 0 ) = lngText1Idx
  NEXT lngText1Idx
  FOR lngText2Idx = 0 TO lngText2Len
    lngMatrix( 0, lngText2Idx ) = lngText2Idx
  NEXT lngText2Idx
  pbytText1Char = STRPTR( strText1 )
  pbytText2Char = STRPTR( strText2 )
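  ' Each cell is the cheapest of: deletion (cell above + 1), insertion (cell to the
  ' left + 1), or substitution (diagonal cell + 1 if the two bytes differ, else + 0).
  ' A true comparison evaluates to -1, so ABS() turns "different" into a cost of 1.
  FOR lngText1Idx = 1 TO lngText1Len
    FOR lngText2Idx = 1 TO lngText2Len
      lngMatrix( lngText1Idx, lngText2Idx ) = _
        MIN%( lngMatrix( lngText1Idx - 1, lngText2Idx     ) + 1, _
          lngMatrix( lngText1Idx,     lngText2Idx - 1 ) + 1, _
          lngMatrix( lngText1Idx - 1, lngText2Idx - 1 ) + _
          ABS( @pbytText1Char[ lngText1Idx - 1 ] <> @pbytText2Char[ lngText2Idx - 1 ] ))
    NEXT lngText2Idx
  NEXT lngText1Idx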
  FUNCTION = lngMatrix( lngText1Len, lngText2Len )
END FUNCTION
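
If you do port it to VBA, it helps to have known answers to check against. Below is a
rough Python equivalent of the function above (Python only because it is quick to run;
the name levenshtein_distance and the case_matters parameter are my own stand-ins for
the PB names). One difference to be aware of: the PB code compares bytes, while this
sketch compares characters, so results can differ for non-ASCII text.

```python
def levenshtein_distance(text1: str, text2: str, case_matters: bool = True) -> int:
    """Plain Levenshtein distance, mirroring the structure of the PB function.

    case_matters defaults to True, matching the PB behaviour when the
    optional argument is omitted (no upper-casing is done).
    """
    len1, len2 = len(text1), len(text2)
    # If either string is empty, the distance is the other string's length.
    if len1 == 0 or len2 == 0:
        return max(len1, len2)

    if not case_matters:
        text1, text2 = text1.upper(), text2.upper()

    # (len1 + 1) x (len2 + 1) matrix with the first row and column
    # initialised to 0, 1, 2, ... as in the PB code.
    matrix = [[0] * (len2 + 1) for _ in range(len1 + 1)]
    for i in range(len1 + 1):
        matrix[i][0] = i
    for j in range(len2 + 1):
        matrix[0][j] = j

    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            differs = 1 if text1[i - 1] != text2[j - 1] else 0
            matrix[i][j] = min(
                matrix[i - 1][j] + 1,            # deletion
                matrix[i][j - 1] + 1,            # insertion
                matrix[i - 1][j - 1] + differs,  # substitution or match
            )

    return matrix[len1][len2]
```

Handy check values for a port: "kitten" / "sitting" should give 3, "flaw" / "lawn"
should give 2, and "Smith" / "SMITH" should give 4 when case matters and 0 when it
does not.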




