Jaro-Winkler similarity
A string similarity score between 0 and 1 that rewards strings sharing an identical prefix, which makes it well suited to person and company names.
Jaro-Winkler builds on Jaro similarity, which counts the characters two strings have in common within a sliding window (roughly half the length of the longer string) and penalizes matched characters that appear in a different order (transpositions). The Winkler modification then boosts the score when the strings share an identical prefix of up to 4 characters, based on the empirical observation that typos in names tend to occur in the middle or at the end rather than at the start.
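The description above can be sketched in plain Python. This is a minimal illustration, not the implementation of any particular library: the function names `jaro` and `jaro_winkler`, the parameter names, and the prefix scaling factor of 0.1 (the value commonly used with Winkler's modification) are assumptions made for the example.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: matching characters within a window, minus a transposition penalty."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Matched characters that line up in a different order are transpositions;
    # each out-of-order pair counts as one transposition.
    seq1 = [c for c, hit in zip(s1, matched1) if hit]
    seq2 = [c for c, hit in zip(s2, matched2) if hit]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the shared prefix (capped at max_prefix)."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    # The boost closes part of the remaining gap to 1.0, so the score never exceeds 1.
    return base + prefix * prefix_scale * (1 - base)


print(round(jaro("MARTHA", "MARHTA"), 4))          # 0.9444
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

In the "MARTHA" / "MARHTA" pair all six characters match, one transposition (T and H) lowers the Jaro score to about 0.944, and the shared three-character prefix "MAR" lifts the final score to about 0.961.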
The score ranges from 0 (no similarity) to 1 (identical). A score of 0.85 or higher is typically a confident match for names; 0.7 to 0.85 is the ambiguous middle. Exact thresholds depend on the dataset and language.
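One way to apply these bands is a small classifier over an already computed score. This is only a sketch: the function name `classify_name_score`, the labels, and the default cut-offs (taken from the rough figures above) are illustrative assumptions and should be tuned per dataset.

```python
def classify_name_score(score: float, match_at: float = 0.85, review_at: float = 0.7) -> str:
    """Bucket a similarity score into match / review / non-match (hypothetical labels)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= match_at:
        return "match"      # confident match for names
    if score >= review_at:
        return "review"     # ambiguous middle: needs a second signal or a human look
    return "non-match"


print(classify_name_score(0.96))  # match
print(classify_name_score(0.78))  # review
print(classify_name_score(0.40))  # non-match
```

Keeping the cut-offs as parameters makes it easy to adjust them once labeled pairs from the actual dataset are available.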
Jaro-Winkler is a common default scorer for name fields in matching libraries because its prefix bias matches how real names get misspelled. It is a poor choice for address or free-text fields, where token order and length vary widely; use token-set or fuzzy-substring scorers there.