Probabilistic matching

Matching that assigns a numeric similarity score to each candidate pair, instead of a binary yes/no.

Probabilistic matching is the modern default. Each candidate pair gets a score between 0 and 1, computed by combining per-field scorers such as fuzzy name match, exact email match, and address similarity. A configurable threshold separates "definitely a match" from "needs review."
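As a rough illustration of how per-field scorers might combine into one score, here is a minimal sketch. The weights, the field names, the `pair_score` helper, and the choice of `difflib` as the fuzzy scorer are all assumptions made for the example; real systems usually tune weights on labeled pairs and may use different similarity measures.

```python
from difflib import SequenceMatcher

# Hypothetical weights for illustration; they sum to 1.
WEIGHTS = {"name": 0.5, "email": 0.3, "address": 0.2}


def fuzzy(a: str, b: str) -> float:
    """Similarity in [0, 1] using difflib's ratio (one of many possible scorers)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def exact(a: str, b: str) -> float:
    """1.0 when the normalized values are equal, otherwise 0.0."""
    return 1.0 if a.lower().strip() == b.lower().strip() else 0.0


SCORERS = {"name": fuzzy, "email": exact, "address": fuzzy}


def pair_score(left: dict, right: dict) -> float:
    """Weighted average of per-field scores over fields present on both sides."""
    weighted_sum = 0.0
    weight_used = 0.0
    for field, weight in WEIGHTS.items():
        a, b = left.get(field), right.get(field)
        if not a or not b:
            continue  # a missing field contributes no evidence either way
        weighted_sum += weight * SCORERS[field](a, b)
        weight_used += weight
    return weighted_sum / weight_used if weight_used else 0.0


a = {"name": "Jon Smith", "email": "jon@example.com", "address": "12 Main St"}
b = {"name": "John Smith", "email": "JON@example.com", "address": "12 Main Street"}
print(round(pair_score(a, b), 3))
```

Skipping missing fields and renormalizing the remaining weights is one design choice among several; another is to treat a missing field as weak evidence against a match.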

This contrasts with deterministic matching, which counts only exact-key equality (same SSN, same email). Probabilistic matching handles real-world messiness (typos, formatting differences, missing fields) in cases where deterministic matching finds no match at all.
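The contrast can be seen on a single pair of records that differ only by a typo and a formatting change. The records and the helper names `deterministic_match` and `name_similarity` below are made up for illustration, and `difflib` stands in for whatever fuzzy scorer a real system would use.

```python
from difflib import SequenceMatcher


def deterministic_match(a: dict, b: dict, key: str = "email") -> bool:
    """True only when both records carry the same non-empty key value."""
    return bool(a.get(key)) and a.get(key) == b.get(key)


def name_similarity(a: dict, b: dict) -> float:
    """Fuzzy similarity of the two name fields, in [0, 1]."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


rec_a = {"name": "Katherine O'Neil", "email": "k.oneil@example.com"}
rec_b = {"name": "Katharine ONeil", "email": "koneil@example.com"}

print(deterministic_match(rec_a, rec_b))  # the emails differ, so no match
print(round(name_similarity(rec_a, rec_b), 2))  # high, despite the typo
```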

The downside: probabilistic matching requires threshold tuning and produces an "ambiguous" middle band of pairs that need human review. That's a feature, not a bug: it surfaces the cases a deterministic system would silently miss.
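One common way to get a middle band is to use two cutoffs instead of one. The sketch below assumes that setup; the `classify` function, the band labels, and the default values 0.6 and 0.9 are placeholders, not recommended settings.

```python
def classify(score: float, lower: float = 0.6, upper: float = 0.9) -> str:
    """Map a similarity score in [0, 1] to one of three bands.

    `lower` and `upper` are hypothetical defaults; in practice they are
    tuned against labeled pairs.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if score >= upper:
        return "match"
    if score >= lower:
        return "review"  # the ambiguous middle band, routed to a human
    return "non-match"


for s in (0.95, 0.75, 0.30):
    print(s, classify(s))
```

Raising `upper` sends more pairs to review and fewer to automatic acceptance; lowering `lower` widens the review band at the other end.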