
Entity resolution (ER)

The process of identifying which records across one or more datasets refer to the same real-world entity.

Entity resolution (ER) — sometimes called record linkage, deduplication, or identity resolution — takes messy data with duplicates, typos, and formatting variations and groups together the records that describe the same underlying person, company, product, or location.

A typical ER pipeline runs in three stages: blocking narrows the candidate-pair space (comparing every pair among 50,000 rows would mean roughly 1.25 billion comparisons), scoring computes per-field similarity for each candidate pair, and clustering turns high-confidence pairs into connected groups.
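The three stages can be illustrated with a minimal Python sketch using only the standard library. Everything here is an assumption made for illustration: the toy records, the blocking key (city plus first letter of the name), the use of `difflib.SequenceMatcher` as the similarity measure, the 0.85 threshold, and union-find as the clustering method. Production systems typically use more robust blocking keys, field-specific comparators, and tuned or learned thresholds.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; the field names are illustrative only.
RECORDS = [
    {"id": 0, "name": "Acme Corp", "city": "Boston"},
    {"id": 1, "name": "ACME Corp.", "city": "Boston"},
    {"id": 2, "name": "Apex Ltd", "city": "Boston"},
    {"id": 3, "name": "Globex Inc", "city": "Denver"},
    {"id": 4, "name": "Globex Inc.", "city": "Denver"},
]


def block(records):
    """Blocking: yield only pairs that share a cheap key (assumed here:
    lowercased city plus the first letter of the name)."""
    buckets = defaultdict(list)
    for r in records:
        key = (r["city"].lower(), r["name"][0].lower())
        buckets[key].append(r)
    for group in buckets.values():
        yield from combinations(group, 2)


def score(a, b, fields=("name", "city")):
    """Scoring: average of per-field string similarities, each in [0, 1]."""
    sims = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(sims) / len(sims)


def cluster(records, pairs, threshold=0.85):
    """Clustering: merge pairs scoring at or above the threshold into
    connected groups using a small union-find structure."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        if score(a, b) >= threshold:
            parent[find(a["id"])] = find(b["id"])

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    return sorted(sorted(g) for g in groups.values())


def resolve(records, threshold=0.85):
    """Run blocking, scoring, and clustering end to end."""
    return cluster(records, list(block(records)), threshold)


if __name__ == "__main__":
    print(resolve(RECORDS))
```

On this toy data, blocking yields 4 candidate pairs instead of the 10 possible pairs, and `resolve(RECORDS)` groups records 0 and 1 together and records 3 and 4 together, leaving record 2 on its own. The blocking key trades recall for speed: two records for the same entity that land in different buckets (for example, because of a typo in the first letter) are never compared.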

ER is the technical foundation of master data management — without it, "one trusted version" of a customer is impossible.