Data standardization

Cleaning and normalizing field values to a canonical form before matching: E.164 for phone numbers, USPS-standard for addresses, title case for names.

Standardization is the pre-match step that gives the matching engine a fighting chance. Common transforms:

  • Phone numbers → E.164 format ("+14155551234")
  • Addresses → USPS-standard (or country equivalent), with street suffixes and unit designators in their standard abbreviated forms
  • Names → title-case with consistent punctuation
  • Dates → ISO 8601
  • Emails → lowercase, common-typo-domain corrections (gnail → gmail)
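
As a rough illustration of the email, date, and name transforms listed above, here is a minimal Python sketch using only the standard library. The function names, the typo-domain table, and the list of accepted date formats are all assumptions made for this example; they are not taken from any particular tool.

```python
from datetime import datetime

# Hypothetical typo table; a real one would be larger and tuned to your data.
TYPO_DOMAINS = {
    "gnail.com": "gmail.com",
    "gmial.com": "gmail.com",
    "hotmial.com": "hotmail.com",
}


def standardize_email(raw: str) -> str:
    """Trim, lowercase, and correct known typo domains."""
    email = raw.strip().lower()
    local, sep, domain = email.rpartition("@")
    if not sep:
        # No "@" present: leave the value for downstream validation.
        return email
    return f"{local}@{TYPO_DOMAINS.get(domain, domain)}"


def standardize_date(raw: str, formats=("%Y-%m-%d", "%m/%d/%Y")) -> str:
    """Return an ISO 8601 date (YYYY-MM-DD).

    The accepted input formats are an assumption; note that "%m/%d/%Y"
    reads slash-separated dates month-first (US convention).
    """
    text = raw.strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def standardize_name(raw: str) -> str:
    """Collapse whitespace and apply title case.

    str.title() is a blunt tool: it does not handle names such as
    "McDonald" or "de la Cruz" the way a person would write them.
    """
    return " ".join(raw.split()).title()
```

For example, standardize_email(" Jane.Doe@GNAIL.com ") yields "jane.doe@gmail.com", and standardize_date("03/07/2024") yields "2024-03-07" under the month-first assumption.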

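Without standardization, formatting differences alone can make two copies of the same value look different to the matcher ("+1-415-555-1234" vs "(415) 555-1234"). A quick standardization pass converts both into the same canonical value, "+14155551234" (assuming a US default country code for the second), and the match becomes deterministic: an exact string comparison instead of a fuzzy one.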
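
The phone example can be sketched in a few lines of Python. This is a deliberately naive version that assumes North American numbers and a default country code of "1"; the function name to_e164 is hypothetical, and a production pipeline would more likely rely on a dedicated library such as Google's libphonenumber.

```python
import re


def to_e164(raw: str, default_country_code: str = "1") -> str:
    """Naive E.164 normalization (assumes 10-digit national numbers)."""
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)  # drop everything except digits

    if has_plus and digits:
        # Caller already supplied a country code.
        return "+" + digits
    if len(digits) == 10:
        # National number: prepend the assumed default country code.
        return f"+{default_country_code}{digits}"
    if len(digits) == 11 and digits.startswith(default_country_code):
        # Country code present but without the leading "+".
        return "+" + digits
    raise ValueError(f"cannot normalize phone number: {raw!r}")


a = to_e164("+1-415-555-1234")
b = to_e164("(415) 555-1234")
print(a == b)  # True: both are "+14155551234"
```

Once both values share one canonical form, the comparison is plain string equality and needs no fuzzy matching.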

Golden Suite's GoldenFlow tool handles this with named transforms — phone, address, email — registered per-field. Custom transforms register as functions following the same interface.
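
To make the idea of named, per-field transforms concrete, here is a generic Python sketch of the pattern. It is not GoldenFlow's actual API: the register decorator, the TRANSFORMS registry, and the FIELD_TRANSFORMS mapping are all hypothetical names chosen for illustration. The only assumption carried over from the text is that every transform, built-in or custom, shares one interface (here, a function from string to string).

```python
from typing import Callable, Dict, List

# Shared interface for every transform in this sketch: str -> str.
Transform = Callable[[str], str]

# Hypothetical registry mapping transform names to functions.
TRANSFORMS: Dict[str, Transform] = {}


def register(name: str) -> Callable[[Transform], Transform]:
    """Decorator that registers a transform under a name."""
    def decorator(fn: Transform) -> Transform:
        TRANSFORMS[name] = fn
        return fn
    return decorator


@register("email")
def email(value: str) -> str:
    return value.strip().lower()


# A custom transform registers exactly the same way.
@register("collapse_whitespace")
def collapse_whitespace(value: str) -> str:
    return " ".join(value.split())


# Hypothetical per-field configuration: field name -> ordered transform names.
FIELD_TRANSFORMS: Dict[str, List[str]] = {
    "email": ["email"],
    "full_name": ["collapse_whitespace"],
}


def standardize_record(record: Dict[str, str]) -> Dict[str, str]:
    """Apply each field's configured transforms; other fields pass through."""
    out = dict(record)
    for field, names in FIELD_TRANSFORMS.items():
        if out.get(field) is None:
            continue
        for name in names:
            out[field] = TRANSFORMS[name](out[field])
    return out
```

Calling standardize_record({"email": " A@B.com ", "full_name": "a   b", "id": "7"}) returns the record with "a@b.com" and "a b", leaving "id" untouched. Keeping transforms behind one interface is what lets custom functions sit alongside the built-in ones without special handling.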