Schema inference

Automatic proposal of a target-schema mapping for a new source — "this column looks like an email, this one looks like a date of birth."

Schema inference uses a source's column names and value patterns to propose a target-schema mapping, so the user does not have to fill in a mapping form by hand. Modern inference engines combine:

  • Column-name fuzzy matching ("EmailAddr" → "email")
  • Value-pattern detection (RFC 5322 regex → email; ISO 8601 → date)
  • Distribution checks (high cardinality + free-text → name; low cardinality + 50 known values → country)
  • LLM-assisted suggestions on ambiguous columns
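The first three signals can be sketched in Python roughly as follows. This is an illustration only, not a description of how InferMap works: the target fields, the name variants, the sample country codes, the 0.6 threshold, and every function name are assumptions made for the example, and the email pattern is deliberately far looser than the RFC 5322 grammar.

```python
import difflib
import re
from datetime import date
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical target schema: field -> name variants (assumed for illustration).
TARGET_FIELDS: Dict[str, List[str]] = {
    "email": ["email", "emailaddr", "emailaddress", "mail"],
    "date_of_birth": ["dob", "dateofbirth", "birthdate"],
    "country": ["country", "countrycode", "nation"],
}

# Simplified email pattern; much looser than RFC 5322.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Tiny sample of known values; a real list would be far longer.
KNOWN_COUNTRIES = {"US", "DE", "FR", "JP", "BR"}


def normalize(name: str) -> str:
    """Lowercase a column name and drop everything but letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def name_score(column: str, field: str) -> float:
    """Best fuzzy-match ratio between the column name and the field's variants."""
    col = normalize(column)
    return max(
        difflib.SequenceMatcher(None, col, variant).ratio()
        for variant in TARGET_FIELDS[field]
    )


def is_iso_date(value: str) -> bool:
    """True if the value parses as an ISO 8601 calendar date such as 1990-04-12."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Value-level detectors. Note that "parses as a date" alone cannot tell a
# date of birth from any other date; a real engine would need more evidence.
VALUE_DETECTORS: Dict[str, Callable[[str], bool]] = {
    "email": lambda v: EMAIL_RE.match(v) is not None,
    "date_of_birth": is_iso_date,
    "country": lambda v: v.upper() in KNOWN_COUNTRIES,
}


def value_score(values: List[str], detector: Callable[[str], bool]) -> float:
    """Fraction of non-empty values accepted by the detector."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return 0.0
    return sum(1 for v in non_empty if detector(v)) / len(non_empty)


def infer_column(
    name: str, values: List[str], threshold: float = 0.6
) -> Tuple[Optional[str], float]:
    """Return (proposed field, confidence); the field is None below the threshold."""
    best_field: Optional[str] = None
    best_score = 0.0
    for field in TARGET_FIELDS:
        # Take the stronger of the name evidence and the value evidence.
        score = max(name_score(name, field), value_score(values, VALUE_DETECTORS[field]))
        if score > best_score:
            best_field, best_score = field, score
    if best_score < threshold:
        return None, best_score
    return best_field, best_score
```

With these assumptions, `infer_column("EmailAddr", ["a@example.com"])` proposes `email`, while a column named `notes` holding free text stays below the threshold and is left unmapped. Taking the maximum of the two signals is the simplest possible combination; a weighted blend, or a hand-off of low-scoring columns to an LLM, would be natural refinements.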

The output is a proposed mapping with a confidence score for each column, surfaced as a reviewable diff before anything is committed. The user can approve, edit, or reject each mapping individually.
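One way to picture that review step is the small Python sketch below. The class, method, and function names are hypothetical, the diff layout is invented for the example, and the rule that nothing commits while any column is still unreviewed is an assumption rather than documented InferMap behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    EDITED = "edited"
    REJECTED = "rejected"


@dataclass
class ProposedMapping:
    """One row of the proposal: source column, suggested target, confidence."""
    source_column: str
    target_field: Optional[str]
    confidence: float
    decision: Decision = Decision.PENDING

    def approve(self) -> None:
        self.decision = Decision.APPROVED

    def edit(self, new_target: str) -> None:
        # The reviewer overrides the suggested target field.
        self.target_field = new_target
        self.decision = Decision.EDITED

    def reject(self) -> None:
        self.decision = Decision.REJECTED


def render_diff(proposals: List[ProposedMapping]) -> str:
    """Format the proposals as a plain-text diff for review."""
    lines = []
    for p in proposals:
        target = p.target_field or "(unmapped)"
        lines.append(
            f"+ {p.source_column} -> {target}  [{p.confidence:.0%}, {p.decision.value}]"
        )
    return "\n".join(lines)


def commit(proposals: List[ProposedMapping]) -> Dict[str, str]:
    """Return the final mapping; refuse while any column is unreviewed (assumed rule)."""
    pending = [p.source_column for p in proposals if p.decision is Decision.PENDING]
    if pending:
        raise ValueError(f"unreviewed columns: {pending}")
    return {
        p.source_column: p.target_field
        for p in proposals
        if p.decision in (Decision.APPROVED, Decision.EDITED)
        and p.target_field is not None
    }
```

A reviewer would call `approve()`, `edit(...)`, or `reject()` on each row and then `commit(...)`; rejected columns are simply left out of the final mapping.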

In Golden Suite, this is the InferMap tool. Schema inference shortens onboarding for new sources, especially at organizations that ingest data from dozens of systems.