infermap

infermap automatically maps columns from a source schema to a target schema using a weighted scoring pipeline that combines header similarity, value distributions, and data type matching. No manual field alignment is needed.

Note: infermap is what lets the funnel pool heterogeneous sources. HubSpot's firstname and Salesforce's FirstName map to the same target field automatically, so the matcher sees one consistent schema. It is part of the autoconfig step in the hosted Workbench and is also available standalone via pip install infermap.
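
As a rough illustration of why firstname and FirstName can land on the same target field, here is a minimal sketch of header matching built only on Python's standard library. It is not infermap's implementation: the helpers normalize_header, header_similarity, and best_target are hypothetical names chosen for this example, and difflib is an assumed stand-in for whatever fuzzy matcher infermap actually uses.

```python
from difflib import SequenceMatcher


def normalize_header(name: str) -> str:
    """Lowercase a header and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def header_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two headers after normalization."""
    return SequenceMatcher(None, normalize_header(a), normalize_header(b)).ratio()


def best_target(source_header: str, targets: list[str]) -> str:
    """Pick the target field whose name is most similar to the source header."""
    return max(targets, key=lambda t: header_similarity(source_header, t))


# Hypothetical target schema, reusing the field names from Basic Usage.
targets = ["first_name", "last_name", "email", "phone"]
for header in ("firstname", "FirstName"):
    print(header, "->", best_target(header, targets))
```

Normalizing before comparing means that casing and separator differences between sources do not lower the score, so both vendor spellings resolve to first_name.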

Basic Usage

import infermap

mapping = infermap.map_schema(
    source="raw_data.csv",
    target_schema=["first_name", "last_name", "email", "phone"]
)
print(mapping)

Try It

import infermap
mapping = infermap.map_schema("data.csv", target_schema=["name", "email", "phone"])
print(mapping)

Scoring Pipeline

infermap uses three signals to score column matches:

Signal              Weight  Description
Header similarity   0.4     Fuzzy string matching on column names
Value distribution  0.35    Statistical comparison of value patterns
Type inference      0.25    Data type compatibility

Note: Override default weights with the weights parameter for domain-specific tuning.
