infermap
infermap automatically maps columns from a source schema to a target schema using a weighted scoring pipeline that combines header similarity, value distributions, and data type matching. No manual field alignment is needed.
Note: infermap is what lets the funnel pool heterogeneous sources: HubSpot's `firstname` and Salesforce's `FirstName` map to the same target field automatically, so the matcher sees one consistent schema. It is part of the autoconfig step in the hosted Workbench and is also available standalone via `pip install infermap`.
Basic Usage
import infermap

# Map the columns of raw_data.csv onto the target field names.
mapping = infermap.map_schema(
    source="raw_data.csv",
    target_schema=["first_name", "last_name", "email", "phone"],
)
print(mapping)
Scoring Pipeline
infermap uses three signals to score column matches:
| Signal | Weight | Description |
|---|---|---|
| Header similarity | 0.4 | Fuzzy string matching on column names |
| Value distribution | 0.35 | Statistical comparison of value patterns |
| Type inference | 0.25 | Data type compatibility |
Note: Override the default weights with the `weights` parameter for domain-specific tuning.
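To make the weighting concrete, here is a minimal sketch of how three per-signal scores could be combined into one match score. It is not infermap's actual implementation: the helper names (`normalize`, `header_similarity`, `combined_score`) are hypothetical, `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the library uses, the distribution and type scores are placeholder numbers, and dividing by the total weight is an assumption about how overridden weights might be handled.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Default weights copied from the signal table; the key names are hypothetical.
DEFAULT_WEIGHTS = {"header": 0.4, "distribution": 0.35, "type": 0.25}


def normalize(name: str) -> str:
    """Lowercase a column name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def header_similarity(source: str, target: str) -> float:
    """Fuzzy similarity in [0, 1] between two normalized column names."""
    return SequenceMatcher(None, normalize(source), normalize(target)).ratio()


def combined_score(
    signals: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted average of per-signal scores; missing signals count as 0."""
    merged = {**DEFAULT_WEIGHTS, **(weights or {})}
    total = sum(merged.values())
    return sum(w * signals.get(key, 0.0) for key, w in merged.items()) / total


signals = {
    "header": header_similarity("FirstName", "first_name"),
    "distribution": 0.8,  # placeholder, not a computed statistic
    "type": 1.0,          # placeholder, assumes both columns are strings
}

# Score with the default weights, then with header-heavy weights.
print(combined_score(signals))
print(combined_score(signals, {"header": 0.7, "distribution": 0.2, "type": 0.1}))
```

In this sketch, shifting weight toward the header signal raises the score for columns whose names match closely, which is the kind of domain-specific tuning the note describes. The real `weights` parameter may expect a different format, so check the library's reference before relying on these key names.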