infermap
Automatic schema mapping with weighted scoring.
infermap maps columns from a source schema to a target schema using a weighted scoring pipeline that combines three signals: header similarity, value distributions, and data type matching.
Basic Usage
import infermap
mapping = infermap.map_schema(
    source="raw_data.csv",
    target_schema=["first_name", "last_name", "email", "phone"],
)
print(mapping)
Try It
infermap demo
import infermap
mapping = infermap.map_schema("data.csv", target_schema=["name", "email", "phone"])
print(mapping)
Scoring Pipeline
infermap uses three signals to score column matches:
| Signal | Weight | Description |
|---|---|---|
| Header similarity | 0.4 | Fuzzy string matching on column names |
| Value distribution | 0.35 | Statistical comparison of value patterns |
| Type inference | 0.25 | Data type compatibility |
Note: Override the default weights with the `weights` parameter for domain-specific tuning.
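The weighted combination described in this section can be sketched in a few lines of plain Python. This is an illustrative sketch, not infermap's actual implementation. The names `DEFAULT_WEIGHTS`, `header_similarity`, `combined_score`, and `best_match` are hypothetical, as are the weight dictionary keys. `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the library uses, and the per-signal scores in the example are made-up values in the range 0 to 1.

```python
from difflib import SequenceMatcher

# Default weights taken from the table above. The key names are hypothetical.
DEFAULT_WEIGHTS = {"header": 0.4, "values": 0.35, "dtype": 0.25}


def header_similarity(source_name: str, target_name: str) -> float:
    """Fuzzy similarity of two column names in [0, 1] (difflib as a stand-in)."""
    a = source_name.lower().replace("_", " ").strip()
    b = target_name.lower().replace("_", " ").strip()
    return SequenceMatcher(None, a, b).ratio()


def combined_score(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of the per-signal scores, normalized by the weight total."""
    total = sum(weights.values())
    return sum(weights[name] * signals[name] for name in weights) / total


def best_match(candidates: dict, weights: dict = DEFAULT_WEIGHTS) -> str:
    """Return the target column whose combined score is highest."""
    return max(candidates, key=lambda t: combined_score(candidates[t], weights))


# Made-up signal scores for one source column against two target columns.
candidates = {
    "email": {"header": 0.2, "values": 0.9, "dtype": 1.0},
    "contact_name": {"header": 0.8, "values": 0.3, "dtype": 1.0},
}

# Hypothetical domain-specific tuning: trust value patterns more than headers.
value_heavy = {"header": 0.2, "values": 0.6, "dtype": 0.2}

print(best_match(candidates))               # winner under the default weights
print(best_match(candidates, value_heavy))  # winner under the tuned weights
```

With the default weights the header signal dominates and `contact_name` wins. Shifting weight toward value distribution makes `email` the better match, which is the kind of change domain-specific tuning is meant to produce. The format the real `weights` parameter expects is not shown in this document, so check the library before passing a dictionary like `value_heavy` to it.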