infermap

Automatic schema mapping with weighted scoring.

infermap maps columns from a source schema to a target schema using a weighted scoring pipeline that combines header similarity, value distributions, and data type matching.

Basic Usage

import infermap

mapping = infermap.map_schema(
    source="raw_data.csv",
    target_schema=["first_name", "last_name", "email", "phone"]
)
print(mapping)

Try It

infermap demo
import infermap
mapping = infermap.map_schema("data.csv", target_schema=["name", "email", "phone"])
print(mapping)

Scoring Pipeline

infermap uses three signals to score column matches:

Signal              Weight  Description
Header similarity   0.4     Fuzzy string matching on column names
Value distribution  0.35    Statistical comparison of value patterns
Type inference      0.25    Data type compatibility
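
As a rough illustration of how three such signals could be combined, here is a minimal sketch of a weighted sum using the default weights from the table. It is not infermap's actual implementation: the function names, the dictionary keys, and the use of difflib for fuzzy header matching are all assumptions made for this example, and the distribution and type scores are made-up numbers.

```python
from difflib import SequenceMatcher

# Default weights from the table; the key names are hypothetical.
DEFAULT_WEIGHTS = {"header": 0.4, "distribution": 0.35, "type": 0.25}


def normalize(name: str) -> str:
    """Lowercase a column name and drop underscores and spaces."""
    return name.lower().replace("_", "").replace(" ", "")


def header_similarity(source_name: str, target_name: str) -> float:
    """Fuzzy similarity of two column names in [0, 1].

    Uses difflib as a stand-in; infermap's own matcher may differ.
    """
    return SequenceMatcher(None, normalize(source_name), normalize(target_name)).ratio()


def combined_score(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of per-signal scores, each expected to lie in [0, 1]."""
    return sum(weights[name] * signals[name] for name in weights)


signals = {
    "header": header_similarity("First Name", "first_name"),  # identical after normalization
    "distribution": 0.8,  # made-up value for illustration
    "type": 1.0,          # made-up value: both columns hold strings
}
print(round(combined_score(signals), 3))
```

With these inputs the header signal contributes its full 0.4, so the combined score is driven mostly by how well the made-up distribution score holds up.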

Note: Override default weights with the weights parameter for domain-specific tuning.
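
To see why tuning the weights matters, the sketch below scores one source column against two candidate targets under two weightings. It does not call infermap and does not show the real format of the weights parameter; the helper functions, signal keys, and all scores are hypothetical values chosen for illustration.

```python
def weighted_score(signals: dict, weights: dict) -> float:
    """Weighted average of signal scores; dividing by the total keeps results in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[key] * signals[key] for key in weights) / total


def best_match(candidates: dict, weights: dict) -> str:
    """Return the target column whose signals score highest under the given weights."""
    return max(candidates, key=lambda target: weighted_score(candidates[target], weights))


# Default weights from the table, and a hypothetical value-heavy alternative.
default_weights = {"header": 0.4, "distribution": 0.35, "type": 0.25}
value_heavy_weights = {"header": 0.1, "distribution": 0.65, "type": 0.25}

# Made-up signal scores for a source column named "contact" that holds email addresses.
candidates = {
    "email": {"header": 0.2, "distribution": 0.9, "type": 1.0},
    "contact_name": {"header": 0.8, "distribution": 0.3, "type": 1.0},
}

print(best_match(candidates, default_weights))      # header similarity dominates
print(best_match(candidates, value_heavy_weights))  # value patterns dominate
```

Under the default weights the misleading header wins and the column maps to contact_name. Shifting weight toward the value distribution picks email instead, which is the kind of domain-specific adjustment the note describes.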