GoldenPipe
GoldenPipe builds composable data transformation pipelines: chain steps together to clean, normalize, and reshape data. It is designed to feed clean records to GoldenMatch.
Note: In the funnel, GoldenPipe is the transform stage between ingest and match: it normalizes each source into the shape the matcher expects. Self-host with pip install goldenpipe, or let the hosted Workbench run it as part of the pipeline.
Basic Usage
import goldenpipe
pipe = goldenpipe.Pipeline("raw_data.csv")
pipe.add_step("normalize_names")
pipe.add_step("standardize_addresses")
pipe.add_step("deduplicate_emails")
result = pipe.run()
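To make the idea of composable steps concrete, here is a minimal sketch in plain Python. It does not use or reproduce GoldenPipe's internals: SketchPipeline, trim_whitespace, and title_case_names are hypothetical names invented for this illustration, and the real library may work quite differently.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[List[Record]], List[Record]]


class SketchPipeline:
    """Hypothetical stand-in for illustration; not GoldenPipe's implementation."""

    def __init__(self, records: List[Record]) -> None:
        self.records = records
        self.steps: List[Step] = []

    def add_step(self, step: Step) -> "SketchPipeline":
        # Steps are stored in the order they are added.
        self.steps.append(step)
        return self

    def run(self) -> List[Record]:
        # Feed the output of each step into the next one.
        out = self.records
        for step in self.steps:
            out = step(out)
        return out


def trim_whitespace(records: List[Record]) -> List[Record]:
    # Strip leading/trailing whitespace from every field value.
    return [{k: v.strip() for k, v in r.items()} for r in records]


def title_case_names(records: List[Record]) -> List[Record]:
    # Naive name normalization: title-case the "name" field.
    return [{**r, "name": r["name"].title()} for r in records]


rows = [{"name": "  ada LOVELACE ", "email": "ada@example.com "}]
cleaned = (
    SketchPipeline(rows)
    .add_step(trim_whitespace)
    .add_step(title_case_names)
    .run()
)
print(cleaned)
```

Each step here is a function from a list of records to a list of records, which is one simple way to make steps chainable in any order.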
Try It
goldenpipe demo
import goldenpipe
pipe = goldenpipe.Pipeline("data.csv")
pipe.add_step("normalize_names")
result = pipe.run()
print(result.summary())
Built-in Steps
| Step | Description |
|---|---|
| normalize_names | Standardize name casing and formatting |
| standardize_addresses | Parse and normalize address components |
| deduplicate_emails | Remove duplicate email entries |
| trim_whitespace | Strip leading/trailing whitespace |
| parse_dates | Normalize date formats |
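As a rough illustration of what a step like parse_dates might involve, the sketch below normalizes a few date formats to ISO 8601 using only the Python standard library. The function name parse_date and the list of accepted formats are assumptions made for this example; the formats the built-in step actually recognizes are not documented here.

```python
from datetime import datetime
from typing import Optional

# Candidate input formats; an assumption for this sketch only.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_date(value: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = value.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


print(parse_date("03/07/2021"))
print(parse_date(" 07.03.2021 "))
print(parse_date("not a date"))
```

Returning None for unparseable values, rather than raising, is a design choice of this sketch; it lets a caller decide whether to drop, flag, or keep such records.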
Tip: Steps execute in the order they are added, so put cleanup steps before matching steps.
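To see why order matters, the sketch below runs two plain-Python functions in both orders. They are hypothetical stand-ins written for this example, not the built-in steps themselves, and the deduplication uses exact string comparison as a simple proxy for any step that compares values.

```python
from typing import Dict, List

Record = Dict[str, str]


def trim_whitespace(records: List[Record]) -> List[Record]:
    # Strip leading/trailing whitespace from every field value.
    return [{k: v.strip() for k, v in r.items()} for r in records]


def dedupe_emails(records: List[Record]) -> List[Record]:
    # Keep the first record seen for each exact email string.
    seen = set()
    kept = []
    for r in records:
        if r["email"] not in seen:
            seen.add(r["email"])
            kept.append(r)
    return kept


rows = [
    {"email": "ada@example.com"},
    {"email": " ada@example.com "},
]

trim_first = dedupe_emails(trim_whitespace(rows))
dedupe_first = trim_whitespace(dedupe_emails(rows))

print(len(trim_first))    # 1: the duplicate is caught after cleanup
print(len(dedupe_first))  # 2: stray whitespace hid the duplicate
```

Running cleanup first means later comparison steps see values in a consistent form.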