GoldenPipe

GoldenPipe builds composable transformation pipelines. Chain steps to clean, normalize, and reshape data. It is designed to feed clean records to GoldenMatch.

Note: In the funnel, GoldenPipe is the transform stage between ingest and match: it normalizes each source into the shape the matcher expects. To self-host, run pip install goldenpipe; alternatively, the hosted Workbench runs it as part of the pipeline.

Basic Usage

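import goldenpipe

# Create a pipeline over the input file
pipe = goldenpipe.Pipeline("raw_data.csv")

# Queue built-in steps; they execute in the order added
pipe.add_step("normalize_names")
pipe.add_step("standardize_addresses")
pipe.add_step("deduplicate_emails")

# Run the queued steps
result = pipe.run()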
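
To make the "chain steps by name" idea concrete, here is a minimal sketch in plain Python. It does not use or describe GoldenPipe's internals: MiniPipeline, the STEPS registry, the register decorator, and the two step bodies are all hypothetical stand-ins, and the record shape (a list of dicts with "name" and "email" keys) is an assumption made for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[List[Record]], List[Record]]

# Hypothetical registry mapping step names to functions.
# This is an illustration only, not GoldenPipe's implementation.
STEPS: Dict[str, Step] = {}


def register(name: str) -> Callable[[Step], Step]:
    """Register a function under a step name so it can be looked up later."""
    def decorator(func: Step) -> Step:
        STEPS[name] = func
        return func
    return decorator


@register("trim_whitespace")
def trim_whitespace(records: List[Record]) -> List[Record]:
    # Strip leading/trailing whitespace from every field; returns new dicts.
    return [{k: v.strip() for k, v in r.items()} for r in records]


@register("normalize_names")
def normalize_names(records: List[Record]) -> List[Record]:
    # Collapse internal runs of whitespace and title-case the "name" field.
    return [{**r, "name": " ".join(r["name"].split()).title()} for r in records]


class MiniPipeline:
    """Toy pipeline: remembers step names, then applies them in order."""

    def __init__(self, records: List[Record]) -> None:
        self.records = list(records)
        self.step_names: List[str] = []

    def add_step(self, name: str) -> "MiniPipeline":
        if name not in STEPS:
            raise KeyError(f"unknown step: {name}")
        self.step_names.append(name)
        return self  # returning self allows chained calls

    def run(self) -> List[Record]:
        out = self.records
        for name in self.step_names:
            out = STEPS[name](out)  # each step consumes the previous output
        return out


rows = [{"name": "  aDA   lovelace ", "email": " ada@example.com "}]
cleaned = MiniPipeline(rows).add_step("trim_whitespace").add_step("normalize_names").run()
print(cleaned)
```

Because each step takes a list of records and returns a new list, steps can be reordered or swapped without touching the others, which is the property "composable" refers to here.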

Try It

Command line:

goldenpipe demo

Python:

import goldenpipe
pipe = goldenpipe.Pipeline("data.csv")
pipe.add_step("normalize_names")
result = pipe.run()
print(result.summary())

Built-in Steps

Step                   Description
normalize_names        Standardize name casing and formatting
standardize_addresses  Parse and normalize address components
deduplicate_emails     Remove duplicate email entries
trim_whitespace        Strip leading/trailing whitespace
parse_dates            Normalize date formats

Tip: Steps execute in the order they are added, so put cleanup steps before matching steps.
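
The effect of ordering can be sketched with two plain Python functions. These are hypothetical stand-ins that borrow the names of two built-in steps; they are not GoldenPipe's implementations, and deduplication is used here only as an example of a step that compares values.

```python
from typing import List


def trim_whitespace(emails: List[str]) -> List[str]:
    # Cleanup step: strip leading/trailing whitespace from each value.
    return [e.strip() for e in emails]


def deduplicate_emails(emails: List[str]) -> List[str]:
    # Comparison step: drop exact duplicates, keeping first-seen order.
    return list(dict.fromkeys(emails))


raw = ["ada@example.com", " ada@example.com ", "grace@example.com"]

# Cleanup first: the padded copy is trimmed, then recognized as a duplicate.
cleanup_first = deduplicate_emails(trim_whitespace(raw))

# Comparison first: the padded copy looks unique, so it survives deduplication.
dedupe_first = trim_whitespace(deduplicate_emails(raw))

print(cleanup_first)  # ['ada@example.com', 'grace@example.com']
print(dedupe_first)   # ['ada@example.com', 'ada@example.com', 'grace@example.com']
```

Both orderings run the same two functions on the same input; only the first removes the duplicate, because exact comparison only works once the values have been cleaned.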
