Pipe your messy SaaS data in.
Push golden records out.
Connect HubSpot, Salesforce, Stripe, Cvent, Bizzabo, or an ad-hoc Postgres database, then pick a destination warehouse or a Parquet drop. We match, cluster, and surface the clusters that were not auto-merged in your review queue. You label, the system learns. Your data stays yours: bensevern.dev is the collector and filter, never the source of truth.
Onboarding friction is near zero: the multi-wave autoconfig (goldenmatch 1.18) proposes match rules, survivorship, and scorer weights from the shape of your data. No thresholds to tune, no DSL to learn before a first useful run.
- 22+ source connectors live
- 7 warehouse + cloud destinations
- B³ 0.95: B-Cubed F1 on real NC voter data
- $99/mo Pro · free tier
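For readers new to the B-Cubed F1 figure quoted above, here is a minimal sketch of how the metric is commonly computed from two record-to-cluster mappings. The function name `b_cubed` and its arguments are illustrative only. This is not the goldenmatch evaluation code, and it assumes both mappings cover the same record IDs.

```python
from collections import defaultdict


def b_cubed(predicted: dict[str, str], truth: dict[str, str]) -> tuple[float, float, float]:
    """Return (precision, recall, f1) for two record_id -> cluster_id mappings.

    Assumption: `predicted` and `truth` contain exactly the same record IDs.
    """
    pred_members = defaultdict(set)
    true_members = defaultdict(set)
    for rec, cid in predicted.items():
        pred_members[cid].add(rec)
    for rec, cid in truth.items():
        true_members[cid].add(rec)

    precision = recall = 0.0
    for rec in truth:
        p = pred_members[predicted[rec]]  # records we grouped with `rec`
        t = true_members[truth[rec]]      # records that truly belong with `rec`
        overlap = len(p & t)              # always >= 1: `rec` itself
        precision += overlap / len(p)
        recall += overlap / len(t)

    n = len(truth)
    precision /= n
    recall /= n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Precision drops when unrelated records are merged into one cluster; recall drops when true duplicates are left apart. The F1 is the harmonic mean of the two.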
HubSpot · Salesforce · Stripe · Pipedrive · Shopify · Klaviyo · S3 · GCS · Azure Blob · SFTP · Postgres · MySQL · Snowflake · BigQuery
Autoconfig proposes match rules from your schema. Ambiguous clusters surface in a review queue; your labels feed back into the next run's scorer.
Postgres · MySQL · Snowflake · BigQuery · S3 / GCS / Azure (CSV or Parquet) · browser-download CSV / Excel / Parquet. Truncate-and-load or append.
Built for
- RevOps at mid-market B2B SaaS, where duplicate accounts double-count the forecast and break attribution
- The ops analyst who hand-dedupes the campaign or account list in Excel before every send and board deck
- PE-backed platforms merging customer and vendor records across post-acquisition systems on a synergy clock
- Data teams who want MDM-grade dedup without standing up an MDM platform
Not built for
- Fortune 500 with a dedicated MDM team + Reltio license
- HIPAA / PHI workloads (no BAA yet)
- Real-time streaming match (batch only)
- Source-of-truth storage — we filter and pass through, not host
See it find the duplicates
The same matcher the workbench runs, on a sample CRM. Drag the strictness slider and watch the duplicate clusters form. No signup, nothing to install. Then bring your own file.
What the SaaS adds on top of the open-source engine
The matching engine is five MIT-licensed Python packages — self-host them if you want to wire your own funnel. The hosted version wraps them in the connectors, review queue, audit trail, and destination push so you can ship without writing the plumbing.
Source connectors
22 connectors on the ingest side: HubSpot, Salesforce, Stripe, Pipedrive, Klaviyo, Shopify, S3, GCS, Azure Blob, Postgres, MySQL, Snowflake, BigQuery, and more. OAuth or API key — both flows handled.
Warehouse + file destinations
7 destinations on the push side: Postgres / MySQL / Snowflake / BigQuery for warehouses; S3 / GCS / Azure Blob (CSV or Parquet) for file drops. Plus browser CSV / Excel / Parquet download.
Review queue + steward UX
Ambiguous clusters surface for a human call. Approve, split, or merge — decisions land in the audit trail and the next destination push.
Autoconfig — no DSL to learn
Multi-wave autoconfig proposes match rules, scorer weights, and survivorship from your schema shape. First useful run before you read the docs.
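As a rough illustration of what a survivorship rule does, the sketch below builds one golden record from a cluster by taking, for each field, the most recently updated non-empty value. The rule, the `updated_at` field, and the function name `golden_record` are assumptions made for this example; the rules autoconfig actually proposes may differ.

```python
def golden_record(cluster: list[dict], fields: list[str]) -> dict:
    """Pick, per field, the newest non-empty value across a cluster.

    Assumption: every record carries an ISO-8601 `updated_at` string,
    so plain string sorting orders records by recency.
    """
    newest_first = sorted(cluster, key=lambda r: r["updated_at"], reverse=True)
    golden = {}
    for field in fields:
        # First non-empty value wins; None if no record has the field.
        golden[field] = next((r[field] for r in newest_first if r.get(field)), None)
    return golden


cluster = [
    {"email": "jo@acme.com", "phone": "", "updated_at": "2024-05-01"},
    {"email": "jo@old-acme.com", "phone": "555-0100", "updated_at": "2023-01-15"},
]
merged = golden_record(cluster, ["email", "phone"])
```

Here the newer record supplies the email, and the older record fills in the phone number the newer one lacks.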
Multi-tenant + cryptographic audit
Clerk orgs, per-org quotas, plan-aware rate limits. Append-only audit_log with per-org SHA-256 chain. Verify the trail end-to-end.
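To show why a SHA-256 chain makes an append-only log verifiable, here is a minimal sketch using only the Python standard library. The entry layout, the all-zero genesis value, and the names `append_entry` and `verify_chain` are hypothetical; the hosted audit_log schema is not documented here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty per-org chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], payload: dict) -> dict:
    """Append a new entry linked to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"payload": payload, "prev_hash": prev, "hash": entry_hash(prev, payload)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every link; any edited, dropped, or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the one before it, changing any earlier entry invalidates every later link, which is what makes an end-to-end check meaningful.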
You keep your data — and the engine
We collect, filter, hand back. The warehouse is your source of truth. The matching engine is MIT-licensed: if we go away, you keep running. Try that with Reltio.
The orchestrator inside the funnel
Sources flow in. Autoconfig proposes the match shape from your schema. The matcher clusters records, the review queue catches the ambiguous ones, golden records materialize. Then the destination push lands them in your warehouse.
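The match, cluster, and review steps described above can be sketched in a few lines. This is a toy stand-in, not the goldenmatch API: it uses `difflib` string similarity as the scorer, and the two thresholds (`auto_merge`, `review`) are assumed values chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in scorer)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster(records: dict[str, str], auto_merge: float = 0.9, review: float = 0.7):
    """Union high-scoring pairs; route mid-scoring pairs to a review queue."""
    parent = {rid: rid for rid in records}

    def find(x: str) -> str:
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    review_queue = []
    for a, b in combinations(records, 2):
        score = similarity(records[a], records[b])
        if score >= auto_merge:
            parent[find(a)] = find(b)  # confident match: merge clusters
        elif score >= review:
            review_queue.append((a, b, round(score, 2)))  # ambiguous: human call

    clusters: dict[str, list[str]] = {}
    for rid in records:
        clusters.setdefault(find(rid), []).append(rid)
    return list(clusters.values()), review_queue


names = {"r1": "Acme Corp", "r2": "ACME CORP", "r3": "Acme Corporation", "r4": "Zenith Ltd"}
clusters, queue = cluster(names)
```

In this sample, `r1` and `r2` merge automatically, while the pairs involving `r3` score between the two thresholds and land in the review queue instead.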
Event-sourced notebooks — every tool call is a step you can replay, fork, rewind, or audit. Trust gates classify mutating tools into auto / gated / confirm based on the notebook's trust tier.
Three funnel use-cases we're shipping for
RevOps · CRM dedupe
HubSpot + Salesforce + Stripe all carry overlapping customer records. Pull all three in, surface the ambiguous merges, push a clean golden table to Snowflake or S3 nightly. Cheaper than DemandTools, faster than waiting on IT.
Event ops · attendee consolidation
Cvent + Bizzabo + Eventbrite + your CRM all have their own version of the same person. Merge them into a single attendee table you can analyse — without standing up a warehouse-side dbt model first.
PE platforms · M&A integration
Just closed your third add-on? Customer and vendor records are scattered across the acquired companies' systems, and the deal team needs a 90-day merge story. Pipe each system into the funnel and push to one consolidated warehouse.
Run the funnel in under five minutes.
Free tier includes 3 source connectors, 2 destinations, the full review queue, and a demo project that walks you through the loop. No DSL to learn first — autoconfig proposes the match shape from your schema.