Customer 360 with Golden Suite

Build a unified customer view from your CRM + billing + support + marketing systems. The honest playbook, not the slideware version.

"Customer 360" is the most common MDM use case: pull customer data from every system that touches a customer, resolve the duplicates, and produce one record per real-world person or company. This guide walks through the actual sequence on Golden Suite.

The customer-360 problem in 30 seconds

You probably have at least four of these:

  • CRM (Salesforce, HubSpot) — sales-shaped: account, contacts, opportunities
  • Billing (Stripe, Chargebee) — payment-shaped: customer, subscription, charges
  • Support (Zendesk, Intercom) — ticket-shaped: requester, organization
  • Product analytics (Mixpanel, PostHog) — event-shaped: user IDs
  • Marketing automation (Mailchimp, Customer.io) — list-shaped: lead, contact
  • Data warehouse (Snowflake, BigQuery) — table-shaped: whatever the BI team materialized

Each system thinks "Acme Corp" is one customer. Each spells it differently. Each has slightly different contact info. Nobody can answer "what's Acme's annual revenue across our products?" without manual joins.
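The spelling drift is concrete. A crude normalization pass (an illustrative sketch, not Golden Suite's matcher) shows how many "different" companies collapse to one key:

```python
import re

def normalize_company(name: str) -> str:
    """Crude canonicalization: lowercase, strip punctuation and legal suffixes."""
    name = name.lower().strip()
    name = re.sub(r"[^\w\s]", "", name)  # drop punctuation
    name = re.sub(r"\b(inc|llc|ltd|corp|corporation|co)\b", "", name)  # drop legal suffixes
    return " ".join(name.split())  # collapse whitespace

variants = ["Acme Corp.", "ACME Corporation", "Acme, Inc.", "acme corp"]
keys = {normalize_company(v) for v in variants}
# all four spellings reduce to the single key "acme"
```

Real matching is fuzzier than this (typos, abbreviations, renames), which is why you need scoring rather than exact keys, but the shape of the problem is exactly this.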

Customer 360 is the project that fixes that.

Setup sequence

Week 1 — Source landing

  1. Sign up at bensevern.dev and create a project (e.g., "Acme — customer master").
  2. Add your CRM as the first source. Salesforce + HubSpot have native OAuth connectors; SugarCRM and Pipedrive too. CSV upload is fine for a first pass if the connector isn't wired.
  3. Add billing. Stripe is the most common; QuickBooks and Xero connectors exist for SMB customers.
  4. Add the warehouse table. Most teams land here too — it's a useful "source of last resort" since marketing-team-managed tables often have richer customer data than the upstream systems.

After this, the workbench has 3+ sources loaded. Don't go further until you've eyeballed each source's profile in GoldenCheck. If you find null-rates > 50% on critical fields, fix that upstream before resolving — your golden records will be only as good as the input.
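GoldenCheck does this profiling for you, but the check itself is simple enough to sketch. Assuming your landed rows arrive as dicts, the null-rate screen looks like this (field names are illustrative):

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records where each field is missing or empty."""
    n = len(records)
    return {
        f: sum(1 for r in records if not r.get(f)) / n
        for f in fields
    }

crm_rows = [
    {"name": "Acme Corp", "email": "ops@acme.com", "phone": None},
    {"name": "Acme Inc",  "email": None,           "phone": None},
    {"name": "Globex",    "email": "hi@globex.io", "phone": "555-0100"},
]
rates = null_rates(crm_rows, ["name", "email", "phone"])
flagged = [f for f, r in rates.items() if r > 0.5]  # fields to fix upstream
```

Anything in `flagged` is a conversation with the source-system owner, not a matching problem.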

Week 2 — First resolve

  1. Run goldenmatch.dedupe_df on the combined sources. The auto-config will propose blocking signals (probably email domain + first-3-of-name) and scorer weights.
  2. Inspect the postflight. The PostflightBanner at the top of the workbench will surface ambiguous merges, demoted scorers, and warnings. You will see ambiguous merges — this is normal for first runs.
  3. Approve, split, or merge each ambiguous-merge cluster in the review queue. Track patterns: if a particular field is causing repeated ambiguity, the scorer weights for that field are mis-tuned.
  4. Adjust scorer weights if needed. The "Adjust & re-run" link on the postflight modal sends you back to the autoconfig wizard with the original event's params pre-loaded.
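To make the auto-config's proposal concrete, here is a minimal sketch of blocking plus weighted scoring. The function names and weights are illustrative assumptions, not the goldenmatch API; the point is the two-stage shape: block first, score only within blocks.

```python
from itertools import combinations

def block_key(rec: dict) -> tuple:
    """Blocking signal: email domain + first three letters of the name."""
    domain = rec["email"].split("@")[-1].lower() if rec.get("email") else ""
    return (domain, rec.get("name", "").lower()[:3])

def score(a: dict, b: dict, weights: dict) -> float:
    """Weighted field-agreement score between two candidate records."""
    total = 0.0
    for field, w in weights.items():
        va, vb = a.get(field), b.get(field)
        if va and vb and va.lower() == vb.lower():
            total += w
    return total

weights = {"email": 0.6, "phone": 0.3, "name": 0.1}
records = [
    {"name": "Acme Corp", "email": "ops@acme.com", "phone": "555-0100"},
    {"name": "Acme Inc",  "email": "ops@acme.com", "phone": "555-0100"},
    {"name": "Globex",    "email": "hi@globex.io", "phone": None},
]
# only pairs sharing a block key are ever scored
pairs = [
    (a, b) for a, b in combinations(records, 2)
    if block_key(a) == block_key(b) and score(a, b, weights) >= 0.5
]
```

When a field keeps producing ambiguity in the review queue, its weight is the knob you turn on re-run.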

After Week 2, you should have golden records for 80%+ of your customer base with high confidence. The remaining 20% need either better data, additional sources, or human stewardship.

Week 3 — Survivorship + lineage

  1. Configure survivorship rules per field. Sensible defaults for customer-360:
    • legal_name — source priority (registry of incorporation > CRM > anywhere else)
    • email — most recent
    • phone — most recent
    • annual_revenue — source priority (finance system > sales-claimed)
    • industry — most complete
    • address — most complete + most recent tie-break
  2. Re-run the pipeline with the survivorship rules in place.
  3. Inspect the Lineage tab on a few golden records to verify the field-level choices make sense. The audit trail will surface every survivorship decision.
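The rules above reduce to three strategies: source priority, most recent, and most complete. A stdlib sketch of that engine, with a toy priority map (the source names and record shape are assumptions for illustration):

```python
from datetime import date

SOURCE_PRIORITY = {"registry": 0, "billing": 1, "crm": 2}  # lower wins

def source_priority(candidates):
    return min(candidates, key=lambda c: SOURCE_PRIORITY.get(c["source"], 99))

def most_recent(candidates):
    return max(candidates, key=lambda c: c["updated_at"])

def most_complete(candidates):
    return max(candidates, key=lambda c: len(c["value"] or ""))

RULES = {
    "legal_name": source_priority,
    "email": most_recent,
    "industry": most_complete,
}

def survive(field_values: dict[str, list[dict]]) -> dict[str, str]:
    """Pick one winning value per field according to its rule."""
    return {f: RULES[f](cands)["value"] for f, cands in field_values.items()}

golden = survive({
    "legal_name": [
        {"source": "crm",      "value": "Acme",                 "updated_at": date(2024, 5, 1)},
        {"source": "registry", "value": "Acme Corporation Ltd", "updated_at": date(2023, 1, 1)},
    ],
    "email": [
        {"source": "crm",     "value": "old@acme.com", "updated_at": date(2023, 2, 2)},
        {"source": "billing", "value": "ap@acme.com",  "updated_at": date(2024, 6, 1)},
    ],
})
```

Note that the registry's older legal name still wins over the CRM's fresher one: priority rules and recency rules answer different questions, which is exactly why survivorship is configured per field.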

Week 4 — Downstream wiring

  1. Export golden records to your warehouse (BigQuery, Snowflake) via the CSV+JSON export endpoint. Schedule it as part of your dbt pipeline.
  2. Replace upstream "which customer is this?" joins with lookups against the exported golden table.
  3. Add a webhook (Phase 10 — when shipped) so downstream consumers get notified of golden-record changes in real time. Until webhooks land, daily exports are the path.

Common pitfalls

  • Trying to resolve on free-text fields first. "Company name" varies more than you think. Start with email + phone as primary identifiers, fall back to name only when both are missing.
  • Ignoring the ambiguous-merge bucket. Auto-approving everything inflates recall and tanks precision — you merge records that shouldn't be merged. Stewards reviewing ambiguous merges weekly is the right operating cadence.
  • Not refreshing survivorship rules. What was the right rule in Week 3 may be wrong in Week 12 once new sources land. Quarterly review is the minimum.
  • Treating the CRM as canonical. It isn't. CRM data is what salespeople type; it has confident typos. Use registry/billing data as the canonical source for legal-name and identifier fields; let the CRM win for behavioral fields (last activity, lifecycle stage).
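The first pitfall's identifier fallback is worth pinning down. A sketch of the order (function name and normalization choices are illustrative):

```python
import re

def match_key(rec: dict) -> tuple[str, str]:
    """Identifier fallback: email first, then phone, then name as a last resort."""
    if rec.get("email"):
        return ("email", rec["email"].strip().lower())
    if rec.get("phone"):
        return ("phone", re.sub(r"\D", "", rec["phone"]))  # digits only
    return ("name", " ".join(rec.get("name", "").lower().split()))

a = match_key({"name": "Acme Corp", "email": "Ops@Acme.com"})
b = match_key({"name": "ACME Corporation", "phone": "(555) 010-0100"})
c = match_key({"name": "Acme  Corp"})
```

The record with an email never matches on its noisy name, and name-only records are visibly the weakest tier, which is where most of your steward review time should go.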

What "good" looks like at the end

  • One golden customer record per real-world entity, with measurable confidence per cluster
  • Field-level lineage walkable in the UI: who said what, when, and why we chose this value
  • Ambiguous-merge backlog under 1% of total clusters
  • A daily export feeding downstream systems
  • A small but explicit list of stewards who own the review queue

This is the workflow ~80% of MDM projects need. The remaining 20% (cross-org matching via PPRL, real-time event-sourced golden-record changes, regulatory audit chain) layer on top of this foundation but require it first.

Next steps