Reverse-sync: push golden records back to your CRM

Close the loop — after you dedupe and resolve your CRM data into golden records, push the cleaned values back to HubSpot or Salesforce. Per-record UPDATE on a stable provider id, behind a dry-run + confirm gate.

Most of the funnel runs inbound: pull records from HubSpot, Salesforce, and other sources, match them, cluster duplicates, and produce golden records. Reverse-sync runs the funnel the other way — once you have a clean golden record, push the corrected values back to the CRM the data came from.

The canonical use case: "I deduped my HubSpot contacts — now update HubSpot with the survived values." Reverse-sync does a per-record UPDATE against a stable provider id, behind a dry-run preview and an explicit confirm gate, so you never silently overwrite a customer's CRM.

Note: Reverse-sync is UPDATE-only. Records without a provider id are skipped, never inserted — so a reverse run can't create junk rows in your CRM. Supported targets today: HubSpot (hubspot_dest) and Salesforce (salesforce_dest).

How it differs from a normal destination

A forward destination (warehouse, S3, etc.) does a bulk write of your golden table. Reverse-sync is different in two ways:

  1. Per-record UPDATE on a provider id. Each golden record carries the CRM's record id; we PATCH exactly that record. No bulk replace.
  2. Confirm gate. A dry-run shows what would change. You must approve before any write hits the CRM.

Because of that, reverse destinations have their own run path (/reverse-dry-run, then /reverse-run) rather than the forward /run. They are still rows in the destinations table, with destination_type set to hubspot_dest or salesforce_dest.

The one concept that matters: the provider id

A reverse UPDATE targets a CRM record by its provider id — the CRM's own internal record id, carried on the golden record. If that id isn't present (or isn't pristine), the push has nothing valid to update.

| Provider | Provider-id column | Where it comes from |
| --- | --- | --- |
| HubSpot | hs_object_id | Pulled by the HubSpot reader on every contact/company/deal |
| Salesforce | Id | Selected by the Salesforce reader on every SObject |

This is why the round-trip only works if you ingest and then resolve the source first: ingest writes the provider id into raw storage, and resolution carries it (untransformed) onto the golden record. Identifier columns like hs_object_id, Id, and *_id are treated as provenance and pass through resolution unmodified — they are never normalized, matched on, or transformed.
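As a rough illustration of why the provider id decides everything, the sketch below splits a list of golden records into those a reverse run could update and those it would skip. The function name and the rule that a blank string counts as "no provider id" are assumptions for illustration, not the service's actual logic.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_provider_id(
    records: List[Record], provider_id_field: str
) -> Tuple[List[Record], List[Record]]:
    """Return (writable, skipped) depending on whether a provider id is present."""
    writable: List[Record] = []
    skipped: List[Record] = []
    for record in records:
        value = record.get(provider_id_field)
        # Assumption: None and blank strings both mean "no provider id".
        if value is None or str(value).strip() == "":
            skipped.append(record)
        else:
            writable.append(record)
    return writable, skipped


# Made-up sample data; only the first record carries a HubSpot record id.
golden = [
    {"hs_object_id": "492564112121", "firstname": "Brian"},
    {"hs_object_id": None, "firstname": "Ana"},
    {"firstname": "Lee"},
]
writable, skipped = split_by_provider_id(golden, "hs_object_id")
print(len(writable), len(skipped))  # 1 2
```

Records in the skipped group are exactly the ones an UPDATE-only push leaves untouched.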

Prerequisites

1. A connected CRM source with write scope. See the per-connector setup — Connect HubSpot (a Private App access token pat-na1-… with crm.objects.contacts.read and crm.objects.contacts.write) or Connect Salesforce (OAuth with the api scope).

Warning: Write scope matters. The forward read only needs …read; reverse-sync UPDATEs the CRM, so the same source credential must also carry write scope. Reverse-sync reuses the source's stored credential — there is no second place to put a token.

2. Ingested + resolved data. Run the source (ingest), then resolve it into golden person (or company/deal) records. If you skip this, the dry-run will report 0 writable because no golden record carries a provider id yet.

End-to-end walkthrough

1. Ingest the source

Pull records into raw storage. This writes the provider id (hs_object_id / Id) alongside the other fields.

POST /api/golden/sources/{source_id}/ingest (auth required)

2. Resolve into golden records

Cluster + survive the ingested records. Scope the resolve to just the CRM source so unrelated data isn't pooled in.

POST /api/golden/resolve (auth required)
{ "entity_type": "person", "source_ids": ["<source_id>"] }

After this completes, each golden record carries a pristine hs_object_id / Id.
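The two calls in steps 1 and 2 can be sketched with the Python standard library as follows. The host name, the token placeholder, and the use of a bearer Authorization header are all assumptions, since the auth scheme is not specified here. The helpers only build the requests; sending them is left to the caller.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # hypothetical host, replace with your own
API_TOKEN = "YOUR_API_TOKEN"          # hypothetical; bearer auth is an assumption


def build_post(path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
    )


def ingest_request(source_id: str) -> urllib.request.Request:
    """Step 1: pull records from the source into raw storage."""
    return build_post(f"/api/golden/sources/{source_id}/ingest")


def resolve_request(source_id: str) -> urllib.request.Request:
    """Step 2: resolve person records, scoped to the one CRM source."""
    body = {"entity_type": "person", "source_ids": [source_id]}
    return build_post("/api/golden/resolve", body)


# To actually send a request:
#   with urllib.request.urlopen(ingest_request("<source_id>")) as resp:
#       print(resp.status)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything leaves your machine.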

3. Create a reverse destination

There is no dedicated UI for reverse destinations yet — create one via the API. The config_json is where reverse-sync gets its instructions.

POST /api/golden/destinations (auth required)
{
  "name": "HubSpot reverse",
  "destination_type": "hubspot_dest",
  "entity_type": "person",
  "config_json": {
    "provider_id_field": "hs_object_id",
    "source_id": "<the source whose credential to reuse>",
    "object": "contacts",
    "fields": ["firstname", "lastname", "email", "company", "city"]
  }
}

No connection_string is needed for a reverse destination — credentials are reused from config_json.source_id. Missing provider_id_field or source_id returns 422.
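Because a missing provider_id_field or source_id is rejected with a 422, it can help to check the payload locally before sending it. The following is a minimal client-side sketch; the function name and the error messages are hypothetical, and the server's own validation may cover more than this.

```python
from typing import List

REQUIRED_CONFIG_KEYS = ("provider_id_field", "source_id")
REVERSE_TYPES = ("hubspot_dest", "salesforce_dest")


def check_reverse_destination(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks complete."""
    problems: List[str] = []
    if payload.get("destination_type") not in REVERSE_TYPES:
        problems.append("destination_type must be hubspot_dest or salesforce_dest")
    config = payload.get("config_json") or {}
    for key in REQUIRED_CONFIG_KEYS:
        if not config.get(key):
            problems.append(f"config_json.{key} is required")
    return problems


payload = {
    "name": "HubSpot reverse",
    "destination_type": "hubspot_dest",
    "entity_type": "person",
    "config_json": {
        "provider_id_field": "hs_object_id",
        "source_id": "src_123",  # placeholder id for illustration
        "object": "contacts",
        "fields": ["firstname", "lastname", "email", "company", "city"],
    },
}
print(check_reverse_destination(payload))  # []
```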

4. Dry-run (no writes)

Preview which records would be updated. This needs no CRM auth and writes nothing — it short-circuits before credential resolution.

POST /api/golden/destinations/{id}/reverse-dry-run (auth required)

Response:
{
  "dry_run": true,
  "total": 53,
  "writable": 2,
  "skipped": 51,
  "sample": [{ "provider_id": "492564112121", "fields": { "firstname": "Brian" } }]
}
  • writable — records carrying a provider id (these get UPDATEd).
  • skipped — records with no provider id (left untouched — UPDATE-only).
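One way to review a dry-run response before confirming is sketched below. It assumes that writable and skipped always sum to total, which holds for the sample response above but is not stated as a guarantee; the function and key names in the returned summary are hypothetical.

```python
def review_dry_run(response: dict) -> dict:
    """Summarize a dry-run response so a human or script can decide whether to confirm."""
    total = response.get("total", 0)
    writable = response.get("writable", 0)
    skipped = response.get("skipped", 0)
    return {
        # Assumed invariant: every record is either writable or skipped.
        "counts_add_up": writable + skipped == total,
        # Share of golden records that would actually be UPDATEd.
        "writable_share": writable / total if total else 0.0,
        # Nothing to push if no record carries a provider id.
        "has_work": response.get("dry_run") is True and writable > 0,
    }


dry_run = {
    "dry_run": True,
    "total": 53,
    "writable": 2,
    "skipped": 51,
    "sample": [{"provider_id": "492564112121", "fields": {"firstname": "Brian"}}],
}
review = review_dry_run(dry_run)
print(review["counts_add_up"], review["has_work"])  # True True
```

A low writable share, as in this sample, is a hint that most golden records came from sources other than the CRM or were resolved before the provider id was ingested.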

5. Confirmed push

When the plan looks right, fire the write. confirm: true is required — without it you get a 400 telling you to dry-run first.

POST /api/golden/destinations/{id}/reverse-run (auth required)
{ "confirm": true }

Response:

{ "ok": true, "dry_run": false, "updated": 2, "skipped": 51, "errors": [] }

Each errors[] entry carries the provider_id and the CRM's response, so a partial failure tells you exactly which records failed and why.
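A small helper for reading the run response might look like the sketch below. The exact shape of an errors[] entry is not spelled out here, so the "response" key and the partial-failure sample are made up for illustration; only provider_id is taken from the description above.

```python
def summarize_reverse_run(response: dict) -> dict:
    """Condense a reverse-run response into counts plus the ids that failed."""
    errors = response.get("errors", [])
    return {
        "updated": response.get("updated", 0),
        "skipped": response.get("skipped", 0),
        "failed_ids": [entry.get("provider_id") for entry in errors],
        "clean": bool(response.get("ok")) and not errors,
    }


# The success response shown above.
success = {"ok": True, "dry_run": False, "updated": 2, "skipped": 51, "errors": []}
print(summarize_reverse_run(success)["clean"])  # True

# Hypothetical partial failure; the entry layout is an assumption.
partial = {
    "ok": False,
    "dry_run": False,
    "updated": 1,
    "skipped": 51,
    "errors": [{"provider_id": "492564112121", "response": {"status": 400}}],
}
print(summarize_reverse_run(partial)["failed_ids"])  # ['492564112121']
```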

config_json reference

| Key | Required | Default | Meaning |
| --- | --- | --- | --- |
| provider_id_field | yes | (none) | Golden column holding the CRM record id (hs_object_id / Id) |
| source_id | yes | (none) | The source whose stored credential to reuse for the write |
| object | no | contacts (HubSpot) / Contact (Salesforce) | CRM object type to update |
| fields | no | all writable columns | Allowlist: push only these columns. Strongly recommended (see below) |

Field handling — what actually gets written

The writer never sends every column. It automatically drops:

  • the provider id itself (hs_object_id / Id) — that's the record key, not data
  • internal columns (anything prefixed __)
  • null values
  • read-only / system properties: id, createdate, lastmodifieddate, hs_object_id (HubSpot); Id, CreatedDate, LastModifiedDate, SystemModstamp, IsDeleted, CreatedById, LastModifiedById (Salesforce). Sending these would 400 the whole record.

If you set config_json.fields, the push is restricted to exactly that list (after the drops above). A record that nets zero writable fields is skipped rather than PATCHed empty.
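The drop rules and the allowlist can be approximated in a few lines. This is a sketch of the behavior described above, not the writer's actual code: the function name, the return shape (borrowed from the dry-run sample), and the sample record are assumptions.

```python
from typing import Any, Dict, List, Optional

# Read-only / system properties listed above, keyed by destination_type.
READ_ONLY = {
    "hubspot_dest": {"id", "createdate", "lastmodifieddate", "hs_object_id"},
    "salesforce_dest": {
        "Id", "CreatedDate", "LastModifiedDate", "SystemModstamp",
        "IsDeleted", "CreatedById", "LastModifiedById",
    },
}


def build_update_payload(
    record: Dict[str, Any],
    destination_type: str,
    provider_id_field: str,
    fields: Optional[List[str]] = None,
) -> Optional[Dict[str, Any]]:
    """Return the update for one record, or None if the record would be skipped."""
    provider_id = record.get(provider_id_field)
    if provider_id is None:
        return None  # UPDATE-only: no provider id, no write
    read_only = READ_ONLY[destination_type]
    allow = set(fields) if fields is not None else None
    payload = {}
    for key, value in record.items():
        if key == provider_id_field or key.startswith("__"):
            continue  # record key and internal columns
        if value is None or key in read_only:
            continue  # nulls and system properties
        if allow is not None and key not in allow:
            continue  # outside the config_json.fields allowlist
        payload[key] = value
    if not payload:
        return None  # zero writable fields: skip rather than PATCH empty
    return {"provider_id": str(provider_id), "fields": payload}


record = {
    "hs_object_id": "492564112121",
    "firstname": "Brian",
    "lastname": None,
    "__cluster_id": "c1",
    "createdate": "2024-01-01",
    "lifecyclestage": "Lead",
}
update = build_update_payload(
    record, "hubspot_dest", "hs_object_id", ["firstname", "lastname", "email"]
)
print(update)  # {'provider_id': '492564112121', 'fields': {'firstname': 'Brian'}}
```

Here lastname is dropped as null, __cluster_id as internal, createdate as read-only, and lifecyclestage because it is not in the allowlist.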

Warning: Enum properties need the internal value, not the label. HubSpot's lifecyclestage, for example, accepts lead (internal) but rejects Lead (the display label) with a "property values were not valid" 400. Until value-mapping lands, the simplest path is to exclude enum fields from config_json.fields and push only free-text properties (firstname, lastname, email, company, city, state, zip, country).
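If you assemble config_json.fields programmatically, a simple filter keeps enum properties out of the allowlist. The set of enum fields below contains only the lifecyclestage example named above; which other properties are enums in your portal is something you would need to check yourself, and the helper name is hypothetical.

```python
from typing import Iterable, List, Set

# Only the enum property named in the warning above; extend for your own CRM.
KNOWN_ENUM_FIELDS: Set[str] = {"lifecyclestage"}


def safe_field_allowlist(
    candidates: Iterable[str], enum_fields: Set[str] = KNOWN_ENUM_FIELDS
) -> List[str]:
    """Keep candidate fields in order, dropping any known enum property."""
    return [name for name in candidates if name not in enum_fields]


fields = safe_field_allowlist(["firstname", "lifecyclestage", "email", "city"])
print(fields)  # ['firstname', 'email', 'city']
```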

Safety model

  • Confirm gate — dry-run preview, then explicit confirm: true. Nothing writes without it.
  • UPDATE-only — records without a provider id are skipped; no inserts, ever.
  • Audit log — every reverse run (success or failure) writes an audit_log row (destination.reverse_run / destination.reverse_run.failed) for org-scoped callers.
  • Quotas + rate limits — reverse runs count against your plan's destination-run quota and are rate-limited (30/hour).
  • Org scoping — org-scoped reverse destinations push the org's pooled golden records; solo destinations stay creator-scoped.

Current limitations

  • Targets: HubSpot and Salesforce only.
  • UPDATE-only: no record creation; records without a provider id are skipped.
  • API-driven: reverse destinations are created and run via the API — no dedicated UI panel yet.
  • Enum fields require manual exclusion (no label→internal-value mapping yet).
  • Per-record PATCH: writes go one record at a time (no HubSpot batch / Salesforce composite batching yet).

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Dry-run shows writable: 0 | Golden records have no provider id | Ingest and resolve the CRM source first |
| 422 on create | config_json missing provider_id_field or source_id | Add both keys |
| 400 on reverse-run | Called without confirm: true | Dry-run first, then send { "confirm": true } |
| Per-record error "property values were not valid" | An enum/read-only property in the payload | Restrict config_json.fields to free-text columns |
| Per-record error 401 | Source credential lacks write scope | Re-issue the token with write scope and re-store it |