Reverse-sync: push golden records back to your CRM

Close the loop: after you dedupe and resolve your CRM data into golden records, push the cleaned values back to HubSpot or Salesforce. Each push is a per-record UPDATE keyed on a stable provider id, behind a dry-run and confirm gate.

Most of the funnel runs inbound: pull records from HubSpot, Salesforce, and other sources, match them, cluster duplicates, and produce golden records. Reverse-sync runs the funnel the other way — once you have a clean golden record, push the corrected values back to the CRM the data came from.

The canonical use case: "I deduped my HubSpot contacts, now update HubSpot with the surviving values." Reverse-sync does a per-record UPDATE against a stable provider id, behind a dry-run preview and an explicit confirm gate, so you never silently overwrite a customer's CRM.

Note: Reverse-sync is UPDATE-only. Records without a provider id are skipped, never inserted — so a reverse run can't create junk rows in your CRM. Supported targets today: HubSpot (hubspot_dest) and Salesforce (salesforce_dest).

How it differs from a normal destination

A forward destination (warehouse, S3, etc.) does a bulk write of your golden table. Reverse-sync is different in two ways:

Per-record UPDATE on a provider id. Each golden record carries the CRM's record id; we PATCH exactly that record. No bulk replace.
Confirm gate. A dry-run shows what would change. You must approve before any write hits the CRM.

Because of that, reverse destinations have their own run path (/reverse-dry-run → /reverse-run), not the forward /run. They are still rows in the destinations table, with destination_type set to hubspot_dest or salesforce_dest.

The one concept that matters: the provider id

A reverse UPDATE targets a CRM record by its provider id: the CRM's own internal record id, carried on the golden record. If that id is missing, or was altered on its way to the golden record, the push has nothing valid to update.

Provider	Provider-id column	Where it comes from
HubSpot	`hs_object_id`	Pulled by the HubSpot reader on every contact/company/deal
Salesforce	`Id`	Selected by the Salesforce reader on every SObject

This is why the round-trip only works if you ingest and then resolve the source first: ingest writes the provider id into raw storage, and resolution carries it (untransformed) onto the golden record. Identifier columns like hs_object_id, Id, and *_id are treated as provenance and pass through resolution unmodified — they are never normalized, matched on, or transformed.

Prerequisites

1. A connected CRM source with write scope. See the per-connector setup — Connect HubSpot (a Private App access token pat-na1-… with crm.objects.contacts.read and crm.objects.contacts.write) or Connect Salesforce (OAuth with the api scope).

Warning: Write scope matters. The forward read only needs …read; reverse-sync UPDATEs the CRM, so the same source credential must also carry write scope. Reverse-sync reuses the source's stored credential — there is no second place to put a token.

2. Ingested + resolved data. Run the source (ingest), then resolve it into golden person (or company/deal) records. If you skip this, the dry-run will report 0 writable because no golden record carries a provider id yet.

End-to-end walkthrough

1. Ingest the source

Pull records into raw storage. This writes the provider id (hs_object_id / Id) alongside the other fields.

2. Resolve into golden records

Cluster the ingested records and apply survivorship to produce golden records. Scope the resolve to just the CRM source so unrelated data isn't pooled in.

{ "entity_type": "person", "source_ids": ["<source_id>"] }

After this completes, each golden record carries a pristine hs_object_id / Id.

3. Create a reverse destination

There is no dedicated UI for reverse destinations yet — create one via the API. The config_json is where reverse-sync gets its instructions.

{
  "name": "HubSpot reverse",
  "destination_type": "hubspot_dest",
  "entity_type": "person",
  "config_json": {
    "provider_id_field": "hs_object_id",
    "source_id": "<the source whose credential to reuse>",
    "object": "contacts",
    "fields": ["firstname", "lastname", "email", "company", "city"]
  }
}

No connection_string is needed for a reverse destination — credentials are reused from config_json.source_id. Missing provider_id_field or source_id returns 422.
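The required-key rule above can be checked client-side before the create call is sent. The following is a minimal sketch, not the service's own validation: the helper name missing_config_keys is hypothetical, and treating an empty string as missing is an assumption.

```python
# Hypothetical pre-flight check; the server remains the source of truth (422).
REQUIRED_KEYS = ("provider_id_field", "source_id")


def missing_config_keys(config_json: dict) -> list[str]:
    """Return the required keys that are absent or empty in config_json."""
    # Assumption: an empty value is treated the same as a missing key.
    return [key for key in REQUIRED_KEYS if not config_json.get(key)]


payload = {
    "name": "HubSpot reverse",
    "destination_type": "hubspot_dest",
    "entity_type": "person",
    "config_json": {
        "provider_id_field": "hs_object_id",
        # "source_id" deliberately left out to show the failure case
        "object": "contacts",
        "fields": ["firstname", "lastname", "email"],
    },
}

print(missing_config_keys(payload["config_json"]))  # ['source_id']
```

Catching the gap locally only saves a round trip; a create request with either key missing would still be rejected with 422.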

4. Dry-run (no writes)

Preview which records would be updated. The dry-run needs no CRM auth and writes nothing, because it short-circuits before credential resolution. The response summarizes the plan:

{
  "dry_run": true,
  "total": 53,
  "writable": 2,
  "skipped": 51,
  "sample": [{ "provider_id": "492564112121", "fields": { "firstname": "Brian" } }]
}

writable — records carrying a provider id (these get UPDATEd).
skipped — records with no provider id (left untouched — UPDATE-only).
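To make the writable / skipped split concrete, here is a rough sketch of how such a plan could be derived from a list of golden records. The function plan_reverse_sync and the in-memory record shape are assumptions for illustration; the real dry-run also applies the field drops described later on this page, which this sketch omits.

```python
def plan_reverse_sync(golden_records, provider_id_field, sample_size=1):
    """Partition golden records by whether they carry a provider id."""
    # A record is writable only if its provider id is present and non-empty.
    writable = [r for r in golden_records if r.get(provider_id_field)]
    return {
        "dry_run": True,
        "total": len(golden_records),
        "writable": len(writable),
        "skipped": len(golden_records) - len(writable),
        # Simplified sample: every column except the provider id itself.
        "sample": [
            {
                "provider_id": str(r[provider_id_field]),
                "fields": {k: v for k, v in r.items() if k != provider_id_field},
            }
            for r in writable[:sample_size]
        ],
    }


records = [
    {"hs_object_id": "492564112121", "firstname": "Brian"},
    {"hs_object_id": None, "firstname": "Dana"},  # null id: skipped
    {"firstname": "Lee"},                         # no id column: skipped
]

plan = plan_reverse_sync(records, "hs_object_id")
print(plan["total"], plan["writable"], plan["skipped"])  # 3 1 2
```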

5. Confirmed push

When the plan looks right, fire the write. confirm: true is required — without it you get a 400 telling you to dry-run first.

{ "confirm": true }

Response:

{ "ok": true, "dry_run": false, "updated": 2, "skipped": 51, "errors": [] }

Each errors[] entry carries the provider_id and the CRM's response, so a partial failure tells you exactly which records failed and why.
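The confirm gate and the per-record error reporting can be pictured with a small sketch. Everything here is illustrative: reverse_run, ConfirmRequired, the patch callable, and the "response" key are hypothetical names, and a real run would call the CRM instead of the fake used below.

```python
from typing import Callable, Iterable


class ConfirmRequired(Exception):
    """Stands in for the 400 returned when confirm is not set."""


def reverse_run(
    writable: Iterable[tuple[str, dict]],
    patch: Callable[[str, dict], None],
    confirm: bool = False,
) -> dict:
    """Update records one at a time, collecting failures per provider id."""
    if not confirm:
        raise ConfirmRequired("dry-run first, then send confirm: true")
    updated, errors = 0, []
    for provider_id, fields in writable:
        try:
            patch(provider_id, fields)  # one UPDATE per record, no batching
            updated += 1
        except Exception as exc:
            # One bad record does not abort the rest of the run.
            errors.append({"provider_id": provider_id, "response": str(exc)})
    return {"dry_run": False, "updated": updated, "errors": errors}


def fake_patch(provider_id: str, fields: dict) -> None:
    # Simulated CRM that rejects enum labels, as described under field handling.
    if fields.get("lifecyclestage") == "Lead":
        raise ValueError("property values were not valid")


result = reverse_run(
    [("1", {"firstname": "Brian"}), ("2", {"lifecyclestage": "Lead"})],
    fake_patch,
    confirm=True,
)
print(result["updated"], [e["provider_id"] for e in result["errors"]])  # 1 ['2']
```

Collecting errors instead of raising on the first failure is what lets a partial failure report exactly which provider ids were rejected.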

`config_json` reference

Key	Required	Default	Meaning
`provider_id_field`	yes	—	Golden column holding the CRM record id (`hs_object_id` / `Id`)
`source_id`	yes	—	The source whose stored credential to reuse for the write
`object`	no	`contacts` (HubSpot) / `Contact` (Salesforce)	CRM object type to update
`fields`	no	all writable columns	Allowlist — push only these columns. Strongly recommended (see below)

Field handling — what actually gets written

The writer does not send every golden column as-is. It automatically drops:

the provider id itself (hs_object_id / Id) — that's the record key, not data
internal columns (anything prefixed __)
null values
read-only / system properties: id, createdate, lastmodifieddate, hs_object_id (HubSpot); Id, CreatedDate, LastModifiedDate, SystemModstamp, IsDeleted, CreatedById, LastModifiedById (Salesforce). Sending any of these would make the CRM reject the whole record with a 400.

If you set config_json.fields, the push is restricted to exactly that list (after the drops above). A record that nets zero writable fields is skipped rather than PATCHed empty.
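The drop rules and the allowlist above can be summarized in one function. This is a sketch under assumptions, not the writer's actual code: build_update_fields and the READ_ONLY table layout are hypothetical, while the property names are the ones listed above.

```python
# Read-only / system properties, keyed by destination_type.
READ_ONLY = {
    "hubspot_dest": {"id", "createdate", "lastmodifieddate", "hs_object_id"},
    "salesforce_dest": {
        "Id", "CreatedDate", "LastModifiedDate", "SystemModstamp",
        "IsDeleted", "CreatedById", "LastModifiedById",
    },
}


def build_update_fields(record, destination_type, provider_id_field, fields=None):
    """Return only the columns that would be sent for one golden record."""
    read_only = READ_ONLY[destination_type]
    allow = set(fields) if fields is not None else None
    payload = {}
    for column, value in record.items():
        if column == provider_id_field:   # record key, not data
            continue
        if column.startswith("__"):       # internal column
            continue
        if value is None:                 # null value
            continue
        if column in read_only:           # system property
            continue
        if allow is not None and column not in allow:  # outside the allowlist
            continue
        payload[column] = value
    return payload


record = {
    "hs_object_id": "492564112121",
    "__cluster_id": 7,
    "firstname": "Brian",
    "lastname": None,
    "createdate": "2024-01-01",
    "city": "Austin",
    "phone": "555-0100",
}
allowlist = ["firstname", "lastname", "email", "company", "city"]

print(build_update_fields(record, "hubspot_dest", "hs_object_id", allowlist))
# {'firstname': 'Brian', 'city': 'Austin'}
```

An empty result from this function corresponds to the case where the record is skipped rather than PATCHed with nothing.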

Warning: Enum properties need the internal value, not the label. HubSpot's lifecyclestage, for example, accepts lead (internal) but rejects Lead (the display label) with a "property values were not valid" 400. Until value-mapping lands, the simplest path is to exclude enum fields from config_json.fields and push only free-text properties (firstname, lastname, email, company, city, state, zip, country).
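One way to apply that workaround is to compute the allowlist by subtracting the enum properties you know about. This is only a sketch: the ENUM_FIELDS set is something you would maintain yourself (lifecyclestage is the one example named above), and free_text_allowlist is a hypothetical helper.

```python
# Assumption: a hand-maintained set of enum properties for your CRM object.
ENUM_FIELDS = {"lifecyclestage"}


def free_text_allowlist(candidate_fields, enum_fields=ENUM_FIELDS):
    """Split candidate columns into (kept, dropped) for config_json.fields."""
    kept = [f for f in candidate_fields if f not in enum_fields]
    dropped = [f for f in candidate_fields if f in enum_fields]
    return kept, dropped


kept, dropped = free_text_allowlist(
    ["firstname", "lastname", "email", "lifecyclestage", "city"]
)
print(kept)     # ['firstname', 'lastname', 'email', 'city']
print(dropped)  # ['lifecyclestage']
```

The kept list is what would go into config_json.fields; the dropped list is worth logging so excluded properties are not forgotten once value-mapping is available.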

Safety model

Confirm gate — dry-run preview, then explicit confirm: true. Nothing writes without it.
UPDATE-only — records without a provider id are skipped; no inserts, ever.
Audit log — every reverse run (success or failure) writes an audit_log row (destination.reverse_run / destination.reverse_run.failed) for org-scoped callers.
Quotas + rate limits — reverse runs count against your plan's destination-run quota and are rate-limited (30/hour).
Org scoping — org-scoped reverse destinations push the org's pooled golden records; solo destinations stay creator-scoped.

Current limitations

Targets: HubSpot and Salesforce only.
UPDATE-only: no record creation; records without a provider id are skipped.
API-driven: reverse destinations are created and run via the API — no dedicated UI panel yet.
Enum fields require manual exclusion (no label→internal-value mapping yet).
Per-record PATCH: writes go one record at a time (no HubSpot batch / Salesforce composite batching yet).

Troubleshooting

Symptom	Cause	Fix
Dry-run shows `writable: 0`	Golden records have no provider id	Ingest and resolve the CRM source first
`422` on create	`config_json` missing `provider_id_field` or `source_id`	Add both keys
`400` on `reverse-run`	Called without `confirm: true`	Dry-run first, then send `{ "confirm": true }`
Per-record error "property values were not valid"	An enum/read-only property in the payload	Restrict `config_json.fields` to free-text columns
Per-record error 401	Source credential lacks write scope	Re-issue the token with write scope and re-store it
