CSV upload — the simplest source
Drag-and-drop a CSV into Golden Suite. It's the fastest path to seeing the workbench in action.
CSV upload is the simplest Golden Suite source: drag a file onto the workbench, the auto-config infers the schema, and you're matching within 30 seconds. No credentials, no auth flow, no API limits.
When CSV upload is the right choice
- First-time exploration — sample data to test matching quality
- One-off cleanup jobs — dedupe a marketing list before a campaign
- Small datasets — under ~500k rows; above that, the connectors below scale better
- Data not yet in a system — exported from a legacy app, scraped, hand-curated
- Testing schema mapping — sample 100 rows from a new source before wiring the full connector
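For the sampling case above, a quick way to cut a small test file from a larger export is a plain-Python sketch like this (the function name is ours, not part of Golden Suite):

```python
import csv
import itertools

def sample_csv(src_path: str, dst_path: str, n: int = 100) -> int:
    """Copy the header plus the first n data rows into a small test file."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))          # header row
        rows = list(itertools.islice(reader, n))
        writer.writerows(rows)
        return len(rows)                       # how many data rows were kept
```

Upload the resulting file, check the proposed mapping, then wire the real connector.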
How it works
- Go to /golden/sources → Add source → CSV upload (or just drag onto the dropzone)
- InferMap auto-detects the column types and proposes a mapping to the target schema
- Review the mapping in the autoconfig wizard
- Commit + dispatch — first golden records in seconds for small files
The CSV is uploaded directly to the backend, parsed with Polars, and stored as raw rows. Encoding is utf-8 lossy (handles latin-1 chars in legacy exports without crashing).
Sample CSV — what works
```csv
id,email,first_name,last_name,company,phone,created_at
1,sarah@acme.com,Sarah,Johnson,Acme Corp,+1-415-555-0100,2024-03-15
2,sarah.j@acmecorp.com,Sarah,Johnson-Smith,"Acme, Inc.",4155550100,2024-08-22
3,bob@example.com,Robert,Smith,Example LLC,+15555550200,2024-01-10
```
Things the auto-config will figure out:
- Column types (email = email pattern, phone = phone-like, dates = ISO timestamps)
- Which columns look like identifiers vs free-text
- Which columns are good blocking-signal candidates (email domain, name prefix)
- Which columns probably need standardization (phone has multiple formats above)
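InferMap's internals aren't shown here, but the flavor of these heuristics can be sketched as simple regex-ratio checks. Everything below — the patterns, the 0.8 threshold, the function name — is made up for illustration:

```python
import re

# Illustrative patterns only; the real InferMap heuristics are richer.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ISO_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")

def guess_column_type(values: list[str], threshold: float = 0.8) -> str:
    """Label a column by the pattern most of its non-empty values match.

    Date is checked before phone: an ISO date like 2024-03-15 is all
    digits and dashes, so it would otherwise look phone-like.
    """
    vals = [v for v in values if v]
    if not vals:
        return "empty"
    def ratio(rx: re.Pattern) -> float:
        return sum(bool(rx.match(v)) for v in vals) / len(vals)
    for name, rx in (("email", EMAIL_RE), ("date", ISO_DATE_RE), ("phone", PHONE_RE)):
        if ratio(rx) >= threshold:
            return name
    return "text"
```

Run against the sample above, the email, phone, and created_at columns each get the expected label.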
Things to watch for
Encoding
Latin-1 / Windows-1252 encoded files (common from legacy systems) parse via the utf-8 lossy path — invalid bytes get replaced with �. If you see garbled characters in the preview, re-export the source CSV as UTF-8 before uploading.
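A minimal re-export helper for that case, assuming the source is Windows-1252 (adjust `source_encoding` if your export differs; the function name is ours):

```python
def reencode_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-export a legacy CSV as UTF-8 before uploading."""
    with open(src, "r", encoding=source_encoding) as f_in, \
         open(dst, "w", encoding="utf-8", newline="") as f_out:
        for line in f_in:
            f_out.write(line)
```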
Header row
The first row is treated as headers. If your CSV doesn't have headers, add them — col1, col2, col3 is fine.
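Adding a header is a one-liner-sized fix (illustrative helper, not part of the product):

```python
def add_header(src: str, dst: str, columns: list[str]) -> None:
    """Prepend a header row to a headerless CSV."""
    with open(src, encoding="utf-8") as f_in, \
         open(dst, "w", encoding="utf-8") as f_out:
        f_out.write(",".join(columns) + "\n")
        f_out.write(f_in.read())
```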
Free-text columns
The InferMap auto-config has heuristics for guessing column purpose. Free-text fields ("Notes", "Comments", "Description") usually get marked as low-signal and excluded from matching. That's usually right. If you have a free-text field that does carry identity signal, mark it explicitly in the autoconfig wizard.
Quoting
Standard CSV quoting works ("field, with comma"). Embedded newlines in quoted fields also work. If your file has non-standard quoting (curly quotes, missing closing quotes), normalize before uploading.
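If you need to normalize curly quotes before upload, a small sketch using Python's stdlib `csv` module, which already handles the standard quoting described above (helper name is ours):

```python
import csv
import io

CURLY_TO_ASCII = str.maketrans({"\u201c": '"', "\u201d": '"',
                                "\u2018": "'", "\u2019": "'"})

def normalize_quotes(text: str) -> str:
    """Replace curly quotes with straight ASCII quotes before parsing."""
    return text.translate(CURLY_TO_ASCII)

# Standard quoting (commas inside quoted fields) parses fine as-is:
fields = next(csv.reader(io.StringIO('2,Sarah,"Acme, Inc."')))
```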
File size
The frontend dropzone caps uploads at 50 MB. Larger files: either split + upload in chunks, or use the Postgres/S3 connectors instead.
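If you go the split route, a rough splitter that repeats the header on every chunk looks like this. Sizes are approximated by character counts, and the sketch assumes no embedded newlines inside quoted fields — both acceptable shortcuts for a one-off split, neither safe for a general-purpose tool:

```python
import os

def split_csv(src: str, out_dir: str, max_bytes: int = 45 * 1024 * 1024) -> list[str]:
    """Split a CSV into chunks under the dropzone cap, repeating the header."""
    parts: list[str] = []
    with open(src, encoding="utf-8") as f:
        header = f.readline()
        part, out, size = 0, None, 0
        for line in f:
            # Start a new chunk when the next line would exceed the cap.
            if out is None or size + len(line) > max_bytes:
                if out:
                    out.close()
                part += 1
                path = os.path.join(out_dir, f"part_{part}.csv")
                parts.append(path)
                out = open(path, "w", encoding="utf-8")
                out.write(header)
                size = len(header)
            out.write(line)
            size += len(line)
        if out:
            out.close()
    return parts
```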
When NOT to use CSV upload
- Recurring ingest — every CSV upload creates a separate source row. For weekly/daily re-ingest, use a connector that supports incremental cursors (Postgres, Salesforce, Stripe, S3).
- Files over 50 MB — see size note above.
- Data that must stay in your network — the upload travels over HTTPS to bensevern.dev, so if your security policy requires keeping data inside your VPC, use a connector to your warehouse instead.
Next steps
- /docs/getting-started/quickstart — the fastest path from sign-up to first golden record
- /docs/guides/use-case/customer-360 — where CSV uploads usually live in a real workflow
- /glossary/schema-inference — the magic behind auto-config