Connecting Postgres to Golden Suite

Set up the Postgres connector with read-only credentials and a custom SQL query.

The Postgres connector ingests rows from any table or query via a standard libpq connection. Read-only credentials are strongly recommended; the connector never writes, but the principle of least privilege still applies.

What you need before starting

  • Postgres 12 or newer (older versions may work but aren't tested)
  • A user with SELECT permission on the target table(s)
  • The DB reachable from Golden Suite's backend (public TCP, VPN-tunneled, or PrivateLink for Enterprise)
  • ~5 minutes

Setup

  1. In Golden Suite, go to /golden/sources → "Add source" → "Postgres."
  2. Provide credentials:
    • Host (e.g., db.acme.com or dbprod.us-east-1.rds.amazonaws.com)
    • Port (default 5432)
    • Database name
    • Username + password
    • SSL mode (require for managed DBs; prefer for self-hosted)
  3. Test the connection — Golden Suite ships a "Test connection" button that validates the credentials before saving.
  4. Provide the query. Either a simple SELECT * FROM customers or a JOIN query that flattens related tables.
  5. Pick the cursor column for incremental ingest — usually updated_at or an auto-incrementing id.
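As a rough illustration of how the separate credential fields fit together, they combine into a standard libpq key/value DSN. This is a sketch only — the helper name and structure are illustrative, not the connector's internal API:

```python
def build_dsn(host, port, dbname, user, password, sslmode="require"):
    """Assemble a libpq key/value DSN from the form fields.

    Hypothetical helper for illustration; values containing spaces
    would need quoting in a real implementation.
    """
    parts = {
        "host": host,
        "port": str(port),
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": sslmode,  # "require" for managed DBs, "prefer" for self-hosted
    }
    return " ".join(f"{k}={v}" for k, v in parts.items())

dsn = build_dsn("db.acme.com", 5432, "prod", "goldensuite_reader", "s3cret")
```

The same information can also be pasted as a single connection string (see "Common gotchas" below).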

Sample queries

Simple table

SELECT id, email, first_name, last_name, phone, company, updated_at
FROM customers
WHERE deleted_at IS NULL;

Joined for richer features

SELECT
    c.id,
    c.email,
    c.first_name,
    c.last_name,
    c.phone,
    o.name AS company,
    o.industry,
    a.city,
    a.country,
    GREATEST(c.updated_at, COALESCE(o.updated_at, '1970-01-01'::timestamptz)) AS updated_at
FROM customers c
LEFT JOIN organizations o ON c.organization_id = o.id
LEFT JOIN addresses a ON c.primary_address_id = a.id
WHERE c.deleted_at IS NULL;

Incremental-only (uses the cursor)

The connector appends WHERE updated_at > $cursor to your query automatically when the cursor column is set. Make sure your query already includes updated_at (or whatever the cursor column is) in the SELECT — the cursor logic depends on it.
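One way to picture the rewrite (a sketch, not the connector's actual code): the user query is wrapped as a subquery and the cursor predicate is applied outside it, so it works whether or not the original query already has a WHERE clause:

```python
def apply_cursor(query, cursor_column, cursor_param="%(cursor)s"):
    """Wrap the user's query and filter on the cursor column.

    Illustrative only -- wrapping as a subquery avoids clashing with
    an existing WHERE clause; the cursor value is bound as a query
    parameter, never interpolated as text.
    """
    base = query.rstrip().rstrip(";")
    return (
        f"SELECT * FROM ({base}) AS _src "
        f"WHERE _src.{cursor_column} > {cursor_param} "
        f"ORDER BY _src.{cursor_column}"
    )

sql = apply_cursor("SELECT id, email, updated_at FROM customers", "updated_at")
```

This is also why the cursor column must appear in your SELECT list: the outer filter can only see columns the inner query exposes.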

Schema inference

Once the connector pulls a sample, InferMap proposes the field mapping. Common Postgres column names (email, phone, first_name, last_name, created_at) auto-map to standard target fields. Anything ambiguous lands in the review state for manual mapping.
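The auto-mapping behavior can be sketched as a lookup table plus a review fallback. The table below is illustrative — the actual InferMap rules are not documented here:

```python
# Illustrative auto-map table; real InferMap rules may differ.
AUTO_MAP = {
    "email": "email",
    "phone": "phone",
    "first_name": "first_name",
    "last_name": "last_name",
    "created_at": "created_at",
    "updated_at": "updated_at",
}

def propose_mapping(columns):
    """Split sampled columns into auto-mapped fields and review items."""
    mapped = {c: AUTO_MAP[c] for c in columns if c in AUTO_MAP}
    needs_review = [c for c in columns if c not in AUTO_MAP]
    return mapped, needs_review

mapped, review = propose_mapping(["email", "phone", "acct_tier"])
# acct_tier is ambiguous, so it lands in the review state
```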

Common gotchas

  • Connection string vs separate fields. Some teams prefer pasting a connection string (postgres://user:pass@host:5432/db?sslmode=require); the UI supports that as an alternate input mode.
  • Read-only credentials, not superuser. Golden Suite never needs DDL or write access. Create a dedicated user:

      CREATE USER goldensuite_reader WITH PASSWORD '...';
      GRANT CONNECT ON DATABASE prod TO goldensuite_reader;
      GRANT USAGE ON SCHEMA public TO goldensuite_reader;
      GRANT SELECT ON ALL TABLES IN SCHEMA public TO goldensuite_reader;
  • PrivateLink / VPC peering for production DBs. Public-internet access to a production DB is usually disallowed. Enterprise customers get PrivateLink wiring; talk to us at /enterprise.
  • Long-running queries time out. The connector caps individual queries at 5 minutes; if your query takes longer, optimize it (add indexes, narrow the WHERE clause, or split into multiple sources).
  • NULL handling. Postgres NULL becomes Python None becomes "missing" in the matching engine. That's usually right. If you have NULLs you want treated as a sentinel value, use COALESCE in the query: COALESCE(email, '') AS email.
  • JSONB columns. Flatten them in the query — payload->>'email' AS email. Golden Suite doesn't auto-expand JSONB.
  • TIMESTAMP WITH TIME ZONE. Standard handling; the connector preserves tz info. If you have naive timestamps (TIMESTAMP WITHOUT TIME ZONE), the assumption is UTC.
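To make the NULL gotcha above concrete, here is a sketch (assuming rows arrive as Python dicts; the function name is illustrative) of the None-means-missing convention:

```python
def present_fields(row):
    """Return only the fields the matching engine would treat as present.

    None (Postgres NULL) counts as missing; an empty string produced
    by COALESCE(email, '') counts as present.
    """
    return {k: v for k, v in row.items() if v is not None}

row = {"id": 1, "email": None, "phone": "555-0100"}
# email is dropped; an empty-string email would have been kept
```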

SQL identifier safety

The connector's query input is regex-validated (_safe_pg_identifier()) against SQL-injection patterns. Custom WHERE clauses or ORDER BY hints go through the same validation. Identifier escaping is handled for you by default; you don't need to think about it.
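A regex identifier check along these lines (a sketch; the real _safe_pg_identifier() implementation isn't shown in these docs) rejects anything that isn't a plain, unquoted Postgres identifier:

```python
import re

# Plain, unquoted Postgres identifiers: letter or underscore, then
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_$]*$")

def safe_pg_identifier(name):
    """Return name if it looks like a safe identifier, else raise."""
    if len(name) > 63 or not _IDENTIFIER.match(name):  # 63-byte Postgres limit
        raise ValueError(f"unsafe identifier: {name!r}")
    return name
```

Anything with quotes, semicolons, whitespace, or uppercase would be rejected rather than quoted, which is the conservative choice for user-supplied column names.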

Performance notes

  • The connector streams results in batches (default 1000 rows per page)
  • For tables over 1M rows, expect ~30-60 seconds per million on a typical DB
  • Incremental ingests only fetch rows where updated_at > last_cursor; this is usually <100 rows per run after initial sync
  • The cursor is persisted in golden_sources.cursor_value per source
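The batching behavior above can be pictured as a paging generator. This is a simplified sketch — against a real database this would be a server-side cursor fetching 1000 rows at a time:

```python
def stream_batches(rows, batch_size=1000):
    """Yield rows in fixed-size pages, mirroring the 1000-row default."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial page
        yield batch

pages = list(stream_batches(range(2500), batch_size=1000))
# three pages: 1000, 1000, 500 rows
```

Streaming in pages keeps memory flat regardless of table size, which is what makes multi-million-row initial syncs feasible.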

Cost considerations

  • Golden Suite side: 1 source per Postgres connection (Free: 3, Pro: 25). If you ingest from 5 different Postgres DBs, that's 5 sources.
  • Postgres side: read-replica recommended for production DBs. Hammering primary with regular ingest queries can impact OLTP performance.

Next steps