Knowledge
Explore the Golden Suite knowledge graph — 2D map, full-text search, and per-entity notes linked across repositories and concepts.
The Knowledge section surfaces everything Golden has learned from public data — open-source repositories, research papers, Kaggle datasets, benchmark results — organized as a browsable graph.
Three surfaces, one underlying graph:
| Surface | Path | Purpose |
|---|---|---|
| Map | /knowledge | 2D scatter of every entity, zoomable down to individual repos |
| Entity pages | /knowledge/[entityId] | Per-entity notes, k-NN neighbors, backlinks |
| Search | /knowledge?q=... | Hybrid full-text + semantic search across all entities |
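The three paths in the table can be built programmatically. The following is a minimal sketch, not the suite's own routing code: the helper names `map_path`, `entity_path`, and `search_path` are hypothetical, and it assumes entity IDs may contain characters that need percent-encoding.

```python
from urllib.parse import quote, urlencode

BASE_PATH = "/knowledge"  # route shared by the map and search surfaces


def map_path() -> str:
    """Path of the 2D map surface."""
    return BASE_PATH


def entity_path(entity_id: str) -> str:
    """Path of a single entity page (hypothetical helper)."""
    # Encode every reserved character so an ID containing "/" stays one path segment.
    return f"{BASE_PATH}/{quote(entity_id, safe='')}"


def search_path(query: str) -> str:
    """Path of the search surface, with the query in the `q` parameter."""
    return f"{BASE_PATH}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(map_path())
    print(entity_path("repo/abc"))
    print(search_path("graph neural nets"))
```

Encoding the ID with `safe=''` is a defensive choice; if entity IDs are always URL-safe slugs, the call is a no-op.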
Data model
Every node in the graph is an entity — a repository, paper, dataset, or concept with a stable ID. Entities carry:
- Notes — Obsidian-flavored Markdown that may contain wikilinks to other entities
- Embedding — 768-dim vector used for map layout and nearest-neighbor lookup
- Cluster assignment — produced by HDBSCAN at multiple zoom tiers for progressive disclosure
- Source URL — canonical external link (GitHub, arXiv, Kaggle, etc.)
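To make the fields above concrete, here is a rough sketch of an entity record together with the two lookups that entity pages rely on: backlinks derived from wikilinks, and nearest neighbors derived from embeddings. Everything here is an assumption for illustration: the `Entity` class and its field names, the idea that a wikilink target equals an entity ID, and the use of cosine similarity as the neighbor metric. The real backend schema and metric may differ.

```python
import math
import re
from dataclasses import dataclass, field

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures the target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


@dataclass
class Entity:
    """Hypothetical in-memory shape of one graph node."""
    entity_id: str
    notes: str                      # Obsidian-flavored Markdown
    embedding: list[float]          # 768 values in the real graph; shorter in this demo
    cluster_levels: list[int] = field(default_factory=list)  # assumed: one label per zoom tier
    source_url: str = ""


def wikilink_targets(notes: str) -> list[str]:
    """Return the distinct wikilink targets in a note, in order of first appearance."""
    targets: list[str] = []
    for match in WIKILINK.finditer(notes):
        target = match.group(1).strip()
        if target not in targets:
            targets.append(target)
    return targets


def backlinks(entities: list[Entity]) -> dict[str, list[str]]:
    """Map each entity ID to the IDs of entities whose notes link to it."""
    index: dict[str, list[str]] = {e.entity_id: [] for e in entities}
    for e in entities:
        for target in wikilink_targets(e.notes):
            # Assumes the wikilink target is the entity ID; unknown targets are skipped.
            if target in index and target != e.entity_id:
                index[target].append(e.entity_id)
    return index


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(entity: Entity, entities: list[Entity], k: int = 5) -> list[str]:
    """IDs of the k entities most similar to `entity` (brute-force scan)."""
    others = [e for e in entities if e.entity_id != entity.entity_id]
    others.sort(key=lambda e: cosine(entity.embedding, e.embedding), reverse=True)
    return [e.entity_id for e in others[:k]]


if __name__ == "__main__":
    a = Entity("umap", "Used with [[hdbscan|HDBSCAN]] and [[hdbscan]].", [1.0, 0.0])
    b = Entity("hdbscan", "Clusters the output of [[umap#Layout]].", [0.9, 0.1])
    c = Entity("kaggle-titanic", "No links.", [0.0, 1.0])
    print(backlinks([a, b, c]))
    print(nearest(a, [a, b, c], k=2))
```

The brute-force scan is fine for a demonstration; at graph scale a vector index would be the more likely choice.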
Regenerating the map
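The static 2D positions are baked into frontend/public/knowledge-map.json. The build script recomputes them from the backend embeddings: UMAP projects each embedding to 2D, HDBSCAN assigns clusters, and k-NN finds each entity's nearest neighbors.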
uv run --with numpy --with umap-learn --with python-dotenv \
  --with httpx --with "scikit-learn<1.6" --with "hdbscan==0.8.40" \
  scripts/build-knowledge-map.py
Re-run the script whenever you ingest a new batch of entities; positions and clusters stay fixed until the next build. The map component uses the 4-tier cluster_levels[] array to cross-fade labels as the user zooms.
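One way to picture the label cross-fade is a linear blend between adjacent tiers. The sketch below is an illustration only, written in Python to match the build script rather than the frontend: the function `label_opacities` is hypothetical, and it assumes zoom has already been normalized so that tier 0 is fully visible at zoom 0 and tier 3 at zoom 3. The real component's breakpoints and easing are not documented here.

```python
TIER_COUNT = 4  # matches the 4-tier cluster_levels[] array


def label_opacities(zoom: float, tier_count: int = TIER_COUNT) -> list[float]:
    """Opacity of each tier's labels for a normalized zoom in [0, tier_count - 1].

    Each tier is fully opaque at its own zoom level and fades linearly to zero
    one level away, so at most two adjacent tiers are visible at any time.
    """
    z = min(max(zoom, 0.0), float(tier_count - 1))  # clamp out-of-range zoom
    return [max(0.0, 1.0 - abs(z - tier)) for tier in range(tier_count)]


if __name__ == "__main__":
    for zoom in (0.0, 1.25, 3.0):
        print(zoom, label_opacities(zoom))
```

With this scheme the opacities always sum to 1, so total label density stays roughly constant while the user zooms.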
Related
- Repo audit guide — submit a new repo to enter the graph
- API: Entities — the backend proxy that feeds these surfaces