Apple just dropped a killer open-source visualization tool for embeddings — Embedding Atlas — and it’s surprisingly powerful for anyone working with large text+metadata datasets.
This reminds me of Nomic's Atlas, which I never got around to trying 😅
We’re talking real-time search, multi-million point rendering, and automatic clustering with labels.
One of their showcase examples visualizes ~200K wine reviews using embeddings + metadata like price, country, and tasting notes. And it is lightning fast, even in my browser! No code needed to explore it!
It nails what most LLM devs need but often hack together:
✅ UMAP projections
✅ Faceted search across metadata (e.g. “country vs. price”)
✅ Hover + tooltip on raw points
✅ Interactive filters, histograms, and cluster overlays
✅ Cross-linked scatterplot + table views
Under the hood:
• Fast rendering using WebGPU (with WebGL fallback)
• Embedding-based semantic similarity search
• Kernel density contours for spotting clusters or outliers
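For anyone curious what "embedding-based semantic similarity search" boils down to, here is a minimal sketch in plain Python. It is not Embedding Atlas's actual implementation, just brute-force cosine similarity over a few made-up rows. The field names ("text", "vector"), the toy 3-dimensional vectors, and the top_k helper are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "not similar to anything"
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k rows most similar to the query as (score, text) pairs."""
    scored = [(cosine_similarity(query, row["vector"]), row["text"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy data: real embeddings have hundreds of dimensions.
rows = [
    {"text": "crisp citrus white", "vector": [0.9, 0.1, 0.0]},
    {"text": "bold oaky red", "vector": [0.0, 0.2, 0.9]},
    {"text": "zesty lemon white", "vector": [0.8, 0.3, 0.1]},
]

results = top_k([1.0, 0.2, 0.0], rows, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Brute force like this is fine for a few thousand rows. At millions of points you would likely want an approximate nearest-neighbor index instead.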
You just upload your .jsonl or .csv with text + vector + metadata. It handles the rest: clustering, labeling, UI layout, everything.
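If you have not prepared a file like that before, here is a rough sketch of writing text + vector + metadata rows to .jsonl using only the Python standard library. The column names ("text", "vector", "country", "price") and the sample values are hypothetical, not a schema the tool requires, so check the project docs for what it actually expects.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line (the JSON Lines convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the file back as a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical rows: text, a tiny stand-in embedding, and metadata columns.
records = [
    {"text": "Bright acidity, notes of lemon.", "vector": [0.12, -0.40, 0.88],
     "country": "France", "price": 24.0},
    {"text": "Dark fruit with firm tannins.", "vector": [-0.33, 0.71, 0.05],
     "country": "Chile", "price": 18.5},
]

# Write to a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "reviews.jsonl"
    write_jsonl(records, out_path)
    loaded = read_jsonl(out_path)

print(len(loaded), "rows written and read back")
```

Keeping the vector as a plain list of floats in the same row as the text and metadata means each line is self-describing, which makes it easy to spot-check the file before uploading.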
This feels like the LLM-native version of Tableau, but optimized for text, chat, and modern data needs.
If you’re building RAG evals, search tuning, clustering explainability, or even dataset audits — this could be your new favorite tool.