Data infrastructure for AI Apache 2.0

Pinned Tweet
Introducing Chroma Context-1, a 20B parameter search agent.
> pushes the pareto frontier of agentic search
> order of magnitude faster
> order of magnitude cheaper
> Apache 2.0, open-source
145
403
4,178
1,111,197
Introducing our latest technical report: Context Rot - How Increasing Input Tokens Impacts LLM Performance Our results reveal that models do not use their context uniformly. full report in replies
39
90
898
185,384
Introducing Chroma Cloud: an open-source serverless search database that is fast, cost-effective, scalable, and reliable.
30
72
597
127,648
Chroma is now 4x faster, powered by Rust. trychroma.com/project/1.0.0
8
30
305
45,780
We introduce representative generative benchmarking—custom eval sets built from your own data that reflect real user queries. thank you @wandb for collaborating! link to report in replies
12
21
287
74,482
Introducing Chroma Package Search
Enable your AI agents to search the source code of your package dependencies
Includes 3 MCP tools:
package_search_hybrid()
package_search_grep()
package_search_read_file()
Add it to Cursor, Claude Code, VS Code, and Codex in 5 seconds
17
26
222
41,832
A deep dive on wal3 - Chroma's open-source write-ahead log built on object storage. Featuring:
- a 30-year-old lock-free algorithm
- Amazon S3's newest conditional writes feature
- a novel checksumming technique called setsum
7
26
191
32,167
Announcing the Chroma CLI!
- Browse data in the terminal
- Copy data between chroma servers
- Install sample apps
- Create databases
- Login to Chroma Cloud
- **more coming soon**
pip install -U chromadb
npm install -g chromadb
6
5
127
15,661
We all have access to the same models; which coding agent you use matters. For example, @FactoryAI with Claude Sonnet 4 can outperform Claude Code with Claude Opus 4. Learn how Chroma Cloud plays a role in Factory's context engineering strategy: trychroma.com/company/factor…
4
119
8,919
Prompting isn’t engineering, argues @isaacbmiller1. “Context engineering is how we actually build reliable AI systems.” Learn how @DSPyOSS turns prompts into programs.
4
14
94
22,660
NEW: @0interestrates on building @juliusai also on YouTube and Apple Podcasts full transcript and timestamps: studios.trychroma.com/rahul-…
2
4
67
35,353
yes
are you watching closely ?
4
1
58
23,478
Chroma Collections now support regex search. No extra tooling or configuration to set. Regex optimized indexes prevent full scans in most scenarios. Available now on Chroma Cloud, link below.
4
5
59
11,474
We can’t call it “engineering” if we can’t predict when it breaks. Context engineering turns vibes-based prompting into systems thinking. Chroma CTO Hammad Bashir shares his thoughts on context engineering.
3
7
58
9,488
Notes on organizing our ChatGPT plugins hackathon trychroma.com/blog/hackathon
4
4
46
31,379
Introducing ChromaSwift - in beta! Build search and retrieval into your iOS apps - Includes on-device persistence - Packaged with on-device MLX embedding inference github.com/chroma-core/chrom…
5
6
51
7,580
We're building a horizontally scalable, cloud-native distributed system, designed from the ground up to power retrieval for AI application workloads. Here's how. (Link in next post)
2
3
46
20,095
Chroma's distributed and serverless architecture is:
- open-source
- written in Rust
- object-storage native
- separation of storage & compute, read & write, data & control plane
- multi-tenant
built by a core team of 8 engineers (now 15)
careers at trychroma dot com
2
3
49
9,454
Chroma engineer Sanket Kedia introduces two new vector indexing methods now live on Chroma Cloud: SPANN and SPFresh.
1
1
46
14,927
Building slide decks automatically using AI: Alex Shevchenko from Ramp shares how they use story design and tournament selection to make great decks.
1
2
43
11,159
💫
I'm thrilled to announce Chroma's $18M seed round led by Quiet Capital We are excited to continue to invest deeply in the project and make it a ubiquitous open source standard trychroma.com/blog/seed
1
2
37
19,416
Chroma Cloud abstracts away all the complexity of the database so you can focus on your application and your users. No VMs to manage or configurations to tune. Watch the demo.
3
3
39
3,857
Go behind the research with @kellyhongsn and @atroyn. Our latest technical report: "Context Rot" investigates how model performance grows increasingly unreliable as input length grows.
4
3
37
6,318
Chroma 🤝 Code Search In the first video of a three-part series on code search, we walk through chunking strategies using tree-sitter.
1
1
36
4,987
We are offering a significant bounty to any AI coding assistant that can meaningfully improve our engineers' productivity in Rust. We have been frustrated by what's out there today, as even frontier reasoning models seem to fail at common tasks.
chroma is offering a [large] bounty for any ai coding assistant that knows enough rust to make a meaningful improvement to engineering productivity, as evaluated by our systems engineers. everything we've tried sucks. o1 thought about lifetimes for 9 minutes and got it wrong.
3
3
30
5,819
This is how we built a billing system to track billions of Chroma collections, fast. link in reply
2
1
31
5,379
come meet the @trychroma team!
.@vllm_project and Ollama are hosting an inference night at @ycombinator San Francisco. ❤️ Let's go open source! Come meet: vLLM project leads (@simon_mo_ and @woosuk_k) Ollama maintainers startup founders / engineers RSVP required 👇👇👇
1
24
8,285
NEW: @Altimor on building @getlindy also available on YouTube and Apple Podcasts full transcript and timestamps studios.trychroma.com/flo-on…
2
6
31
16,277
Chroma Cloud is backed by Distributed Chroma, an Apache 2.0 project. Learn about the lifecycle of read and write requests, and how Chroma delivers low latency search on top of object storage.
1
30
4,212
we 🫶 Ramp
The days of building slide decks from scratch will soon be over. Our very own @shevchenkoaalex did a talk with @trychroma on a new AI Presentation Co-Pilot his team is cooking, which generates slides by pulling brand guidelines, data, and research, then curates the best outputs.
1
3
30
5,599
Context Engineering SF - Aug 20, 6pm hosted by @trychroma, @ycombinator and @mastra join founders and engineers working at the frontier of Applied AI capabilities apply to attend in the Luma link in the replies
2
2
30
7,555
Over the weekend we crossed 20,000 stars and usage in 80,000 repos on GitHub! Our team is humbled and we’re excited to continue to build.
1
3
27
7,395
it was wonderful to have @bryanhelmig, CTO, on to discuss building AI products at @zapier lots of discussion on the untapped potential of AI in business workflows and building evals
2
7
25
12,860
Our research by @kellyhongsn on context rot, how larger context windows can degrade model performance, is cited in Anthropic’s latest on context engineering. 👀 anthropic.com/engineering/ef…
1
2
29
4,640
We recently published a rewrite of our JavaScript client focused on decreasing bundle size and improving performance. Itai walks us through what’s new.
1
27
3,426
Today, we’re launching:
- Support for sparse vectors including BM25 and SPLADE
- A new Search() API with support for hybrid search, including reciprocal rank fusion or your own custom weighting
1
1
29
11,275
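Reciprocal rank fusion, named in the launch post above, merges several ranked lists by giving each document a score of 1 / (k + rank) per list and summing. The following is a minimal pure-Python sketch of that general technique, not Chroma's Search() API: the function name, the `weights` parameter, and the commonly used default k = 60 are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,  # assumption: the conventional smoothing constant
    weights: Optional[Sequence[float]] = None,  # hypothetical per-list weights
) -> List[str]:
    """Fuse ranked lists of document IDs into a single ranking."""
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")

    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        # Ranks start at 1, so the top hit contributes weight / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)

    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # e.g. results from a dense vector index
sparse_hits = ["c", "a", "d"]  # e.g. results from BM25 or SPLADE
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
# → ['a', 'c', 'b', 'd']
```

Passing `weights=[1.0, 3.0]` favors the sparse list, which is one simple way to express custom weighting in this sketch.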
We are committed to building Chroma as a ubiquitous open source standard for search and retrieval. Chroma is downloaded over 5 million times a month. Join our community. github.com/chroma-core/chrom…
1
25
2,479
Excited to support LangChain users!
Really excited to announce a new vector store integration: @trychroma If you wanted to develop easily with local vector stores, but ran into issues with FAISS... you should try this Blog Post: blog.langchain.dev/langchain… Examples Repo: github.com/hwchase17/chroma-…
3
25
11,835
chroma now available in @gpt_index!
chroma is now available in @gpt_index! chroma is the easiest way to store and work with embeddings and documents in your a.i-enabled app. working together with great tools like gpt index / llamaindex paves the way to building the modern a.i stack.
2
25
15,774
Watch this conversation between Kalan Chan from @Sourcegraph and @atroyn on how they built Cody
1
1
25
2,470
Watch Anton speak on the future of retrieval (exact timestamp) piped.video/DY3sT4yIezs?t=711 thanks @hwchase17 @langchain for hosting
2
4
22
6,720
Chroma can now store each collection’s embedding model, provider, and index settings, so clients and the Cloud console pick the right config automatically. You can use Collection Configuration in Chroma v1.0.10 or Chroma Cloud today.
1
2
23
3,181
chroma NYC underway!
2
22
4,837
🫡
Chroma just crossed 1M downloads per month and is #1 in python, #1 OSS in JS, and now #1 in LangChain we have many exciting things planned for 2024, come help us build it in SF! - distributed systems - product engineering - design (visual and product) - developer relations nitter.app/LangChainAI/stat… notion.so/trychroma/careers-…
20
2,692
chroma cloud is coming
average 'rag' setup
2
1
19
3,738
awesome collaboration with Baseten running embedding model inference is such a common need we hear from customers and Baseten makes it fast and easy
You can now use embedding models on Baseten as part of @trychroma's Python SDK! Check out the guide by @philip_kiely for the step-by-step on how to use the integration (you can embed and do inference on your data in minutes).
4
18
2,859
congrats @amitadkri on winning the grand prize!
1
20
5,864
"just ask the AI" to write your search query for you LLMs as the query builder
🌟ChromaDB Self-Querying Retriever🌟 Last week we introduced the self-querying retriever Basic idea is to use an LLM to turn a user query into a "query" and a "filter" We've now implemented it to work with @trychroma! Docs: github.com/hwchase17/langcha…
1
3
17
10,823
we are at CalHacks, come say hello!
2
19
5,087
👋 hi Github
18
1,800
Chroma at Cal Hacks!
2
16
2,282
Collection forking is now available on Chroma Cloud! Call fork() on a collection to instantly get access to a fork, accruing storage costs only on new data, enabled by Chroma's object store backed storage engine. Use it for dataset versioning, checkpointing, or syncing coding agents with git branches. Learn more in the docs, link below.
1
1
18
2,808
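The copy-on-write idea behind forking can be illustrated with a toy in-memory model: a fork shares the parent's existing data and only new writes take up additional space. This is a sketch under stated assumptions, not Chroma's storage engine or client API; `ToyCollection` and its methods are hypothetical names.

```python
from typing import Dict, Optional, Tuple


class ToyCollection:
    """Toy copy-on-write collection (hypothetical, for illustration only)."""

    def __init__(self, shared: Tuple[Dict[str, str], ...] = ()) -> None:
        # Frozen layers shared with relatives, oldest first. Never mutated.
        self._shared = shared
        # Writes made by this collection since it was created or last forked.
        self._own: Dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        self._own[doc_id] = text

    def get(self, doc_id: str) -> Optional[str]:
        # Newest data wins: own writes first, then shared layers newest first.
        if doc_id in self._own:
            return self._own[doc_id]
        for layer in reversed(self._shared):
            if doc_id in layer:
                return layer[doc_id]
        return None

    def fork(self) -> "ToyCollection":
        # Freeze current writes into a shared layer. No documents are copied,
        # so the fork is created in constant time.
        frozen = self._shared + (self._own,)
        self._shared = frozen
        self._own = {}
        return ToyCollection(frozen)

    def bytes_written_since_fork(self) -> int:
        # Only data written after the fork point counts as new storage.
        return sum(len(text.encode("utf-8")) for text in self._own.values())


main = ToyCollection()
main.add("a", "hello")
branch = main.fork()
branch.add("b", "world!")
main.add("c", "x")
print(branch.get("a"), branch.get("c"), branch.bytes_written_since_fork())
# → hello None 6
```

The branch sees everything that existed at the fork point, neither side sees the other's later writes, and the branch's new storage is just the six bytes it wrote itself.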
.@alexgraveley came to the studio to talk about building Copilot and @ai_minion excellent advice here - particularly on fine-tuning and dataset formation
2
17
7,085
Chroma Cloud is live in production with thousands of users, from leading AI agents to fast growing startups.
1
16
1,586
Join @kellyhongsn and @atroyn for this event next Tuesday! They will share surprising insights that didn't make it into the report lu.ma/vw17piwl
Introducing our latest technical report: Context Rot - How Increasing Input Tokens Impacts LLM Performance Our results reveal that models do not use their context uniformly. full report in replies
1
17
18,436
Announcing Chroma Sync: automatically chunk, embed and index external data sources in Chroma Cloud. Starting with GitHub repositories.
1
1
16
10,015
really interesting survey from retool this morning AI application development is still early - as a community, we need to evangelize great developer workflows retool.com/reports/state-of-…
1
2
16
1,892
office warming last week
17
5,150
AI applications need many small indexes. Chroma Cloud calls these “Collections”, and can horizontally scale to billions. Leverage full-text, metadata, vector, and regex search. Use copy-on-write forking to cheaply version your data.
1
16
1,691
thanks to @tnm and @runcased for the first-day support for cloud! this project is awesome- check it out!
cased-kit 1.9 is out with first-class/first-day support for @trychroma's new Chroma Cloud. - use it for vector search - use it for docstring summarization - same api as kit's existing default local chroma support
16
2,011
Only pay for what you use:
$0 / month + $5 in free credits
$2.50 / GiB written
$0.33 / GiB / mo stored
$0.0075 / TiB queried + $0.09 / GiB returned
trychroma.com/pricing
1
15
1,365
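As a worked example of the usage-based rates in the post above, here is a small estimator. It is a sketch, not an official calculator: it reads the last rate as $0.09 per GiB returned, and because the post does not say how the $5 in free credits is applied, credits are left as an explicit parameter. All names are hypothetical.

```python
from dataclasses import dataclass

# Rates in USD, copied from the pricing post.
WRITE_PER_GIB = 2.50
STORE_PER_GIB_MONTH = 0.33
QUERY_PER_TIB = 0.0075
RETURN_PER_GIB = 0.09  # assumption: "$0.09 per GiB returned"


@dataclass
class Usage:
    gib_written: float
    gib_stored: float
    tib_queried: float
    gib_returned: float


def monthly_cost(usage: Usage, credits: float = 0.0) -> float:
    """Estimated monthly bill in USD after credits, never below zero."""
    gross = (
        usage.gib_written * WRITE_PER_GIB
        + usage.gib_stored * STORE_PER_GIB_MONTH
        + usage.tib_queried * QUERY_PER_TIB
        + usage.gib_returned * RETURN_PER_GIB
    )
    # Round to hide floating-point noise in the sum.
    return round(max(gross - credits, 0.0), 4)


example = Usage(gib_written=10, gib_stored=10, tib_queried=100, gib_returned=1)
print(monthly_cost(example))
print(monthly_cost(example, credits=5.0))
```

For the example usage the terms are 25.00 + 3.30 + 0.75 + 0.09, so the estimate is $29.14 before credits and $24.14 if the full $5 is applied.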
Thanks to everyone who made it to our Agentic Search event last night! Excited to share our work in this space soon.
Talent density at the @trychroma event last night was very high:
- @EnoReyes from Factory
- @arafatkatze from Cline
- @JimmyAustin from Replit
- @skeptrune from Mintlify/Trieve
- @jeffreyhuber, @kellyhongsn, & Drew from Chroma
- Me? 🫣
Lots of smart people talking about search
15
2,293
Powered by Chroma. Congratulations on the Series B @FactoryAI! Factory's Droids use Chroma Cloud's serverless semantic and regex search to produce some of the highest scores on agentic coding benchmarks.
The best agents for software development are becoming the best agents for everything. Droids are the best software development agents in the world, reaching #1 on Terminal-Bench. We have raised $50M from NEA, Sequoia Capital, J.P. Morgan, Nvidia, Abstract Ventures, and other industry leaders including Frank Slootman, Nikesh Arora, and Aaron Levie. Today, Droids are available to anyone, with any model, in any interface: CLI, IDE, Slack, Linear, Browser.
15
1,906
Announcing Chroma Web Sync! In addition to GitHub repos, Chroma Sync now supports web pages.
2
14
6,396
please join!
🌲Advanced Retrieval Webinar @langchain x @trychroma x @UnstructuredIO Retrieval is a key part of most GenAI systems. And there is a lot of nuance to it! Excited to bring together some leaders from across the stack to discuss it live next week! 👇 crowdcast.io/c/kqz7nl8nps42
1
14
5,215
great work @ankrgyl and team! check out the example for how to use with Chroma braintrustdata.com/docs/exam…
I'm excited to announce a new product I've been working on called @braintrustdata. Braintrust helps innovative companies and developers ship higher quality AI products by making it easy to run evals. 🔉 on
2
13
4,296
from our interview with @jacob_heller from @casetext > the best prompt engineers are former lawyers full interview coming soon!
1
1
13
2,932
We recently added ID filtering at query time, unlocking the ability to rank structured data by semantic similarity. Jai shows us how it works 👇
2
13
1,668
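The idea of ID filtering at query time can be sketched in plain Python: first restrict the candidates to a set of IDs (for example, rows selected by a structured query elsewhere), then rank only those by similarity to the query embedding. This illustrates the concept only and is not Chroma's query API; the function names and the tiny two-dimensional vectors are made up for the example.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_allowed_ids(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    allowed_ids: Iterable[str],
) -> List[Tuple[str, float]]:
    """Rank only the allowed IDs by similarity to the query, best first."""
    allowed = set(allowed_ids)
    scored = [
        (doc_id, cosine(query, vector))
        for doc_id, vector in embeddings.items()
        if doc_id in allowed  # the ID filter is applied before ranking
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


embeddings = {
    "order-1": [1.0, 0.0],
    "order-2": [0.0, 1.0],
    "order-3": [0.9, 0.1],
}
# Suppose a structured query already narrowed the rows to these two IDs.
ranked = rank_allowed_ids([1.0, 0.0], embeddings, ["order-2", "order-3"])
print([doc_id for doc_id, _ in ranked])
# → ['order-3', 'order-2']
```

Here "order-1" would have been the closest match overall, but it is excluded because it is not in the allowed set.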
More and more vertical-specific LLM and embedding models are being released providing best-in-class retrieval and product experiences -- this release from @ScienceDotIO is incredibly exciting. congrats @WillManidis and team! 🎉
thrilled to announce our collaboration with @TryChroma to bring the Modern AI Stack to Healthcare. some of healthcare's largest organizations are already using ScienceIO with Chroma to make sense of their data and build llm applications here's why: trychroma.com/blog/scienceio
1
14
4,609
Chroma Package Search is built on Chroma Cloud. We partnered with @modal to power our server-side embedding service.
14
1,592
This is the number of OSS GitHub repos that use Chroma
12
2,436
👋 hello world!
announcing the chroma embedding database. it's the easiest and best way to work with embeddings in your a.i. app. trychroma.com
1
12
6,634
our aim is to accelerate the useful and creative applications of ai here is our first technical report
today i'm pleased to share the first of a series of technical reports with the ai application developer community - our investigation into the use of linear embedding adapters in improving retrieval accuracy in realistic settings. @SuvanshSanjeev @trychroma
1
2
12
4,777
we don't know why we are trending! (3 days in a row now)
📛 chroma 🧠 Chroma is an open-source database for quick Python and JavaScript apps, facilitating document management and natural language queries. 🛠️ @trychroma 💻 Rust ⭐ 16631 🍴 1382 🔗 github.com/chroma-core/chrom…
3
13
3,465
celebrating 🎉
- 2M monthly downloads
- 20M all time downloads
- 15k GitHub stars
1
3
13
1,708
nice build @buzzillio 🖖
the first 'chroma native' product just launched, only a few days after we announced chroma! amazing! ingestai.io lets you easily build chatGPT bots in your favorite social and messaging apps within minutes, without writing code
1
13
19,078
Our new Schema() api makes collection configuration simple.
1
1
13
1,019
Run open source embedding models in JavaScript! really excited about this integration, demo coming soon
Transformers.js v2.3.0 is here! 😍🚀 What's new? 🤗 Improved Hub integration and model discoverability 🔀 Vector DBs: Integration with @trychroma 📈 More-accurate quantized whisper models Full release notes: github.com/xenova/transforme…
1
4
13
4,224
🚀 In part 2 of our code search series, we show how to index an entire repo and efficiently update the index as code changes, using Chroma Cloud’s forking.
3
13
3,870
chroma support for @AnthropicAI MCP with lots of examples thanks @jairad2001! docs.trychroma.com/integrati…
12
2,091
come listen to our talk on (we are dead serious)
AI Dev 25 is just one week away! We want to take a moment and thank our sponsors and partners for making this event possible and for being active contributors to the AI developer community. They’re bringing hands-on workshops, live demos, expert talks, and cutting-edge AI tools to make this a day packed with learning and collaboration. We’re almost there—see you on March 14 in San Francisco! 🎟️ @Meta, @GoogleCloud, @NebiusAI, @AWScloud, @trychroma, @IBMResearch, @AMD, @Replit, @LaminiAI, @crewAIInc, @arizeai, @MongoDB, @Haystack_AI, @Qualcomm, @GroqInc, @SnowflakeDB, @windsurf_ai, @codeiumdev, @langchain, @OpenAI, @NVIDIA, @LandingAI, @neo4j, @IntelAI, @vectara, @pinecone, @spice_ai, @qdrant_engine, @griptapeai, @apify
1
9
2,680
Cool demo from @tensorlake featuring Chroma
Citations from LLMs typically reference file names or links to documents. We present an algorithm to provide users with line-level citation, or point to a figure or table cell in responses from LLMs. This can be added to any of your existing RAG algorithms. 🧵
4
13
3,528
join Chroma We’re looking for curious people who are dedicated to becoming world-class at their craft to join our team. There is a lot of important work to do. Join us. careers.trychroma.com/
2
12
3,580
chroma is for the builders
Replying to @atroyn
chroma is for the builders.
11
2,354
hello - chroma is hiring. (see thread)
hello, chroma is hiring. since july we've doubled our engineering team, and continue to grow. our team is small, formidable, high-performing, and high-trust. we are about to hit significant milestones. now is a fantastic time to join. these are some priority roles:
11
1,610
Retrieval workloads for AI applications differ significantly from traditional search workloads. Everything must be designed with the requirements of AI applications in mind. Read more: trychroma.com/engineering/se…
1
10
759
fuzzy search is so useful here! ( powered by @trychroma 🫡 )
Check out my latest update on the @huggingface Hub Semantic Search Space! ✨ Explore: - Semantic search through 100k+ models & datasets - Instantly discover similar resources with one click - AI-generated TL;DR of trending ML content
1
11
5,042
Chroma Cloud has native support for SPLADE embeddings. SPLADE expands queries and documents into sparse semantic vectors, matching on related terms instead of just exact keywords. Below, SPLADE correctly identifies the top relevant document, BM25 gets distracted.
2
11
1,253
Replying to @Allan_Ryan_
100% human intelligence
1
11
3,294