It’s been a year since we launched Chroma's open source embeddings store, making it easy to build retrieval into your AI application. Since then, over 7.3 million individual machines have run Chroma. The ecosystem has evolved, and we’ve learned a lot.
Over the last twelve months, our thesis that retrieval would be a fundamental component of AI application development has been borne out. Retrieval is also becoming an increasingly important part of AI research.
Our investment in developer experience and usability, along with our integrations with other open-source projects like LlamaIndex and LangChain, has earned Chroma a leading position in retrieval in Python, and in open source overall.
So, what's next? Everybody already knows Chroma's developer cloud is coming, and soon. At launch, it will be the best product, with the best developer experience, at the best price point. Yes, it's taking longer than expected - we're confident the greater investment is worth it.
Almost since launch, we've been saying that vector search alone isn't enough. The last year has shown more of what works and what doesn't in retrieval for AI. It was gratifying to see @OpenAIDevs go into detail at Developer Day about what it takes to build retrieval that actually works. (https://lnkd.in/gBWgYmY3)
Automatic embedding model selection and the dual problem of optimal chunking, result relevance and ranking, and automatic fine-tuning of the retrieval system aren't optional extras - they're critical components, and we'll be building them into our product.
Neural retrieval approaches, like @lateinteraction's ColBERT, show a lot of promise - it's a natural extension of both retrieval and language models to use much more of the context of both query and document when retrieving.
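For intuition, here's a minimal sketch of the late-interaction ("MaxSim") scoring idea in plain NumPy - the random vectors are stand-ins for the per-token embeddings a real ColBERT model would produce:

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction relevance: match each query token embedding
    against its best-matching document token embedding, then sum
    the per-query-token maxima ("MaxSim")."""
    # Normalize token embeddings so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T  # (num_query_tokens, num_doc_tokens) similarity matrix
    return float(sim.max(axis=1).sum())  # best doc token per query token

# Toy stand-ins: a real system would use BERT-style token embeddings.
rng = np.random.default_rng(0)
query_tokens = rng.normal(size=(4, 128))  # 4 query tokens, 128-dim
doc_tokens = rng.normal(size=(50, 128))   # 50 document tokens
print(maxsim_score(query_tokens, doc_tokens))
```

Because every query token gets to pick its best match in the document, scoring uses far more of both texts than a single pooled vector comparison does.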
Looking further ahead, the retrieval-AI loop ("RAG") as it's practiced today is very primitive. Blindly stuffing the LLM's context window with data retrieved from elsewhere, and mixing instructions, data, and other context together, is clearly suboptimal.
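To make the critique concrete, here's a minimal sketch of that naive pattern using Chroma's Python client - the collection name, documents, and prompt template are illustrative placeholders:

```python
import chromadb

# A toy corpus in an in-memory Chroma collection (default embedding function).
client = chromadb.Client()
collection = client.create_collection(name="docs")
collection.add(
    ids=["1", "2"],
    documents=[
        "Chroma is an open-source embeddings store.",
        "Retrieval augments LLMs with external knowledge.",
    ],
)

# Naive RAG step 1: retrieve the nearest chunks for the user's question...
question = "What is Chroma?"
results = collection.query(query_texts=[question], n_results=2)

# ...step 2: stuff instructions, retrieved data, and the question into one prompt.
context = "\n".join(results["documents"][0])
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
# `prompt` would then be sent verbatim to an LLM completion endpoint.
print(prompt)
```

Everything the model sees - instructions, retrieved chunks, and query - is flattened into a single undifferentiated string, which is exactly the limitation described above.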
Over the next 12 months, Chroma will be experimenting with architectures and approaches that integrate the retrieval system more directly with the execution LLM. There are a lot of promising directions, and open-weights models will help a lot.
It's clear that we're still very early in AI. I often draw an analogy to the early web, or to the early days of aviation - the best thing people can do right now is experiment. Chroma will continue to make it as easy as possible to experiment with AI.
LFG.