We’re thrilled to announce that we’ve raised a $36M Series A led by Martin Casado at Andreessen Horowitz to advance the future of AI software engineering, bringing our total funding to $45 million. Through our work with top AI engineering and product teams from Notion, Stripe, Vercel, Airtable, Instacart, Zapier, Coda, The Browser Company, and many others, we’ve had a front-row seat to what it takes to build world-class AI products. Along the way, we’ve learned a few key lessons:
- Crafting effective prompts requires active iteration.
- Evaluations are crucial for systematically improving quality over time.
- Production logs provide a vital feedback loop, generating new data points that drive better evaluations.
Evals are just the first step to building AI apps. That’s why we’re also excited to introduce functions, the flexible primitive for creating prompts, tools, and scorers that sync between your codebase and the Braintrust UI.
Braintrust
Software Development
Braintrust is the end-to-end platform for building AI applications
About us
Braintrust is the enterprise-grade stack for building AI products. From evaluations, to prompt playground, to data management, we take uncertainty and tedium out of incorporating AI into your business.
- Website
- https://braintrustdata.com/
- Industry
- Software Development
- Company size
- 11-50 employees
- Type
- Privately Held
- Founded
- 2023
Updates
-
Braintrust reposted this
AI Case Study: How do you reduce hallucinations by over 80%? Start with a robust evaluations framework. Here’s a look inside our latest project teaming up with Zapier to improve the reliability of their awesome AI-powered API integration builder. And shout out to Braintrust – our go-to evals tool for this project. Link in comments.
-
We’re excited to share that Braintrust has been selected by The Information as one of the 50 most promising AI startups of 2024! #TheInformation50
-
Braintrust reposted this
Every genAI project involves building evals to define and measure correctness, but it's rarely straightforward. Take our recent AI adtech project: we’re taking heaps of unstructured, inconsistent data and organizing it into a hierarchical taxonomy with LLMs (imagine a usable output like “Food, Beverages & Tobacco > Food Items > Meat, Seafood & Eggs > Meat”). At first glance, you might think, “Easy, just define correctness.” But the reality is far more complex: there are hundreds of thousands of categories, getting the top-level category right is much more critical than perfecting the 4th level of the taxonomy, and there are many dimensions of correctness. So what did we do? Here’s how Joshua Marker and Dan Girellini built custom evals using Braintrust (code included). https://lnkd.in/gxT-4H-F
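To make the "top level matters more than the 4th level" idea concrete, here is a minimal sketch of a level-weighted taxonomy scorer in plain Python. It is not the code from the linked post and does not use any Braintrust API; the function names, the `>` separator handling, and the weights (0.5, 0.25, 0.15, 0.10) are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def parse_path(label: str) -> List[str]:
    """Split a label like 'A > B > C' into its levels, trimming whitespace."""
    return [part.strip() for part in label.split(">") if part.strip()]


def hierarchical_score(
    predicted: str,
    expected: str,
    level_weights: Sequence[float] = (0.5, 0.25, 0.15, 0.10),  # hypothetical weights
) -> float:
    """Return a score in [0, 1] that rewards matching the expected path from the top.

    Each level carries a weight, and credit stops at the first mismatch, so a
    wrong top-level category scores 0 even if deeper levels happen to match.
    Expected levels beyond len(level_weights) are ignored in this sketch.
    """
    pred = parse_path(predicted)
    exp = parse_path(expected)
    if not exp:
        return 0.0

    weights = list(level_weights[: len(exp)])
    total = sum(weights)
    earned = 0.0
    for depth, weight in enumerate(weights):
        if depth < len(pred) and pred[depth] == exp[depth]:
            earned += weight
        else:
            break  # a mismatch invalidates everything below it
    return earned / total


if __name__ == "__main__":
    expected = "Food, Beverages & Tobacco > Food Items > Meat, Seafood & Eggs > Meat"
    near_miss = "Food, Beverages & Tobacco > Food Items > Meat, Seafood & Eggs > Seafood"
    wrong_top = "Electronics > Food Items > Meat, Seafood & Eggs > Meat"
    print(hierarchical_score(expected, expected))
    print(hierarchical_score(near_miss, expected))
    print(hierarchical_score(wrong_top, expected))
```

A scorer like this captures only one dimension of correctness (path agreement); other dimensions mentioned in the post would need their own scorers.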
-
Great insights from Ankur Goyal on building AI products! https://lnkd.in/gwFamtr2
1 to 100: Braintrust
https://www.youtube.com/
-
Braintrust reposted this
We are excited to unveil the 2024 Intelligent Applications 40, highlighting this year’s most compelling private companies leveraging artificial intelligence and machine learning. We received over 380 nominations from more than 70 venture investors across 54 top-tier venture and corporate investment firms and continued our partnership with Pitchbook to incorporate its data-driven scoring into the voting process to further enhance the rigor and research behind our decision-making. At Madrona, we have spent over a decade partnering with and investing in intelligent application companies, observing their influence and impact across various verticals. These applications are poised to shape the future of software and the next wave of computing. We believe they deserve to be recognized and celebrated for their achievements! Read more insights and thoughts about this year's winners and the current AI landscape: https://lnkd.in/g2CjX_rj See the full list and methodology here: https://lnkd.in/gqy9z34A We're celebrating the #IA40 winners on Oct. 2 at the 2024 #IASummit in partnership with Microsoft, Amazon Web Services (AWS), Delta Air Lines, NYSE, Morgan Stanley, & McKinsey & Company. Request an invite here https://lnkd.in/gQ4bPc7S
-
Braintrust reposted this
Last week we had a great experience at an AI dinner hosted by Braintrust and Greylock. The round table with fellow AI practitioners was insightful. One of the topics of discussion was “fine-tuning” vs. using RAG combined with foundation models available from external vendors. It is interesting to see much of the industry move from fine-tuning to external APIs as Claude, Gemini, GPT, and Llama models ship new and improved versions.
There is definitely a debate, which Hemant Jain brought up: Is fine-tuning done for most use cases? Or will we see a reversal of #trends? Will companies that have been moving away from fine-tuning towards RAG with external models go back to fine-tuning? In domains such as finance or healthcare, where most of the data is not public, fine-tuned models might make sense. In domains where numerical accuracy is not the goal, such as summarization, small, heavily fine-tuned models might lead to great cost savings. Of course, all fine-tuning efforts require a constant source of high-quality data, which can be a challenge for a lot of companies.
For a lot of general use cases in Q&A and conversational search, it could be possible to get the desired quality of content with few-shot prompting, RAG, or an agentic framework without fine-tuning. This saves time and gives you the opportunity to experiment with multiple solutions to improve your product. It was great to hear Himanshu Gahlot & Debarag Banerjee talk about their experiences working with agents.
Now, there is no free lunch. Your ability to create a multitude of solutions for content generation also implies that you need to figure out how to evaluate each of the solutions you generate. You need a rubric that can measure the quality required for your product, and then you need a supply of SMEs and automated evaluation techniques to identify which solutions should be surfaced to customers.
Right now the practice of GenAI product building is more than a year old, and we are at a point where #accountability in #ai generated content has become more important than ever. Currently a lot of effort is going into creating a framework for evolving quality rubrics and automated evaluations. Braintrust spoke about their evaluation platform and encouraged a discussion on “fun” (or painful) evaluation issues faced by the room. My personal favorite is formatting issues and their never-ending solutions, such as function calling, none of which is foolproof. So your content is not rendered correctly at least a few percent of the time. What new ideas do you have for improving or correcting the formatting generated by your favorite pesky foundation model? #genai #roundtables #aidinner #rag #finetuning #evaluations
-
Honored to be named one of the most promising startups of 2024 by Business Insider! Thank you, Corinne Marie Riley and Shravan N., for nominating us. https://lnkd.in/g2hxzUjS
85 of the most promising startups of 2024, according to top VCs
businessinsider.com