Introducing the Patronus API - leading AI evaluation models 🚀
- Beats ragas on RAG evaluation tasks
- Beats Llama Guard and Perspective on safety tasks
- LLM judges better than SOTA LLMs
- Excels in practical domains like finance and customer support
Hundreds of elite AI teams across companies like Hospitable.com, Exa, and Algomo use Patronus to do alpha evals. ⚡ We are also excited to launch the Patronus API with Day 1 integration partners like NVIDIA, MongoDB, IBM, Portkey, and Nomic AI. The best is yet to come 🚀
Try it out: https://app.patronus.ai/
Patronus Evaluator Benchmarking: https://lnkd.in/eDkvGK8F
Read the VentureBeat coverage: https://lnkd.in/eA2kx8Uf
Read how Hospitable.com, Exa, and Algomo use Patronus:
Day 1 Integrations with NVIDIA, MongoDB, IBM, Portkey, Nomic AI: https://lnkd.in/etEgykMG 🚀
Patronus AI
Technology, Information and Internet
New York, New York 4,667 followers
Automated AI Evaluation and Security
About us
Patronus AI is the leading automated AI evaluation and security company. Our world-class platform enables enterprise development teams to score LLM performance, generate adversarial test cases, benchmark LLMs, and more. Customers use Patronus AI to detect LLM mistakes at scale and deploy AI products safely and confidently. Founded by machine learning experts from Meta AI and Meta Reality Labs, Patronus AI is on a mission to boost enterprise confidence in generative AI. We are backed by Lightspeed Venture Partners, Replit CEO Amjad Masad, Gokul Rajaram, and Fortune 500 executives and board members.
- Website
- https://patronus.ai
- Industry
- Technology, Information and Internet
- Company size
- 11-50 employees
- Headquarters
- New York, New York
- Type
- Privately Held
- Founded
- 2023
Locations
- Primary: New York, New York, US
Updates
-
Today, we’re thrilled to share that 50 AI and data executives are investing in Patronus AI! 🚀 InvestInData is a group of technical leaders across companies like Coursera, Chime, Substack, Faire, JLL, and Amazon, and they have backed awesome AI startups like CometML, Baseten, and Nimble in the past. Our partnership with InvestInData not only validates our vision, but also strengthens our mission to empower AI practitioners with the most powerful tools to use AI safely and confidently. We are looking forward to closely partnering with industry executives whose expertise will help us scale to more customers around the world. Read more in our announcement here:
-
Patronus AI reposted this
Come hear me talk about AI and security tomorrow at 5pm PT during #SFTechWeek! I'm excited to discuss how enterprise leaders are approaching the intersection of LLM evals and security, and how we at Patronus AI are tackling these problems in unique ways. Thanks to Datadog and Auth0 by Okta for organizing the panel 😎 So excited for this conversation with Monica Bajaj from Okta and Eylul Kayin from Gradient! Event link: https://lnkd.in/ezSiSHHf
-
Patronus AI reposted this
Electric morning teaming up with Databricks and Patronus AI for AI bagels! If you are an AI researcher or engineer interested in future events, feel free to reach out☕🍩 Notable Capital, Dan Cahana, Jacob Portes
-
Llama Guard is Off Duty 😲
We benchmarked popular toxicity datasets spanning languages like Portuguese, Ukrainian, and Turkish, and found that Llama Guard has a very high false negative rate for toxic content! We found that base models like Llama 3.1 do all the heavy lifting on toxicity filtering, and that the joint usage of Llama Guard might be redundant. 🤔
At Patronus AI, we rigorously benchmark all things AI to help engineers trust what they use. Reach out to [email protected] to learn more! Llama Guard might be off duty today, but you don't have to be 🎯
Read more in our blog post here: https://lnkd.in/eayCX4ct
-
Introducing Patronus AI + Portkey 🚀
Portkey is the leading open source AI gateway. It's blazing fast and supports 200+ LLMs. Developers around the world use Portkey to manage their AI products in production more easily.
Building and deploying LLM products to production is hard: there are many LLMs to choose from, various frameworks to integrate, and costs that are hard to track. But the biggest challenge of all is the lack of highly reliable LLM guardrails.
Enter Patronus AI + Portkey 🚀 You can now use 10+ Patronus evaluators in Portkey, including Lynx, the best hallucination evaluator. ✨
Read the Portkey docs on how to get started: https://lnkd.in/gX8iYPjR
Read our blog post: https://lnkd.in/gDvMRjE3
Check out Portkey on GitHub: https://lnkd.in/dqicscMy
-
Today, we are excited to release Lynx v1.1, a smaller, state-of-the-art RAG hallucination detection model 🚀
Even though companies use RAG to reduce hallucinations, LLMs can still produce unsupported or contradictory information. Since we released Lynx v1.0 a few weeks ago, thousands of developers have used it in all kinds of real-world applications.
Lynx v1.1 is the best performing RAG hallucination detection model of its size, enabling real-time hallucination detection in AI applications ✨
- Beats Claude-3.5-Sonnet on HaluBench by 3%
- Outperforms GPT-4o on medical questions and answers by 6.8%
- 1.4% higher accuracy than Lynx v1.0 on HaluBench
- Outperforms all open source models on LLM-as-judge tasks
- Open source, open weights and open data
Use Lynx v1.1 with any of our Day 1 integration partners like NVIDIA, MongoDB, and Nomic AI 🚀
Check out the Hugging Face Spaces demo: https://lnkd.in/gcjVmeNG
Download Lynx v1.1 on Hugging Face: https://lnkd.in/gvMbRddM
Download Lynx v1.1 (Quantized) on Hugging Face: https://lnkd.in/g3xAFJZh
Read the arXiv paper: https://lnkd.in/eznVjrWA
Read the blog: https://lnkd.in/eYaP5Zpe
-
Patronus AI reposted this
Patronus AI recognized the need for automated LLM evaluation to instill confidence in enterprises deploying #GenAI models. ▶️ Enter Lynx, a hallucination detection model that can use complex reasoning to identify conflicting outputs. See how the team trained Lynx using LLM Foundry, Mosaic Composer, and the Mosaic Platform 👇 https://dbricks.co/3y11unP
-
Patronus AI reposted this
Hallucinations degrade LLM outputs and user trust. Enter Lynx: Patronus AI's state-of-the-art open-source hallucination detection model. When integrated with MongoDB, it works in real time, providing immediate feedback on the faithfulness of #AI responses without manual annotations. In this blog, we provide step-by-step instructions on how to use open-source models with MongoDB Atlas and catch hallucinations using the local Lynx API. https://lnkd.in/dgCnb_nV