Last Wednesday, we introduced Moshi, the lowest-latency conversational AI ever released. Moshi can make small talk, explain various concepts, and engage in roleplay with many emotions and speaking styles. Talk to Moshi at https://moshi.chat/ and learn more about the method below.

Moshi is an audio language model that can listen and speak continuously, with no need to explicitly model speaker turns or interruptions. When talking to Moshi, you will notice that the UI displays a transcript of its speech. This does *not* come from an ASR, nor is it an input to a TTS; it is part of Moshi's integrated multimodal modelling.

Moshi is not an assistant, but rather a prototype for advancing real-time interaction with machines. It can chit-chat, discuss facts, and make recommendations, but its more groundbreaking abilities are the expressivity and spontaneity that allow for engaging, fun roleplay.

Developing Moshi required significant contributions to audio codecs, multimodal LLMs, multimodal instruction-tuning, and much more. We believe the main impact of the project will come from sharing all of Moshi's secrets in the upcoming paper and the open-sourcing of the model. For now, you can experiment with Moshi through our online demo.

The development of Moshi is more active than ever, and we will roll out frequent updates to address your feedback. This is just the beginning; let's improve it together.
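To make the "no explicit speaker turns" idea concrete, here is a minimal conceptual sketch (all names and the dummy model are our own illustrative assumptions, not Kyutai's implementation): a full-duplex audio language model consumes one frame of the user's audio per step and, in the same step, emits both its own audio token and an aligned text token, so the on-screen transcript is a by-product of the same multimodal prediction rather than a separate ASR or TTS pass.

```python
def toy_full_duplex(model_step, user_frames):
    """Run the duplex loop: the model is always listening and always
    (possibly silently) speaking, so turn-taking is implicit."""
    state = None
    out_audio, out_text = [], []
    for frame in user_frames:
        state, audio_tok, text_tok = model_step(state, frame)
        out_audio.append(audio_tok)  # model's own speech token for this step
        out_text.append(text_tok)    # aligned transcript token (may be empty)
    return out_audio, out_text

# Stand-in for a real multimodal LM: "responds" with a transformed audio
# token and emits a text token only when it is actually saying something.
def dummy_step(state, frame):
    audio_tok = frame + 1
    text_tok = "w" if frame > 5 else ""  # silence -> empty text token
    return state, audio_tok, text_tok

audio, text = toy_full_duplex(dummy_step, [1, 7, 3, 9])
# audio == [2, 8, 4, 10]; text == ["", "w", "", "w"]
```

The point of the sketch is only the loop structure: audio in, audio and text out, every timestep, with no turn boundary anywhere.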
Kyutai
Technology, Information and Internet
Build and democratize Artificial General Intelligence through open science.
About us
- Website
- https://kyutai.org/
- Industry
- Technology, Information and Internet
- Company size
- 2-10 employees
- Type
- Nonprofit
Updates
- So happy to have revealed Moshi, our new voice AI, earlier today. If you missed it, you can watch the keynote here: https://lnkd.in/d_tZWdNv And try out the model at https://lnkd.in/epAb-EeZ or, for US-based users who want lower latency, at https://lnkd.in/esRx5Gkw.
Unveiling of Moshi: the first voice-enabled AI openly accessible to all.
- Join us live tomorrow at 2:30pm CET for some exciting updates on our research! https://lnkd.in/ecT4biG2
- Kyutai reposted this
🎥 Flashback to ai-PULSE, the biggest European #AI event! 🙌 Last Friday, thousands of you joined us at STATION F, either in person or remotely, to hear the latest announcements by the Groupe iliad and Scaleway at Europe’s premier AI conference.

🚀 With €300 million already invested in it, Kyutai – the research lab initiated by Xavier Niel, Rodolphe SAADE and Eric Schmidt – will pave the way for building the future of generative AI. With Scaleway’s computing power and some of the world’s most renowned researchers, Kyutai will benefit Europe’s entire AI ecosystem.

Aude Durand, Damien Lucas, Thomas Reynaud, Nicolas Jaeger, Jensen Huang, Alexandre Défossez, Edouard Grave, Hervé Jegou, Laurent Mazare, Patrick Pérez, Neil Zeghidour, Yejin Choi, Yann LeCun, Bernhard Schölkopf, Emmanuel Macron, Jean-Noël Barrot