Homebrew Research reposted this
Llama3.1 just got ears 🦙👂

We're teaching Llama3.1 to listen - an open, ongoing experiment with AI at Meta's Llama3.1.

We're excited to share llama3s v0.2, our latest multimodal checkpoint with improved speech understanding. Llama3s v0.2 performs consistently across multiple speech understanding benchmarks. More analysis is needed, but we want to share this progress with the community and get feedback.

- Try the demo on Hugging Face: https://lnkd.in/grVVGNPD
- Build it from scratch here: https://lnkd.in/g_eB7hkv

For this round, please ask questions in English and keep audio under 10 seconds (due to the current 500-token limit on audio prompts).

Huge thanks to OpenSLR, PyTorch's Torchtune, and the teams behind multimodal architectures, Whisper, Collabora's WhisperSpeech, AudioBench, and Meta's Chameleon. Your work on warmup mechanisms was crucial 🩵

Special shoutout to contributors on our Discord and r/LocalLLaMA ❤️

Architecture, training, results, and next steps are on our blog post: https://lnkd.in/g3JY_SGS