Douwe Kiela (CEO, Contextual AI) has contributed foundational parts to the AI ecosystem. He co-wrote the first paper on RAG and has raised over $100 million to help enterprises build contextual language models that fit their use cases. Before Contextual he was the head of research at Hugging Face, and worked on the Facebook AI Research team. He remains an adjunct professor at Stanford.

On Unsupervised Learning, Douwe was incredibly open with his take on AI’s recent history and where he thinks it’s going. Some takeaways:

🤝 Alignment is exciting
Douwe is most excited about working on alignment, or making sure AI does what it's intended to do. One of the questions he asks himself is, “how do we make systems maximally useful for the end users?” There are a lot of approaches, but one he’s excited about is Anchored Preference Optimization, or APO. It’s a way to cut out a lot of manual data tagging and make sure the model uses its training data as effectively as possible.

🏗️ Infrastructure is fickle and hard
“When you’re not in a startup you think the infrastructure things are easy…turns out having a high end research cluster that actually works is incredibly hard,” Douwe says. The latest Llama paper had stats about the “amazing” number of hardware failures the team ran into, which required swapping out GPUs or even entire nodes.

🧑‍💻 Douwe’s take on open source vs. closed
Douwe’s bias is slightly towards open source. He thinks of models as a kind of triangle. The frontier models are at the apex: interesting, but the most expensive. The bottom is full open source, where anyone can do anything. “The most interesting part is the middle of the triangle, where you get the right trade offs… if you start from open source, and then you have an amazing post-training capability, you can end up in the sweet spot.”

👏 Reaction to o1
Douwe’s reaction to the o1 model highlights a significant shift in AI from focusing solely on models to embracing broader system-level thinking.
He explains how the o1 model compresses chains of reasoning, creating a more sophisticated system with stronger reasoning capabilities. Douwe finds this approach encouraging, as it aligns with the work his team is doing, particularly around retrieval mechanisms. He notes that while the model's system-centric approach is powerful, its future adoption will depend on deployment needs, particularly latency at test time.

Check out the full conversation below:
YouTube: https://lnkd.in/g4YhH2ny
Spotify: https://spoti.fi/4daJxl5
Apple: https://apple.co/4d8ndIR