How do LLM inference optimizations work? I had this question, so I sat down with the legend, Kyle Kranen, to learn more about them. NVIDIA is full of experts on anything AI-related. Is there something you want me to ask? I can even mic up Kyle again 😂❤️🤙
Quite informative, and it deserves a proper, detailed write-up. Is one already available on any blog?
Interesting and informative!!! Definitely need more convo like this sir. Thanks for sharing Nader Khalil 👏
Really informative, especially the part about optimizing the amortized complexity over many possible sequence lengths. More content like this, please!
🥵 It’s as if a painter told you the whole process, step by step, of how he made the picture. Terrible. That’s why we prefer to immerse ourselves in the picture and ramble on about how he did it.
I'm curious about accounting for token consumption: can AI hallucinations trigger excessive consumption?
Oh man, honestly, this industry is everything I want.
All AI-related topics will be interesting to listen to, so we'll stay tuned for the next episodes.
this is great! Wow. Great job to you both - really enjoyed this.
NVIDIA Eng TLM: AI Foundation Models
Had such a fun time filming this with you Nader Khalil 😊