Magnifico’s Post

Joel Caruso

Account Manager for Strategic Start Ups

NVIDIA AI TensorRT-LLM just went live on GitHub! 🔥

Let us know via the repo if you have any questions about the new toolchain: models you would like supported, features you're looking for, or bugs you run into. We'd also love to hear about your experience working with the toolchain once you've been able to run your experiments.

TL;DR: TensorRT-LLM is an open-source acceleration engine for LLMs with multi-GPU and multi-node support. It supports in-flight batching, quantization, and mixed precision on NVIDIA #Ampere and #Hopper H100 #GPUs.

Looking forward to hearing about your experience!

TensorRT-LLM: https://lnkd.in/gYiP4N_S
Triton Inference Server Backend for TensorRT-LLM: https://lnkd.in/garWpi3g
TensorRT-LLM Documentation: https://lnkd.in/gTUr3cB7
Ammo, the quantization toolkit for TensorRT-LLM: https://lnkd.in/gWbFYmTt

GitHub - NVIDIA/TensorRT-LLM: TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

github.com


Is there no support for CPU?


So excited to learn it

Mike Roberts

Account Executive - Strategic Accounts - AI in Simulation, Medical and Industrial markets

10mo

NVIDIA keeps rocking it

Sam Alim-Marvasti

Lead Architect @ MSAI Labs | PhD in Machine Learning, Software and Cloud Architect

10mo

How does this work with existing PyTorch LLMs that have TensorRT optimizations?

Francy Lisboa

AI Agribusiness Consultant & Founder | Generative AI, Prompt Engineering

10mo

That's getting exciting...
