"When implementations of the Transformer's self-attention layer utilize SRAM instead of DRAM, they can achieve significant speedups. The Tenstorrent Grayskull architecture provides a large SRAM, distributed across a grid of cores. This work presents a fused kernel for Grayskull that exclusively utilizes its large SRAM by combining matrix multiplication, attention score scaling and Softmax operations. Additionally, a dedicated Softmax kernel utilizing the SRAM and a CPU implementation serving as a baseline are presented. The Softmax operation consumes most of the runtime in the computation of attention weights from queries and keys on Grayskull. The speedup of the dedicated Softmax kernel compared to the CPU implementation is up to 10×, and the Softmax implementation inside the fused kernel is approximately 1.8× faster than the dedicated Softmax kernel. The time and memory complexity of all implementations is quadratic in sequence length. Currently, the Grayskull e150 is approximately 30× cheaper for the general public than an Nvidia H100 PCIe (a state-of-the-art GPU) and offers approximately 1.5× more SRAM." Thanks, Moritz Thüning. Read the full paper here --> https://lnkd.in/gc7qi2hy
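For readers who want to see the three operations the abstract names (matrix multiplication, score scaling, Softmax) in one place, here is a minimal plain-Python sketch. It is only a CPU-style reference under our own assumptions, not the paper's Grayskull kernel or its baseline. The function names `softmax` and `attention_weights` are hypothetical, and the 1/sqrt(d) scaling is the standard Transformer convention rather than something stated in the post.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)  # subtract the row maximum so exp() cannot overflow
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Return the n x n matrix softmax(Q K^T / sqrt(d)), one row per query.

    Hypothetical helper: the scaling factor 1/sqrt(d) is assumed from the
    standard Transformer formulation, not taken from the paper.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in queries:
        # Matrix multiplication and scaling: one scaled dot product per key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Softmax turns the scaled scores into weights that sum to 1.
        weights.append(softmax(scores))
    return weights

# Toy input: sequence length n = 3, head dimension d = 2.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(queries, keys)
```

The result has one row per query and one column per key, so it holds n × n entries for a sequence of length n. That is where the quadratic time and memory cost mentioned in the abstract comes from.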
Tenstorrent
Computer Hardware Manufacturing
Toronto, ON 40,853 followers
Building computers for AI.
About us
Tenstorrent is a next-generation computing company that builds computers for AI. Headquartered in the U.S. with offices in Austin, Texas, and Silicon Valley, and global offices in Toronto, Belgrade, Seoul, Tokyo, and Bangalore, Tenstorrent brings together experts in the field of computer architecture, ASIC design, RISC-V technology, advanced systems, and neural network compilers. Tenstorrent is backed by Eclipse Ventures and Real Ventures, Archerman Capital, Samsung Catalyst Fund, and Hyundai Motor Group among others. Join us: www.tenstorrent.com/careers.
- Website
- https://www.tenstorrent.com
- Industry
- Computer Hardware Manufacturing
- Company size
- 201-500 employees
- Headquarters
- Toronto, ON
- Type
- Privately Held
- Founded
- 2016
- Specialties
- Deep learning hardware
Updates
-
Photos from Tenstorrent's 'Intro to Buda' Hackathon, organized by Sebastian Phemister. Teams created an internal help chat bot, text-to-speech systems, speech-to-text applications, visual question answering systems, and image classification models. Find out more about TT-Buda stack on our GitHub --> https://lnkd.in/eTmcgAUS
-
Tenstorrent fellows Jasmina Vasiljevic and Davor Capalija will be talking about Blackhole and our #opensource TT-Metalium stack at @hotchipsorg. Find out more --> bit.ly/tt_hot_chips_24
-
Today we are pleased to announce the release of our #riscv Architectural Compatibility Suite, now available in our GitHub repository. This is arguably the most comprehensive open-source RISC-V compatibility suite, with over 13,000 tests that cover a broad spectrum of the RISC-V ISA. Read more here -->
Tenstorrent is Continuing its Contributions to the RISC-V Open Source Ecosystem
tenstorrent.com
-
"Tenstorrent’s strategy is to attract AI developers by going beyond better hardware to offer an easy-to-use open-source software stack." Talking about the importance of open source in automotive solutions with The Ojo-Yoshida Report
With Nvidia, It’s Always Take It or Leave It
https://ojoyoshidareport.com
-
👋 Cambridge, come join us at The Varsity Hotel to learn more about #tenstorrent and #riscv. Sign up here --> https://lnkd.in/eayr6SJx
-
"Tenstorrent’s silver lining is the scalable architecture of its AI silicon solutions. Its approach is to combine graph computer-based AI hardware with RISC-V compute cores. The #riscv cores afford a processing solution with built-in flexibility to support future models." - Thanks The Ojo-Yoshida Report
Tenstorrent’s Not-So-Secret AI Plan: ‘Don’t Compete with Nvidia’
https://ojoyoshidareport.com
-
Congratulations to Tenstorrent Senior Principal Architect Ken Dockser for winning the #RISCV Board of Directors Technical and Software award for continued efforts in driving innovation across the RISC-V ecosystem!
-
We are happy to announce that we have brought up support for Llama-3.1-70B inference on Tenstorrent’s 8-chip systems, the TT-QuietBox and the TT-LoudBox. The source code for Llama-3.1-70B and other models that are supported is on our GitHub ---> https://lnkd.in/e7vXJYX3
Llama-3.1 Announcement
tenstorrent.com