May 20, 2021 · In this work, we propose a novel architecture to efficiently harness the dual-side sparsity (i.e., weight and activation sparsity).
Our evaluation shows that our design can fully unleash the dual-side DNN sparsity and improve the performance by up to one order of magnitude with small ...
Abstract—Leveraging sparsity in deep neural network (DNN) models holds significant promise for accelerating model inference.
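To make "dual-side" concrete, here is a minimal NumPy sketch of a matrix multiply that only does work where a nonzero weight meets a nonzero activation. The shapes, densities, and the sparsify helper are illustrative assumptions, not the paper's hardware scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparsify(x, density):
    """Zero out entries so roughly `density` of them remain nonzero (assumed helper)."""
    return x * (rng.random(x.shape) < density)

M, K, N = 64, 64, 64
W = sparsify(rng.standard_normal((M, K)), density=0.25)  # weight-side sparsity
A = sparsify(rng.standard_normal((K, N)), density=0.25)  # activation-side sparsity

# Dense baseline: M*K*N multiply-accumulates regardless of zeros.
out_dense = W @ A

# Dual-side-aware execution: iterate only over positions where BOTH
# operands are nonzero, counting the useful MACs along the way.
out_sparse = np.zeros((M, N))
macs = 0
for i, k in zip(*np.nonzero(W)):      # nonzero weights
    cols = np.nonzero(A[k])[0]        # nonzero activations in row k of A
    out_sparse[i, cols] += W[i, k] * A[k, cols]
    macs += cols.size

assert np.allclose(out_dense, out_sparse)
print(f"useful MACs: {macs} / {M*K*N} = {macs / (M*K*N):.2%}")
```

With two independently ~25%-dense operands, only about 6% of the dense MACs are useful, which illustrates the headroom a dual-side design tries to capture that a single-side (weight-only or activation-only) scheme cannot.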
In this work, we propose an algorithm and hardware co-design methodology to accelerate sparse neural networks.
Oct 22, 2024 · DSTC: Dual-Side Sparsity Tensor Core ... Researchers propose sparse attention to convert some dense-dense matrix multiplication (DDMM) operations into sampled dense-dense (SDDMM) and sparse-dense (SpMM) operations.
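The DDMM-to-SDDMM/SpMM conversion can be sketched as follows. This is a dense NumPy stand-in with an assumed causal mask; real tensor-core kernels would operate on compressed sparse formats.

```python
import numpy as np

rng = np.random.default_rng(1)
L, d = 8, 4                                  # sequence length, head dim (assumed)
Q = rng.standard_normal((L, d))
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))
mask = np.tril(np.ones((L, L), dtype=bool))  # assumed causal sparsity pattern

# SDDMM: evaluate the dense-dense product Q @ K.T only at masked positions.
rows, cols = np.nonzero(mask)
vals = np.einsum("ij,ij->i", Q[rows], K[cols]) / np.sqrt(d)

# Row-wise softmax over the sparse scores only (-inf elsewhere).
S = np.full((L, L), -np.inf)
S[rows, cols] = vals
P = np.exp(S - S.max(axis=1, keepdims=True))
P[~mask] = 0.0
P /= P.sum(axis=1, keepdims=True)

# SpMM: multiply the (sparse) probability matrix by the dense V.
out = P @ V                                  # dense stand-in for the SpMM kernel
print(out.shape)                             # (L, d)
```

The point of the conversion is that attention never materializes dense score work outside the mask: the score computation becomes an SDDMM and the value aggregation becomes an SpMM, both of which a sparse tensor core can accelerate.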
Dec 19, 2023 · TSTC points out that unstructured sparsity, which enables algorithmic flexibility, can be maintained at the coarse-grained level, while hardware ...
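One way to read that claim: nonzeros keep arbitrary (unstructured) positions inside a tile, while skip/compute decisions are made per tile so the hardware only schedules coarse-grained units. The tile size and bitmap-plus-values encoding below are illustrative assumptions, not TSTC's actual format.

```python
import numpy as np

T = 4  # tile edge (assumed)

def tile_encode(W, T):
    """Split W into TxT tiles; store a per-tile occupancy bitmap and its nonzeros."""
    m, n = W.shape
    tiles = {}
    for i in range(0, m, T):
        for j in range(0, n, T):
            blk = W[i:i + T, j:j + T]
            if np.any(blk):                           # coarse level: skip empty tiles
                tiles[(i, j)] = (blk != 0, blk[blk != 0])  # bitmap + packed values
    return tiles

rng = np.random.default_rng(2)
W = rng.standard_normal((16, 16)) * (rng.random((16, 16)) < 0.1)
tiles = tile_encode(W, T)
print(f"stored tiles: {len(tiles)} / {(16 // T) ** 2}")
```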