
FPGA-Based Sparse Matrix Multiplication Accelerators: From State-of-the-Art to Future Opportunities

Published: 18 November 2024

Abstract

Sparse matrix multiplication (SpMM) plays a critical role in high-performance computing applications such as deep learning, image processing, and physical simulation. Field-Programmable Gate Arrays (FPGAs), with their configurable hardware resources, can be tailored to accelerate SpMM. Although there has been considerable research on deploying sparse matrix multipliers across various FPGA platforms, their design still presents numerous challenges, making it valuable to summarize and organize the existing work as a reference for further research. This article first introduces the computational methods of SpMM and categorizes the challenges of FPGA deployment. We then introduce and analyze a variety of state-of-the-art FPGA-based accelerators tailored for SpMM, and compare them on metrics including compression ratio, throughput, and resource utilization. Finally, we propose potential research directions and open challenges for the further study of FPGA-based SpMM accelerators.
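
For readers new to the area, the following is a minimal Python sketch (an illustration, not code from the article) of the row-wise, Gustavson-style SpMM over the standard Compressed Sparse Row (CSR) format that many of the surveyed accelerators implement in hardware: each nonzero of the sparse operand scales one row of the dense operand, so only nonzeros are stored and multiplied.

```python
import numpy as np

def spmm_csr(values, col_idx, row_ptr, B):
    """Row-wise (Gustavson) SpMM: C = A @ B with A in CSR format."""
    n_rows = len(row_ptr) - 1
    C = np.zeros((n_rows, B.shape[1]))
    for i in range(n_rows):
        # Each nonzero A[i, col_idx[k]] scales row col_idx[k] of B
        # and accumulates into output row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            C[i, :] += values[k] * B[col_idx[k], :]
    return C

# CSR encoding of A = [[5, 0, 0],
#                      [0, 0, 7],
#                      [0, 3, 0]]
values  = np.array([5.0, 7.0, 3.0])  # nonzeros in row-major order
col_idx = np.array([0, 2, 1])        # column index of each nonzero
row_ptr = np.array([0, 1, 2, 3])     # offset of each row's first nonzero

B = np.arange(6, dtype=float).reshape(3, 2)
A_dense = np.array([[5.0, 0, 0], [0, 0, 7.0], [0, 3.0, 0]])
assert np.allclose(spmm_csr(values, col_idx, row_ptr, B), A_dense @ B)
```

Hardware designs differ mainly in how they parallelize these two loops and how they buffer the irregular, index-dependent accesses to the dense operand.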

Published In

ACM Transactions on Reconfigurable Technology and Systems, Volume 17, Issue 4
December 2024
303 pages
EISSN: 1936-7414
DOI: 10.1145/3613637
Editor: Deming Chen

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 18 November 2024
Online AM: 28 August 2024
Accepted: 13 July 2024
Revised: 11 June 2024
Received: 21 January 2024
Published in TRETS Volume 17, Issue 4

Author Tags

1. Field-Programmable Gate Array (FPGA)
2. Sparse Matrix Multiplication
3. Compression Ratio
4. Accelerator

Qualifiers

• Research-article
