
Improving matrix-vector multiplication via lossless grammar-compressed matrices

Published: 01 June 2022

Abstract

Since Machine Learning (ML) techniques nowadays generate huge data collections, efficiently engineering their storage and operations is of paramount importance. In this article we propose a new lossless compression scheme for real-valued matrices that achieves efficient performance in terms of compression ratio and time for linear-algebra operations. Experiments show that, as a compressor, our tool is clearly superior to gzip and usually within 20% of xz in terms of compression ratio. In addition, our compressed format supports matrix-vector multiplication in time and space proportional to the size of the compressed representation, unlike gzip and xz, which require full decompression of the compressed matrix. To our knowledge, our lossless compressor is the first to achieve time and space complexities matching the theoretical limit expressed by the k-th order statistical entropy of the input.
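The key idea behind multiplying in compressed space can be illustrated with a minimal sketch. This is not the paper's actual implementation: it assumes a simplified representation in which each matrix row is a sequence of symbols, terminals are absolute (column, value) pairs, and nonterminals are RePair-style rules expanding to two symbols. Because a nonterminal always expands to the same (column, value) pairs, its dot-product contribution against the input vector can be computed once and reused, so the work is proportional to the grammar size rather than the uncompressed matrix.

```python
def grammar_matvec(rows, rules, x):
    """Compute y = A @ x over a grammar-compressed matrix.

    rows:  list of symbol sequences, one per matrix row.
    rules: dict mapping a nonterminal to its (left, right) expansion.
    Terminals are (col, val) tuples; nonterminals are keys of `rules`.
    The matrix is never decompressed: each nonterminal's contribution
    to a dot product with x is memoized and shared across rows.
    """
    memo = {}  # nonterminal -> cached partial dot product with x

    def contrib(sym):
        if isinstance(sym, tuple):       # terminal: (column, value)
            col, val = sym
            return val * x[col]
        if sym not in memo:              # nonterminal: expand only once
            left, right = rules[sym]
            memo[sym] = contrib(left) + contrib(right)
        return memo[sym]

    return [sum(contrib(s) for s in row) for row in rows]


# Example: the 2x3 matrix [[1, 2, 0], [1, 2, 5]], where the repeated
# prefix (col 0 -> 1.0, col 1 -> 2.0) is factored into nonterminal 'R'.
rules = {'R': ((0, 1.0), (1, 2.0))}
rows = [['R'], ['R', (2, 5.0)]]
y = grammar_matvec(rows, rules, [1.0, 1.0, 1.0])  # -> [3.0, 8.0]
```

Here the shared nonterminal `R` is evaluated once and reused by both rows; on a highly repetitive matrix this sharing is what makes the running time proportional to the compressed size.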
To achieve further time/space reductions, we propose column-reordering algorithms hinging on a novel column-similarity score. Our experiments on various data sets of ML matrices show that our column reordering can yield a further reduction of up to 16% in the peak memory usage during matrix-vector multiplication.
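The reordering idea can be sketched as follows. The paper's column-similarity score is novel and not reproduced here; as a stand-in this sketch uses Jaccard similarity of the columns' nonzero row sets, and places columns greedily so that each column sits next to its most similar unplaced neighbor (a TSP-like nearest-neighbor heuristic). Grouping similar columns tends to create longer repeated substrings for the grammar compressor to exploit.

```python
def reorder_columns(matrix):
    """Return a column permutation that places similar columns adjacently.

    matrix: list of rows (lists of numbers).
    Similarity is Jaccard overlap of nonzero row patterns, an
    illustrative stand-in for the paper's column-similarity score.
    """
    ncols = len(matrix[0])
    # Nonzero pattern of each column, as a set of row indices.
    nz = [{r for r in range(len(matrix)) if matrix[r][c] != 0}
          for c in range(ncols)]

    def jaccard(a, b):
        union = nz[a] | nz[b]
        return len(nz[a] & nz[b]) / len(union) if union else 1.0

    # Greedy nearest-neighbor ordering, starting from column 0.
    order = [0]
    remaining = set(range(1, ncols))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda c: jaccard(last, c))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Columns 0 and 2 share the same nonzero pattern, so they end up adjacent:
reorder_columns([[1, 0, 1],
                 [1, 0, 1],
                 [0, 1, 0]])  # -> [0, 2, 1]
```

A full solution to this ordering problem is equivalent to a traveling-salesman instance, which is why heuristics of this kind are the practical choice.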
Finally, we compare our proposal against the state-of-the-art Compressed Linear Algebra (CLA) approach, showing that ours always runs at least twice as fast (in a multi-threaded setting) and achieves better compressed space occupancy and peak memory usage. This experimentally confirms the provably effective theoretical bounds we establish for our compressed-matrix approach.


    Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 10
June 2022, 319 pages
ISSN: 2150-8097

    Publisher

    VLDB Endowment
