
Improving matrix-vector multiplication via lossless grammar-compressed matrices

Published: 01 June 2022

Abstract

Since Machine Learning (ML) techniques nowadays generate huge data collections, efficiently engineering their storage and operations is of paramount importance. In this article we propose a new lossless compression scheme for real-valued matrices that achieves efficient performance in terms of compression ratio and time for linear-algebra operations. Experiments show that, as a compressor, our tool is clearly superior to gzip and usually within 20% of xz in terms of compression ratio. In addition, our compressed format supports matrix-vector multiplication in time and space proportional to the size of the compressed representation, unlike gzip and xz, which require full decompression of the compressed matrix. To our knowledge, our lossless compressor is the first to achieve time and space complexities matching the theoretical limit expressed by the k-th order statistical entropy of the input.
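The key idea behind multiplying in compressed space can be illustrated with a minimal sketch. This is not the paper's actual implementation: it assumes a simplified representation in which each matrix row is a sequence of symbols, terminals are absolute (column, value) pairs, and nonterminals are RePair-style rules expanding to two symbols. Because a nonterminal always expands to the same (column, value) pairs, its dot-product contribution against the input vector can be computed once and reused, so the work is proportional to the grammar size rather than the uncompressed matrix.

```python
def grammar_matvec(rows, rules, x):
    """Compute y = A @ x over a grammar-compressed matrix.

    rows:  list of symbol sequences, one per matrix row.
    rules: dict mapping a nonterminal to its (left, right) expansion.
    Terminals are (col, val) tuples; nonterminals are keys of `rules`.
    The matrix is never decompressed: each nonterminal's contribution
    to a dot product with x is memoized and shared across rows.
    """
    memo = {}  # nonterminal -> cached partial dot product with x

    def contrib(sym):
        if isinstance(sym, tuple):       # terminal: (column, value)
            col, val = sym
            return val * x[col]
        if sym not in memo:              # nonterminal: expand only once
            left, right = rules[sym]
            memo[sym] = contrib(left) + contrib(right)
        return memo[sym]

    return [sum(contrib(s) for s in row) for row in rows]


# Example: the 2x3 matrix [[1, 2, 0], [1, 2, 5]], where the repeated
# prefix (col 0 -> 1.0, col 1 -> 2.0) is factored into nonterminal 'R'.
rules = {'R': ((0, 1.0), (1, 2.0))}
rows = [['R'], ['R', (2, 5.0)]]
y = grammar_matvec(rows, rules, [1.0, 1.0, 1.0])  # -> [3.0, 8.0]
```

Here the shared nonterminal `R` is evaluated once and reused by both rows; on a highly repetitive matrix this sharing is what makes the running time proportional to the compressed size.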
To achieve further time/space reductions, we propose column-reordering algorithms hinging on a novel column-similarity score. Our experiments on various data sets of ML matrices show that our column reordering can yield a further reduction of up to 16% in the peak memory usage during matrix-vector multiplication.
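The reordering idea can be sketched as follows. The paper's column-similarity score is novel and not reproduced here; as a stand-in this sketch uses Jaccard similarity of the columns' nonzero row sets, and places columns greedily so that each column sits next to its most similar unplaced neighbor (a TSP-like nearest-neighbor heuristic). Grouping similar columns tends to create longer repeated substrings for the grammar compressor to exploit.

```python
def reorder_columns(matrix):
    """Return a column permutation that places similar columns adjacently.

    matrix: list of rows (lists of numbers).
    Similarity is Jaccard overlap of nonzero row patterns, an
    illustrative stand-in for the paper's column-similarity score.
    """
    ncols = len(matrix[0])
    # Nonzero pattern of each column, as a set of row indices.
    nz = [{r for r in range(len(matrix)) if matrix[r][c] != 0}
          for c in range(ncols)]

    def jaccard(a, b):
        union = nz[a] | nz[b]
        return len(nz[a] & nz[b]) / len(union) if union else 1.0

    # Greedy nearest-neighbor ordering, starting from column 0.
    order = [0]
    remaining = set(range(1, ncols))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda c: jaccard(last, c))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Columns 0 and 2 share the same nonzero pattern, so they end up adjacent:
reorder_columns([[1, 0, 1],
                 [1, 0, 1],
                 [0, 1, 0]])  # -> [0, 2, 1]
```

A full solution to this ordering problem is equivalent to a traveling-salesman instance, which is why heuristics of this kind are the practical choice.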
Finally, we compare our proposal against the state-of-the-art Compressed Linear Algebra (CLA) approach, showing that ours always runs at least twice as fast (in a multi-threaded setting) and achieves better compressed space occupancy and peak memory usage. This experimentally confirms the provably effective theoretical bounds we establish for our compressed-matrix approach.


    Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 10
June 2022, 319 pages
ISSN: 2150-8097

    Publisher

    VLDB Endowment
