research-article

Compressed linear algebra for large-scale machine learning

Published: 01 August 2016

Abstract

Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Hence, we initiate work on compressed linear algebra (CLA), in which lightweight database compression techniques are applied to matrices and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show that CLA achieves in-memory operations performance close to the uncompressed case and good compression ratios that allow us to fit larger datasets into available memory. We thereby obtain significant end-to-end performance improvements up to 26x or reduced memory requirements.
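The idea of running matrix-vector multiplication directly on a compressed representation can be illustrated with a minimal sketch. It assumes a simple per-column encoding in which each distinct nonzero value maps to the list of row indices where it occurs. This encoding and the helper names (`compress_column`, `compress_matrix`, `matvec`) are illustrative assumptions, not the paper's actual compression schemes or implementation.

```python
from collections import defaultdict


def compress_column(values):
    """Map each distinct nonzero value to the row indices that hold it.

    Hypothetical encoding, chosen for illustration only.
    """
    offsets = defaultdict(list)
    for row, v in enumerate(values):
        if v != 0:
            offsets[v].append(row)
    return dict(offsets)


def compress_matrix(rows):
    """Compress a dense row-major matrix column by column."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    columns = [compress_column([r[j] for r in rows]) for j in range(n_cols)]
    return n_rows, columns


def matvec(compressed, x):
    """Compute y = X @ x without decompressing X."""
    n_rows, columns = compressed
    y = [0.0] * n_rows
    for j, col in enumerate(columns):
        for value, row_ids in col.items():
            # One multiplication per distinct value in the column,
            # then the product is added to every row that holds it.
            contribution = value * x[j]
            for i in row_ids:
                y[i] += contribution
    return y


X = [[1.0, 0.0, 2.0],
     [1.0, 3.0, 2.0],
     [0.0, 3.0, 2.0],
     [1.0, 0.0, 5.0]]
v = [2.0, 1.0, 0.5]

compressed = compress_matrix(X)
y = matvec(compressed, v)
dense = [sum(a * b for a, b in zip(row, v)) for row in X]
print(y)
```

In this sketch, columns with few distinct values need fewer multiplications than the dense product, and zeros are never stored or touched. How much space such an encoding saves depends on the number of distinct values per column.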


Published In

Proceedings of the VLDB Endowment  Volume 9, Issue 12
August 2016
345 pages
ISSN:2150-8097

Publisher

VLDB Endowment
