The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale

Authors:

Ichitaro Yamazaki

SIAM Review, Volume 60, Issue 4

Pages 808 - 865

https://doi.org/10.1137/17M1117732

Published: 01 January 2018

Abstract

The computation of the singular value decomposition, or SVD, has a long history with many improvements over the years, both in its implementations and algorithmically. Here, we survey the evolution of SVD algorithms for dense matrices, discussing the motivation and performance impacts of changes. There are two main branches of dense SVD methods: bidiagonalization and Jacobi. Bidiagonalization methods started with the implementation by Golub and Reinsch in Algol60, which was subsequently ported to Fortran in the EISPACK library, and was later more efficiently implemented in the LINPACK library, targeting contemporary vector machines. To address cache-based memory hierarchies, the SVD algorithm was reformulated to use Level 3 BLAS in the LAPACK library. To address new architectures, ScaLAPACK was introduced to take advantage of distributed computing, and MAGMA was developed for accelerators such as GPUs. Algorithmically, the divide and conquer and MRRR algorithms were developed to reduce the number of operations. Still, these methods remained memory bound, so two-stage algorithms were developed to reduce memory operations and increase the computational intensity, with efficient implementations in PLASMA, DPLASMA, and MAGMA. Jacobi methods started with the two-sided method of Kogbetliantz and the one-sided method of Hestenes. They have likewise had many developments, including parallel and block versions and preconditioning to improve convergence. In this paper, we investigate the impact of these changes by testing various historical and current implementations on a common, modern multicore machine and a distributed computing platform. We show that algorithmic and implementation improvements have increased the speed of the SVD by several orders of magnitude, while using up to 40 times less energy.

