DOI: 10.5555/3571885.3571995
research-article

Scalable linear time dense direct solver for 3-D problems without trailing sub-matrix dependencies

Published: 18 November 2022

Abstract

Factorizations of large dense matrices are ubiquitous in engineering and data science applications, e.g., in preconditioners for iterative boundary integral solvers, frontal matrices in sparse multifrontal solvers, and determinant computations for covariance matrices. HSS and H2-matrices are hierarchical low-rank matrix formats that can reduce the complexity of factorizing such dense matrices from O(N^3) to O(N). For HSS matrices, it is possible to remove the dependency on the trailing matrices during Cholesky/LU factorization, which results in a highly parallel algorithm. However, the weak admissibility of HSS causes the rank of off-diagonal blocks to grow for 3-D problems, so the method is no longer O(N). On the other hand, the strong admissibility of H2-matrices allows them to handle 3-D problems in O(N), but introduces a dependency on the trailing matrices. In the present work, we pre-compute the fill-ins and integrate them into the shared basis, which allows us to remove the dependency on trailing matrices even for H2-matrices. Comparisons with LORAPO, a block low-rank factorization code, showed a maximum speedup of 4,700x for a 3-D problem with complex geometry.
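
The contrast between weak and strong admissibility can be probed numerically by comparing the numerical rank of an off-diagonal block for two clusters that touch against the rank for two clusters that are separated. The following is a minimal sketch, not the paper's method: the kernel 1/(1 + r), the 6 x 6 x 6 grids, the tolerance, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def kernel_block(x, y):
    """Dense block K[i, j] = 1 / (1 + |x_i - y_j|) for two 3-D point sets.

    The kernel is an illustrative assumption, not taken from the paper.
    """
    d = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return 1.0 / (1.0 + d)

def numerical_rank(a, tol=1e-8):
    """Count singular values larger than tol times the largest one."""
    s = np.linalg.svd(a, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

def unit_cube_grid(n_per_side, offset):
    """Regular grid of n_per_side**3 points in a unit cube shifted along x."""
    t = (np.arange(n_per_side) + 0.5) / n_per_side
    g = np.stack(np.meshgrid(t, t, t, indexing="ij"), axis=-1).reshape(-1, 3)
    return g + np.array([offset, 0.0, 0.0])

source = unit_cube_grid(6, 0.0)     # 216 points in [0, 1]^3
adjacent = unit_cube_grid(6, 1.0)   # cube sharing a face with the source cube
separated = unit_cube_grid(6, 2.0)  # cube one box width away from the source

# Weak admissibility compresses every off-diagonal block, including the
# block between touching clusters; strong admissibility compresses only
# blocks between separated clusters.
rank_adjacent = numerical_rank(kernel_block(source, adjacent))
rank_separated = numerical_rank(kernel_block(source, separated))

print("numerical rank, adjacent clusters: ", rank_adjacent)
print("numerical rank, separated clusters:", rank_separated)
```

Repeating the experiment with a larger `n_per_side` gives a rough feel for how the rank of the adjacent block behaves as the clusters are refined, which is the effect the abstract attributes to weak admissibility in 3-D.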

Supplementary Material

MP4 File (SC22_Presentation_Ma_Qianxiang.mp4)
Presentation at SC '22


Published In

SC '22: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
November 2022
1277 pages
ISBN: 9784665454445

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. H2-matrix
  2. LU
  3. ULV
  4. dense direct solver

Qualifiers

  • Research-article

Conference

SC '22

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
