DOI:10.1145/3654823.3654895
research-article

Parameterizations for Gradient-based Markov Chain Monte Carlo on the Stiefel Manifold: A Comparative Study

Published: 29 May 2024

Abstract

Orthogonal matrices play an important role in probability and statistics, particularly in high-dimensional statistical models. Parameterizing these models with orthogonal matrices facilitates dimension reduction and parameter identification. However, establishing the theoretical validity of statistical inference in these models from a frequentist perspective is challenging, which leads to a preference for Bayesian approaches because they offer consistent uncertainty quantification. Markov chain Monte Carlo methods are commonly used to approximate posterior distributions numerically, but sampling on the Stiefel manifold, which comprises orthogonal matrices, poses significant difficulties. Various strategies have been proposed for this purpose, among which gradient-based Markov chain Monte Carlo with parameterizations is the most efficient. However, a comprehensive comparison of these parameterizations is lacking in the existing literature. This study addresses this gap by evaluating the numerical efficiency of four alternative parameterizations of orthogonal matrices under equivalent conditions. The evaluation was conducted on four problems. The results suggest that the polar expansion parameterization is the most efficient, particularly for high-dimensional and complex problems. However, all parameterizations exhibit limitations in very high-dimensional or difficult tasks, which emphasizes the need for further advances in sampling methods for orthogonal matrices.
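
As an illustration of the polar expansion idea named in the abstract, the following is a minimal sketch and not the study's implementation. It assumes the common construction in which an unconstrained n-by-p matrix X of full column rank is mapped to the orthogonal factor Q = X (X^T X)^{-1/2} of its polar decomposition, so that a gradient-based sampler can move in the unconstrained space. The function name `polar_orthogonal_factor`, the dimensions, and the random seed are hypothetical choices made for this example.

```python
import numpy as np


def polar_orthogonal_factor(x):
    """Return Q from the polar decomposition X = Q S (hypothetical helper).

    Assumes x is an n-by-p array with n >= p and full column rank.
    """
    # Thin SVD: X = U diag(s) V^T, with U of shape (n, p) and V^T of shape (p, p).
    u, _, vt = np.linalg.svd(x, full_matrices=False)
    # U V^T equals X (X^T X)^{-1/2} when X has full column rank.
    return u @ vt


# Illustrative input: an unconstrained 5-by-2 Gaussian matrix (assumed sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 2))

q = polar_orthogonal_factor(x)
s = q.T @ x  # symmetric positive definite factor S, so that X = Q S

# Q has orthonormal columns, i.e. it lies on the Stiefel manifold V(5, 2).
print(np.allclose(q.T @ q, np.eye(2)))
```

The SVD route is used here only because it avoids forming an explicit inverse matrix square root; other ways of computing the same factor would serve equally well in a sketch like this.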

Published In

CACML '24: Proceedings of the 2024 3rd Asia Conference on Algorithms, Computing and Machine Learning
March 2024
478 pages
ISBN:9798400716416
DOI:10.1145/3654823
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Markov chain Monte Carlo
  2. Stiefel manifold
  3. orthogonal matrix
  4. parametrization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CACML 2024

Acceptance Rates

Overall Acceptance Rate 93 of 241 submissions, 39%
