Tractable structured natural-gradient descent using local parameterizations

Wu Lin, Frank Nielsen, Khan Mohammad Emtiyaz, Mark Schmidt
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6680-6691, 2021.

Abstract

Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using local-parameter coordinates to obtain a flexible and efficient NGD method that works well for a wide variety of structured parameterizations. We show four applications where our method (1) generalizes the exponential natural evolutionary strategy, (2) recovers existing Newton-like algorithms, (3) yields new structured second-order algorithms, and (4) gives new algorithms to learn covariances of Gaussian and Wishart-based distributions. We show results on a range of problems from deep learning, variational inference, and evolution strategies. Our work opens a new direction for scalable structured geometric methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-lin21e,
  title = {Tractable structured natural-gradient descent using local parameterizations},
  author = {Lin, Wu and Nielsen, Frank and Emtiyaz, Khan Mohammad and Schmidt, Mark},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {6680--6691},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/lin21e/lin21e.pdf},
  url = {https://proceedings.mlr.press/v139/lin21e.html},
  abstract = {Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using \emph{local-parameter coordinates} to obtain a flexible and efficient NGD method that works well for a wide variety of structured parameterizations. We show four applications where our method (1) generalizes the exponential natural evolutionary strategy, (2) recovers existing Newton-like algorithms, (3) yields new structured second-order algorithms, and (4) gives new algorithms to learn covariances of Gaussian and Wishart-based distributions. We show results on a range of problems from deep learning, variational inference, and evolution strategies. Our work opens a new direction for scalable structured geometric methods.}
}
Endnote
%0 Conference Paper
%T Tractable structured natural-gradient descent using local parameterizations
%A Wu Lin
%A Frank Nielsen
%A Khan Mohammad Emtiyaz
%A Mark Schmidt
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-lin21e
%I PMLR
%P 6680--6691
%U https://proceedings.mlr.press/v139/lin21e.html
%V 139
%X Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using local-parameter coordinates to obtain a flexible and efficient NGD method that works well for a wide variety of structured parameterizations. We show four applications where our method (1) generalizes the exponential natural evolutionary strategy, (2) recovers existing Newton-like algorithms, (3) yields new structured second-order algorithms, and (4) gives new algorithms to learn covariances of Gaussian and Wishart-based distributions. We show results on a range of problems from deep learning, variational inference, and evolution strategies. Our work opens a new direction for scalable structured geometric methods.
APA
Lin, W., Nielsen, F., Emtiyaz, K. M. & Schmidt, M. (2021). Tractable structured natural-gradient descent using local parameterizations. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:6680-6691. Available from https://proceedings.mlr.press/v139/lin21e.html.

Related Material