Gromov-Wasserstein averaging of kernel and distance matrices

Authors:

Gabriel Peyré,

Justin Solomon

ICML'16: Proceedings of the 33rd International Conference on Machine Learning - Volume 48

Pages 2664 - 2672

Published: 19 June 2016

Abstract

This paper presents a new technique for computing the barycenter of a set of distance or kernel matrices. These matrices, which define the interrelationships between points sampled from individual domains, are not required to have the same size or to be in row-by-row correspondence. We compare these matrices using the softassign criterion, which measures the minimum distortion induced by a probabilistic map from the rows of one similarity matrix to the rows of another; this criterion amounts to a regularized version of the Gromov-Wasserstein (GW) distance between metric-measure spaces. The barycenter is then defined as a Fréchet mean of the input matrices with respect to this criterion, minimizing a weighted sum of softassign values. We provide a fast iterative algorithm for the resulting nonconvex optimization problem, built upon state-of-the-art tools for regularized optimal transportation. We demonstrate its application to the computation of shape barycenters and to the prediction of energy levels from molecular configurations in quantum chemistry.
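
To make the criterion in the abstract concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the squared loss between matrix entries, uniform or user-supplied weights on the points, and a fixed regularization strength eps. All function names (sinkhorn, gw_cost_matrix, entropic_gw, gw_barycenter, pairwise_distances) are hypothetical and chosen for illustration. The coupling is found by repeatedly solving an entropy-regularized transport problem whose cost depends on the current coupling, and the barycenter alternates between updating the couplings and averaging the input matrices through them.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distance matrix of the rows of Z, scaled to a maximum of 1."""
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    return D / D.max()

def sinkhorn(cost, p, q, eps, n_iter=100):
    """Entropy-regularized transport plan between weights p and q."""
    K = np.exp(-(cost - cost.min()) / eps)  # shift the cost for stability
    b = np.ones_like(q)
    for _ in range(n_iter):
        a = p / (K @ b)      # match row marginals
        b = q / (K.T @ a)    # match column marginals
    return a[:, None] * K * b[None, :]

def gw_cost_matrix(C1, C2, T, p, q):
    """Entry (i, j) is sum_{k,l} (C1[i,k] - C2[j,l])**2 * T[k,l],
    assuming T has marginals p (rows) and q (columns)."""
    const = ((C1 ** 2) @ p)[:, None] + ((C2 ** 2) @ q)[None, :]
    return const - 2.0 * C1 @ T @ C2.T

def entropic_gw(C1, C2, p, q, eps=0.05, n_outer=20):
    """Coupling between two similarity matrices of possibly different sizes."""
    T = np.outer(p, q)  # start from the independent coupling
    for _ in range(n_outer):
        T = sinkhorn(gw_cost_matrix(C1, C2, T, p, q), p, q, eps)
    return T

def gw_barycenter(Cs, ps, p, lambdas, eps=0.05, n_iter=5, seed=0):
    """Alternate coupling updates and the closed-form squared-loss average.
    The problem is nonconvex, so the result depends on the random start."""
    rng = np.random.default_rng(seed)
    N = p.size
    C = rng.random((N, N))
    C = (C + C.T) / 2.0
    for _ in range(n_iter):
        Ts = [entropic_gw(Cs_s, C, p_s, p, eps) for Cs_s, p_s in zip(Cs, ps)]
        C = sum(lam * T.T @ Cs_s @ T
                for lam, T, Cs_s in zip(lambdas, Ts, Cs)) / np.outer(p, p)
    return C

# Usage: two point clouds of different sizes, averaged on 4 points.
rng = np.random.default_rng(0)
C1 = pairwise_distances(rng.random((6, 2)))
C2 = pairwise_distances(rng.random((5, 2)))
p1, p2, p = np.full(6, 1 / 6), np.full(5, 1 / 5), np.full(4, 1 / 4)
T = entropic_gw(C1, C2, p1, p2)
C_bar = gw_barycenter([C1, C2], [p1, p2], p, lambdas=[0.5, 0.5])
```

Note that the two input matrices have different sizes and no row correspondence is supplied; the coupling T plays the role of the probabilistic map mentioned in the abstract. The choices of eps, the iteration counts, and the random initialization are arbitrary here and would need tuning in practice.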

Published In

ICML'16: Proceedings of the 33rd International Conference on Machine Learning - Volume 48

June 2016

3077 pages

Publisher

JMLR.org
