10.5555/3045390.3045671

Gromov-Wasserstein averaging of kernel and distance matrices

Published: 19 June 2016

Abstract

This paper presents a new technique for computing the barycenter of a set of distance or kernel matrices. These matrices, which define the interrelationships between points sampled from individual domains, are not required to have the same size or to be in row-by-row correspondence. We compare these matrices using the softassign criterion, which measures the minimum distortion induced by a probabilistic map from the rows of one similarity matrix to the rows of another; this criterion amounts to a regularized version of the Gromov-Wasserstein (GW) distance between metric-measure spaces. The barycenter is then defined as a Fréchet mean of the input matrices with respect to this criterion, minimizing a weighted sum of softassign values. We provide a fast iterative algorithm for the resulting nonconvex optimization problem, built upon state-of-the-art tools for regularized optimal transportation. We demonstrate its application to the computation of shape barycenters and to the prediction of energy levels from molecular configurations in quantum chemistry.
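The alternating scheme described in the abstract can be illustrated with a minimal NumPy sketch. Several choices here are assumptions that the abstract does not state: a squared loss between matrix entries, uniform weights on the barycenter's points, a random initialization, fixed iteration counts, and the function names `sinkhorn`, `gw_coupling`, and `gw_barycenter`, which are hypothetical. Under the squared loss, the barycenter update with the couplings held fixed has a closed form obtained by setting the gradient of the weighted objective to zero; this is a sketch of that idea, not the authors' implementation.

```python
import numpy as np


def sinkhorn(cost, p, q, eps, n_iter=200):
    """Entropic OT plan between histograms p and q for a given cost matrix."""
    # Shifting the cost by a constant only rescales K, which u and v absorb.
    K = np.exp(-(cost - cost.min()) / eps)
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)
        u = p / (K @ v)
    return u[:, None] * K * v[None, :]


def gw_coupling(C1, C2, p, q, eps, n_outer=30):
    """Softassign-style iterations for entropic GW (squared loss assumed)."""
    T = np.outer(p, q)  # start from the independent coupling
    # Part of sum_{k,l} (C1[i,k] - C2[j,l])^2 T[k,l] that does not depend on T
    # once the marginals of T are fixed to p and q.
    const = ((C1 ** 2) @ p)[:, None] + ((C2 ** 2) @ q)[None, :]
    for _ in range(n_outer):
        cost = const - 2.0 * C1 @ T @ C2.T
        T = sinkhorn(cost, p, q, eps)
    return T


def gw_barycenter(Cs, ps, lambdas, N, eps, n_iter=10, seed=0):
    """Alternate between coupling updates and a closed-form matrix update."""
    rng = np.random.default_rng(seed)
    p = np.full(N, 1.0 / N)  # assumption: uniform weights on barycenter points
    # Assumption: initialize with distances between random points in the plane.
    X = rng.random((N, 2))
    C = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    for _ in range(n_iter):
        # Each coupling has shape (N, n_s): barycenter points vs. input points.
        Ts = [gw_coupling(C, Cs_s, p, p_s, eps) for Cs_s, p_s in zip(Cs, ps)]
        # Minimizer over C for fixed couplings, assuming the lambdas sum to 1.
        C = sum(lam * T @ Cs_s @ T.T
                for lam, T, Cs_s in zip(lambdas, Ts, Cs)) / np.outer(p, p)
    return C
```

The inputs `Cs` may have different sizes, as the abstract requires; only the barycenter size `N` is fixed in advance. The regularization strength `eps` has to be chosen relative to the scale of the matrix entries, since very small values can make the exponential kernel underflow.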

Published In

ICML'16: Proceedings of the 33rd International Conference on Machine Learning - Volume 48
June 2016
3077 pages

Publisher

JMLR.org
