Learning with optimal interpolation norms

Published: 01 June 2019

Abstract

We analyze a class of norms defined via an optimal interpolation problem involving the composition of norms and a linear operator. This construction, known as infimal postcomposition in convex analysis, is shown to encompass various norms that have been used as regularizers in machine learning, signal processing, and statistics, including the latent group lasso, the overlapping group lasso, and certain norms used for learning tensors. We establish basic properties of this class of norms and provide their dual norms. The extension to more general classes of convex functions is also discussed. A stochastic block-coordinate version of the Douglas-Rachford algorithm is devised to solve minimization problems involving these regularizers. A prominent feature of the algorithm is that its iterates converge to a solution even when the losses are nonsmooth and the blocks are updated randomly. Finally, we present numerical experiments on problems involving the latent group lasso penalty.
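
To make the construction concrete (in our own notation, a sketch rather than the paper's exact formulation): given norms ‖·‖₍ᵢ₎ on spaces Hᵢ, a monotone norm F on ℝᵐ, and a linear operator B from the direct sum H₁ ⊕ ⋯ ⊕ Hₘ to H, the induced optimal interpolation norm of a point x is

    |||x||| = inf { F(‖v₁‖₍₁₎, …, ‖vₘ‖₍ₘ₎) : v = (v₁, …, vₘ), Bv = x }.

The latent group lasso arises by letting each vᵢ be supported on a prescribed (possibly overlapping) group of coordinates, B sum the components, and F be the ℓ₁ norm, so that ‖x‖ = inf { Σ_g ‖v_g‖₂ : Σ_g v_g = x, supp(v_g) ⊆ g }.

The sketch below evaluates this latent group lasso instance by solving the underlying decomposition problem as a small convex program; the function name and the use of cvxpy are our illustrative choices, not code from the paper.

    import numpy as np
    import cvxpy as cp

    def latent_group_lasso_norm(x, groups):
        # One latent vector v_g per group; each must vanish outside its
        # group, and together they must decompose x.
        d = x.size
        V = [cp.Variable(d) for _ in groups]
        constraints = [sum(V) == x]
        for v, g in zip(V, groups):
            outside = [i for i in range(d) if i not in g]
            if outside:
                constraints.append(v[outside] == 0)
        # The norm value is the minimal sum of Euclidean norms of the components.
        problem = cp.Problem(cp.Minimize(sum(cp.norm(v, 2) for v in V)),
                             constraints)
        problem.solve()
        return problem.value

    x = np.array([1.0, -2.0, 0.5, 0.0])
    groups = [(0, 1), (1, 2), (2, 3)]  # overlapping groups covering all coordinates
    print(latent_group_lasso_norm(x, groups))

For reference, the deterministic two-operator Douglas-Rachford iteration for minimizing f + g, of which the paper's method is a stochastic block-coordinate extension, takes the standard form

    yₙ = prox_{γg}(xₙ),   zₙ = prox_{γf}(2yₙ − xₙ),   xₙ₊₁ = xₙ + λₙ (zₙ − yₙ),

with step size γ > 0 and relaxation parameters λₙ.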


Published In

Numerical Algorithms, Volume 81, Issue 2 (June 2019)

Publisher: Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Block-coordinate proximal algorithm
2. Douglas-Rachford splitting
3. Infimal postcomposition
4. Latent group lasso
5. Machine learning
6. Optimal interpolation norm

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 22 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media