Characterization of Gradient Dominance and Regularity Conditions for Neural Networks

Zhou, Yi; Liang, Yingbin

arXiv:1710.06910 (stat)

[Submitted on 18 Oct 2017 (v1), last revised 20 Oct 2017 (this version, v2)]

Abstract: The past decade has witnessed the successful application of deep learning to many challenging problems in machine learning and artificial intelligence. However, the loss functions of deep neural networks (especially nonlinear networks) are still far from being well understood from a theoretical perspective. In this paper, we enrich the current understanding of the landscape of the square loss functions for three types of neural networks. Specifically, when the parameter matrices are square, we provide an explicit characterization of the global minimizers for linear networks, linear residual networks, and nonlinear networks with one hidden layer. Then, we establish two quadratic types of landscape properties for the square loss of these neural networks, i.e., the gradient dominance condition within the neighborhood of their full rank global minimizers, and the regularity condition along certain directions and within the neighborhood of their global minimizers. These two landscape properties are desirable for the optimization around the global minimizers of the loss function for these neural networks.
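
For orientation, the two landscape properties named in the abstract are usually written as the inequalities below. The abstract itself gives no formulas, so the symbols used here (a loss f, its global minimum value f^*, a global minimizer x^*, and positive constants \lambda, \alpha, \beta) and the exact forms are assumptions taken from the standard definitions in the optimization literature, not the paper's own statements.

```latex
% Assumed standard forms; all symbols here are illustrative, not from the abstract.
% Gradient dominance in a neighborhood of a global minimizer x^*:
\[
  f(x) - f^{\ast} \;\le\; \lambda \,\lVert \nabla f(x) \rVert^{2},
  \qquad \lambda > 0 .
\]
% Regularity condition in a neighborhood of x^*:
\[
  \langle \nabla f(x),\, x - x^{\ast} \rangle
  \;\ge\; \alpha \,\lVert \nabla f(x) \rVert^{2}
        + \beta \,\lVert x - x^{\ast} \rVert^{2},
  \qquad \alpha, \beta > 0 .
\]
```

Both right-hand sides are quadratic, which matches the abstract's phrase "quadratic types of landscape properties". In their standard forms, conditions like these are typically used to argue that gradient descent started close enough to a global minimizer converges at a linear rate.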

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:1710.06910 [stat.ML]
	(or arXiv:1710.06910v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1710.06910

Submission history

From: Yi Zhou
[v1] Wed, 18 Oct 2017 19:53:57 UTC (37 KB)
[v2] Fri, 20 Oct 2017 14:49:30 UTC (37 KB)
