Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

Zhang, Ziming; Brand, Matthew

arXiv:1711.07354 (stat)

[Submitted on 20 Nov 2017]

Abstract: By lifting the ReLU function into a higher-dimensional space, we develop a smooth multi-convex formulation for training feed-forward deep neural networks (DNNs). This allows us to develop a block coordinate descent (BCD) training algorithm consisting of a sequence of numerically well-behaved convex optimizations. Using ideas from proximal point methods in convex analysis, we prove that this BCD algorithm will converge globally to a stationary point with R-linear convergence rate of order one. In experiments with the MNIST database, DNNs trained with this BCD algorithm consistently yielded better test-set error rates than identical DNN architectures trained via all the stochastic gradient descent (SGD) variants in the Caffe toolbox.
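
The abstract does not spell out the lifted formulation. One common reading of "lifting the ReLU" is the identity max(x, 0) = argmin over u >= 0 of (u - x)^2: the activation becomes an auxiliary nonnegative variable tied to its pre-activation by a convex quadratic. The sketch below only checks that identity numerically on scalars. The function names and the grid search are illustrative assumptions, not the authors' objective or algorithm.

```python
def relu(x: float) -> float:
    """Standard ReLU activation."""
    return max(x, 0.0)


def lifted_objective(u: float, x: float) -> float:
    """Squared distance between the auxiliary variable u and the pre-activation x."""
    return (u - x) ** 2


def argmin_nonneg(x: float, grid_size: int = 2001, hi: float = 5.0) -> float:
    """Brute-force minimizer of lifted_objective over a uniform grid on [0, hi].

    Hypothetical helper for illustration; a real solver would use the
    closed form max(x, 0) instead of a grid.
    """
    candidates = [hi * i / (grid_size - 1) for i in range(grid_size)]
    return min(candidates, key=lambda u: lifted_objective(u, x))


if __name__ == "__main__":
    step = 5.0 / 2000  # grid resolution used as the comparison tolerance
    for x in (-2.0, -0.5, 0.0, 0.75, 3.0):
        u_star = argmin_nonneg(x)
        # The constrained minimizer should match ReLU up to grid resolution.
        assert abs(u_star - relu(x)) <= step
        print(f"x={x:+.2f}  argmin={u_star:.4f}  relu={relu(x):.4f}")
```

Because the constraint u >= 0 and the quadratic are both convex in u, fixing every other variable leaves a convex subproblem, which is the kind of structure a block coordinate descent scheme can exploit.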

Comments:	NIPS 2017
Subjects:	Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:1711.07354 [stat.ML]
	(or arXiv:1711.07354v1 [stat.ML] for this version)
 	https://doi.org/10.48550/arXiv.1711.07354

Submission history

From: Ziming Zhang
[v1] Mon, 20 Nov 2017 15:04:45 UTC (91 KB)
