DOI: 10.5555/3042817.3043084
Article

Maxout networks

Published: 16 June 2013

Abstract

We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model averaging technique. We empirically verify that the model successfully accomplishes both of these tasks. We use maxout and dropout to demonstrate state-of-the-art classification performance on four benchmark datasets: MNIST, CIFAR-10, CIFAR-100, and SVHN.
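As the abstract notes, a maxout unit outputs the maximum over a group of k affine feature detectors of its input. The NumPy sketch below illustrates one maxout hidden layer combined with inverted dropout at training time; the array shapes, the choice of k, the `maxout_layer` and `dropout` helpers, and the inverted-dropout variant are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
import numpy as np

def maxout_layer(x, W, b):
    """One maxout hidden layer (sketch).

    x : (batch, d)   input features
    W : (d, m, k)    weights for m maxout units, each with k linear pieces
    b : (m, k)       biases

    Each unit's activation is the max of its k affine responses:
        h_i(x) = max_j (x @ W[:, i, j] + b[i, j])
    """
    z = np.einsum('nd,dmk->nmk', x, W) + b  # (batch, m, k) pre-activations
    return z.max(axis=-1)                   # (batch, m) maxout activations

def dropout(h, p=0.5, rng=None):
    """Inverted dropout (assumed variant): zero units with probability p during
    training and rescale the survivors, so inference needs no extra scaling."""
    rng = rng or np.random.default_rng(0)
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

# Toy forward pass: 4 examples, 10 features, 3 maxout units with k = 5 pieces each.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 10))
W = 0.01 * rng.standard_normal((10, 3, 5))
b = np.zeros((3, 5))
h = dropout(maxout_layer(x, W, b), p=0.5, rng=rng)
print(h.shape)  # (4, 3)
```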


Published In

ICML'13: Proceedings of the 30th International Conference on Machine Learning - Volume 28
June 2013
2534 pages

Publisher

JMLR.org
