DOI: 10.1145/3489517.3530678

HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance

Published: 23 August 2022

Abstract

With the growing demand for deploying neural network models on mobile and edge devices, it is desirable to improve a model's generalization to unseen test data as well as its robustness under fixed-point quantization for efficient deployment. Minimizing the training loss alone, however, provides few guarantees on generalization or quantization performance. In this work, we improve generalization and quantization performance simultaneously by theoretically unifying them under a single framework: improving the model's robustness against bounded weight perturbations and minimizing the eigenvalues of the Hessian matrix with respect to the model weights. We therefore propose HERO, a Hessian-enhanced robust optimization method, which minimizes the Hessian eigenvalues through a gradient-based training process and thereby improves generalization and quantization performance at the same time. HERO achieves up to a 3.8% gain in test accuracy, up to 30% higher accuracy under 80% training-label perturbation, and the best post-training quantization accuracy across a wide range of precisions, including a more than 10% accuracy improvement over SGD-trained models for common model architectures on various datasets.
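The unification described in the abstract rests on a standard second-order argument: for a weight perturbation δ with ‖δ‖ ≤ ρ (a quantization error or a worst-case weight shift), a Taylor expansion gives L(w + δ) ≈ L(w) + ∇L(w)ᵀδ + ½ δᵀHδ, so the worst-case loss increase is governed by the gradient norm and the largest Hessian eigenvalues, and flattening the Hessian benefits both generalization and quantization. The exact HERO update is not reproduced on this page; purely as a hedged illustration of the underlying min-max idea, the PyTorch sketch below implements a generic sharpness-aware step that ascends to an approximate worst-case point inside a bounded weight ball and then descends using the gradient taken there. The function name, the radius `rho`, and the cross-entropy task are illustrative assumptions, not the paper's API.

```python
import torch
import torch.nn.functional as F

def bounded_perturbation_step(model, optimizer, x, y, rho=0.05):
    """One sharpness-aware step: approximately solve min_w max_{||e|| <= rho} L(w + e)
    by taking a single ascent step e = rho * g / ||g|| and then a descent step
    using the gradient evaluated at the perturbed weights w + e.
    Illustrative sketch only; this is not the paper's exact HERO algorithm."""
    params = [p for p in model.parameters() if p.requires_grad]

    # First forward/backward pass: gradient g at the current weights w.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()

    # Move to the approximate worst-case point w + e inside the rho-ball.
    grad_norm = torch.norm(
        torch.stack([p.grad.norm() for p in params if p.grad is not None]))
    perturbations = []
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)

    # Second forward/backward pass: the gradient at w + e drives the update,
    # penalizing sharp minima (large curvature along the gradient direction).
    optimizer.zero_grad()
    loss_perturbed = F.cross_entropy(model(x), y)
    loss_perturbed.backward()

    # Restore the original weights, then apply the optimizer's descent step.
    with torch.no_grad():
        for p, e in zip(params, perturbations):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    return loss.item(), loss_perturbed.item()
```

In this sketch the ball radius `rho` plays the role of the bounded weight perturbation discussed in the abstract; HERO's specific treatment of the Hessian eigenvalues is described in the full paper.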

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
July 2022
1462 pages
ISBN:9781450391429
DOI:10.1145/3489517

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

DAC '22: 59th ACM/IEEE Design Automation Conference
July 10 - 14, 2022
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
