DOI: 10.1145/3564121.3564137
Research article

Unsupervised Early Exit in DNNs with Multiple Exits

Published: 16 May 2023

Abstract

Deep Neural Networks (DNNs) are generally designed as a sequential cascade of differentiable blocks/layers, with a prediction module connected only to the last layer. Prediction modules can, however, be attached at multiple points along the backbone, so that inference can stop at an intermediate exit without passing through all subsequent modules. The last exit point may offer a lower prediction error, but it also incurs more computation and latency. An exit point that is ‘optimal’ in terms of both prediction error and cost is therefore desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During inference, the ground truth of instances is typically unavailable, so the error rate at each exit point cannot be estimated directly; one thus faces the problem of selecting the optimal exit in an unsupervised setting. Prior works tackled this problem in an offline supervised setting, assuming that enough labeled data is available to estimate the error rate at each exit point and tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which large amounts of ground truth may not be available. We therefore model exit selection as an unsupervised online learning problem and leverage multi-armed bandit theory to identify the optimal exit point. Specifically, we focus on ElasticBERT, a pre-trained multi-exit DNN, and demonstrate that it ‘nearly’ satisfies the Strong Dominance (SD) property, making it possible to learn the optimal exit online without knowing the ground truth labels. We develop an upper confidence bound (UCB) based algorithm, named UEE-UCB, that provably achieves sub-linear regret under the SD property. Our method thus provides a means to adaptively learn domain-specific optimal exit points in multi-exit DNNs. We empirically validate the algorithm on the IMDb and Yelp datasets.
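The exit-selection idea in the abstract can be illustrated with a generic UCB1 bandit over candidate exit points. This is a minimal sketch, not the paper's UEE-UCB: the per-exit accuracies and costs below are hypothetical, and the reward here is sampled directly from a simulator, whereas UEE-UCB exploits the Strong Dominance property to form reward estimates without ground-truth labels.

```python
import math
import random

def ucb1(reward_fns, horizon, seed=0):
    """Generic UCB1 over k arms; each arm models choosing one exit point."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k      # number of times each exit was chosen
    means = [0.0] * k     # empirical mean reward per exit
    # Play each arm (exit point) once to initialise the estimates.
    for arm in range(k):
        counts[arm] = 1
        means[arm] = reward_fns[arm](rng)
    for t in range(k, horizon):
        # Choose the exit maximising the upper confidence bound.
        arm = max(range(k),
                  key=lambda a: means[a] + math.sqrt(2 * math.log(t + 1) / counts[a]))
        r = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean update
    return counts, means

# Three hypothetical exits: deeper exits are more accurate but costlier.
# Reward = simulated correctness (Bernoulli) minus a fixed compute cost,
# so the net values are roughly 0.55, 0.75, 0.48 -> exit 1 is optimal.
acc = [0.60, 0.85, 0.88]
cost = [0.05, 0.10, 0.40]
arms = [(lambda rng, a=a, c=c: (1.0 if rng.random() < a else 0.0) - c)
        for a, c in zip(acc, cost)]

counts, means = ucb1(arms, horizon=5000)
```

With enough rounds, the bandit concentrates its pulls on the exit with the best accuracy–cost trade-off, which mirrors how an online learner can settle on a domain-specific exit without labels, provided the reward signal is estimable.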



Published In

AIMLSystems '22: Proceedings of the Second International Conference on AI-ML Systems
October 2022
209 pages
ISBN:9781450398473
DOI:10.1145/3564121

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bandit algorithms
  2. early exit in DNNs
  3. multi-exit DNNs
  4. unsupervised online learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • SERB under the MATRICS grant

Conference

AIMLSystems 2022
