FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

Authors:

ShengZhong Liu,

Tarek Abdelzaher

SenSys '18: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems

Pages 278 - 291

https://doi.org/10.1145/3274783.3274840

Published: 04 November 2018

Abstract

Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand increases execution time, posing a serious impediment to deployment on low-end devices. To address this challenge, recent literature has focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the deep learning library used. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by 48% to 78% and energy consumption by 37% to 69% compared with the state-of-the-art compression algorithms.

Published In

SenSys '18: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems

November 2018

449 pages

ISBN: 9781450359528

DOI: 10.1145/3274783

Editors: Gowri Sankar Ramachandran (University of Southern California, Los Angeles), Bhaskar Krishnamachari (University of Southern California, Los Angeles)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SenSys '18

SenSys '18: The 16th ACM Conference on Embedded Networked Sensor Systems

November 4 - 7, 2018

Shenzhen, China

Acceptance Rates

Overall Acceptance Rate 174 of 867 submissions, 20%

Article Metrics

Total Citations: 109
Total Downloads: 2,077
Downloads (last 12 months): 306
Downloads (last 6 weeks): 68

Reflects downloads up to 03 Nov 2024

Cited By

Bañuelos J, Sigg S, He J, Salim F, Costa-Requena J (2024). Generating Multivariate Synthetic Time Series Data for Absent Sensors from Correlated Sources. Proceedings of the 2nd International Workshop on Networked AI Systems, pages 19-24. DOI: 10.1145/3662004.3663553. Online publication date: 3-Jun-2024
https://dl.acm.org/doi/10.1145/3662004.3663553
Rastikerdar M, Huang J, Fang S, Guan H, Ganesan D, Okoshi T, Ko J, LiKamWa R (2024). CACTUS: Dynamically Switchable Context-aware micro-Classifiers for Efficient IoT Inference. Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, pages 505-518. DOI: 10.1145/3643832.3661888. Online publication date: 3-Jun-2024
https://dl.acm.org/doi/10.1145/3643832.3661888
Mishra R, Gupta H, Banga G, Das S (2024). Fed-RAC: Resource-Aware Clustering for Tackling Heterogeneity of Participants in Federated Learning. IEEE Transactions on Parallel and Distributed Systems, 35:7, pages 1207-1220. DOI: 10.1109/TPDS.2024.3379933. Online publication date: Jul-2024
https://doi.org/10.1109/TPDS.2024.3379933
Mishra R, Gupta H (2024). Designing and Training of Lightweight Neural Networks on Edge Devices using Early Halting in Knowledge Distillation. IEEE Transactions on Mobile Computing, pages 1-12. DOI: 10.1109/TMC.2023.3297026. Online publication date: 2024
https://doi.org/10.1109/TMC.2023.3297026
Ahn H, Lee M, Seong S, Na G, Chun I, Varghese B, Hong C (2024). ScissionLite: Accelerating Distributed Deep Learning With Lightweight Data Compression for IIoT. IEEE Transactions on Industrial Informatics, 20:10, pages 11950-11960. DOI: 10.1109/TII.2024.3413340. Online publication date: Oct-2024
https://doi.org/10.1109/TII.2024.3413340
Biswas S, Barma S (2024). MicrosMobiNet: A Deep Lightweight Network With Hierarchical Feature Fusion Scheme for Microscopy Image Analysis in Mobile-Edge Computing. IEEE Internet of Things Journal, 11:5, pages 8288-8298. DOI: 10.1109/JIOT.2023.3317878. Online publication date: 1-Mar-2024
https://doi.org/10.1109/JIOT.2023.3317878
Rupprecht B, Vogel-Heuser B, Möhrle J, Hujo D, Wang Y (2024). Sparse Measurement Algorithm Execution Time Prediction on Heterogeneous Edge Devices for Early Stage Software-Hardware Matching. 2024 IEEE 7th International Conference on Industrial Cyber-Physical Systems (ICPS), pages 1-8. DOI: 10.1109/ICPS59941.2024.10640034. Online publication date: 12-May-2024
https://doi.org/10.1109/ICPS59941.2024.10640034
Kimura T, Li J, Wang T, Kara D, Chen Y, Hu Y, Wang R, Wigness M, Liu S, Srivastava M, Diggavi S, Abdelzaher T (2024). On the Efficiency and Robustness of Vibration-Based Foundation Models for IoT Sensing: A Case Study. 2024 IEEE International Workshop on Foundation Models for Cyber-Physical Systems & Internet of Things (FMSys), pages 7-12. DOI: 10.1109/FMSys62467.2024.00006. Online publication date: 13-May-2024
https://doi.org/10.1109/FMSys62467.2024.00006
Du L, Lan G (2023). FreeGaze: Resource-efficient Gaze Estimation via Frequency-domain Contrastive Learning. Proceedings of the 2023 International Conference on embedded Wireless Systems and Networks, pages 60-71. DOI: 10.5555/3639940.3639949. Online publication date: 15-Dec-2023
https://dl.acm.org/doi/10.5555/3639940.3639949
Zhang S, Chen Y, Zhang S, Chen Z (2023). DeepSlicing. Principles and Applications of Adaptive Artificial Intelligence, pages 123-150. DOI: 10.4018/979-8-3693-0230-9.ch006. Online publication date: 29-Dec-2023
https://doi.org/10.4018/979-8-3693-0230-9.ch006
