DOI: 10.1145/3274783.3274840

FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

Published: 04 November 2018

Abstract

Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution, posing a serious impediment to deployment on low-end devices. To address this challenge, recent literature has focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the nonlinear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the deep learning library used. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by 48% to 78% and energy consumption by 37% to 69% compared with the state-of-the-art compression algorithms.
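
The abstract does not spell out the form of the execution time model, so the following is only an illustrative sketch of the underlying idea: when execution time is nonlinear in a layer's size, a single global line fits poorly, while a small piecewise-linear model stays both accurate and easy to read. The profiling data here is synthetic, and the jump at 32 channels, the function names, and the single-split search are all assumptions made for the example, not the method used by FastDeepIoT.

```python
# Illustrative sketch only: synthetic data, hypothetical names.
# Compares one global linear fit against a one-split piecewise-linear fit.


def synthetic_time_ms(channels):
    """Invented execution time with a jump and steeper slope past 32 channels."""
    if channels <= 32:
        return 1.0 + 0.05 * channels
    return 3.0 + 0.30 * (channels - 32)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_one_split(xs, ys, min_leaf=3):
    """Try every split position and fit a separate line on each side."""
    best = None
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        left = fit_line(xs[:i], ys[:i])
        right = fit_line(xs[i:], ys[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, xs[i - 1], left, right)
    return best  # (sse, largest x in the left segment, left fit, right fit)


channels = list(range(4, 65, 4))  # 4, 8, ..., 64
times = [synthetic_time_ms(c) for c in channels]

_, _, global_sse = fit_line(channels, times)
split_sse, split_at, left_fit, right_fit = fit_one_split(channels, times)

print("global linear SSE:", global_sse)
print("piecewise SSE:", split_sse, "with split after", split_at, "channels")
print("slopes (ms per channel):", left_fit[1], right_fit[1])
```

On this synthetic data the split search recovers the invented breakpoint, and the two slopes can be read directly as "cost per extra channel" in each regime, which is the kind of interpretability a compression algorithm could exploit when choosing layer sizes.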

Published In

SenSys '18: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems
November 2018
449 pages
ISBN: 9781450359528
DOI: 10.1145/3274783

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Deep Learning
2. Execution Time
3. Internet of Things
4. Mobile Computing
5. Model Compression

Qualifiers

• Research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 174 of 867 submissions, 20%
