DOI:10.1145/3241539.3241563
research-article

DeepCache: Principled Cache for Mobile Deep Vision

Published: 15 October 2018

Abstract

We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision. DeepCache improves model execution efficiency by exploiting temporal locality in input video streams. It addresses a key challenge raised by mobile vision: the cache must operate under video scene variation while trading off among cacheability, overhead, and loss in model accuracy. At the input of a model, DeepCache discovers video temporal locality by exploiting the video's internal structure, for which it borrows proven heuristics from video compression; into the model, DeepCache propagates regions of reusable results by exploiting the model's internal structure. Notably, DeepCache avoids applying video heuristics to model internals, which are not pixels but high-dimensional, difficult-to-interpret data. Our implementation of DeepCache works with unmodified deep learning models, requires no manual effort from developers, and is therefore immediately deployable on off-the-shelf mobile devices. Our experiments show that DeepCache reduces inference execution time by 18% on average and by up to 47%, and reduces system energy consumption by 20% on average.
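
The two steps described above, finding reusable regions at the model input and carrying them into the model, can be illustrated with a minimal Python sketch. It is not DeepCache's implementation: the exhaustive search, the mean-squared-error threshold, the block size, and the function names `match_blocks` and `propagate_through_conv` are all assumptions chosen for illustration, and the propagation step ignores the motion offset of a matched block.

```python
from typing import Dict, List, Optional, Tuple

Frame = List[List[int]]          # grayscale image as rows of pixel values
Region = Tuple[int, int, int, int]  # (top, left, height, width)


def block_mse(cur: Frame, prev: Frame, cy: int, cx: int,
              py: int, px: int, size: int) -> float:
    """Mean squared error between a block of cur and a block of prev."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            diff = cur[cy + dy][cx + dx] - prev[py + dy][px + dx]
            total += diff * diff
    return total / (size * size)


def match_blocks(cur: Frame, prev: Frame, size: int = 4, radius: int = 2,
                 max_mse: float = 1.0
                 ) -> Dict[Tuple[int, int], Optional[Tuple[int, int]]]:
    """Map each block origin in cur to the (dy, dx) offset of its best
    match in prev, or None when no candidate is within max_mse.

    Assumption: exhaustive search within +/- radius pixels; real video
    codecs use faster heuristics such as diamond search.
    """
    height, width = len(cur), len(cur[0])
    matches: Dict[Tuple[int, int], Optional[Tuple[int, int]]] = {}
    for cy in range(0, height - size + 1, size):
        for cx in range(0, width - size + 1, size):
            best_err: Optional[float] = None
            best_offset: Optional[Tuple[int, int]] = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    py, px = cy + dy, cx + dx
                    # Skip candidates that fall outside the previous frame.
                    if py < 0 or px < 0 or py + size > height or px + size > width:
                        continue
                    err = block_mse(cur, prev, cy, cx, py, px, size)
                    if best_err is None or err < best_err:
                        best_err, best_offset = err, (dy, dx)
            reusable = best_err is not None and best_err <= max_mse
            matches[(cy, cx)] = best_offset if reusable else None
    return matches


def propagate_through_conv(region: Region, kernel: int, stride: int = 1,
                           padding: int = 0) -> Optional[Region]:
    """Shrink a reusable input region to the conv outputs whose receptive
    fields lie entirely inside it (hypothetical helper, square kernel)."""
    top, left, height, width = region

    def span(start: int, length: int) -> Optional[Tuple[int, int]]:
        # Output o reads inputs [o*stride - padding, o*stride - padding + kernel - 1].
        first = -(-(start + padding) // stride)            # ceiling division
        last = (start + length - kernel + padding) // stride
        return (first, last - first + 1) if last >= first else None

    rows, cols = span(top, height), span(left, width)
    if rows is None or cols is None:
        return None
    return (rows[0], cols[0], rows[1], cols[1])


# Usage: the current frame is the previous one shifted right by one pixel.
prev_frame = [[y * 8 + x for x in range(8)] for y in range(8)]
cur_frame = [[255] + row[:-1] for row in prev_frame]
reuse = match_blocks(cur_frame, prev_frame)
# Blocks in the right half match at offset (0, -1); the left half,
# which contains the newly exposed column, has no match.
```

Each convolution shrinks the reusable region by roughly the kernel's reach, which is one way to see the trade-off between cacheability and overhead that the abstract mentions: small matched blocks may leave nothing to reuse after a few layers.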


Published In

MobiCom '18: Proceedings of the 24th Annual International Conference on Mobile Computing and Networking
October 2018
884 pages
ISBN:9781450359030
DOI:10.1145/3241539
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cache
  2. deep learning
  3. mobile vision

Acceptance Rates

MobiCom '18 Paper Acceptance Rate 42 of 187 submissions, 22%;
Overall Acceptance Rate 440 of 2,972 submissions, 15%
