research-article

Race-to-sleep + content caching + display caching: a recipe for energy-efficient video streaming on handhelds

Authors: Prasanna Venkatesh Rengasamy, Nachiappan Chidambaram Nachiappan, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das

MICRO-50 '17: Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture

Pages 517 - 531

https://doi.org/10.1145/3123939.3123948

Published: 14 October 2017

Abstract

Video streaming has become the most common application on handhelds, and this trend is expected to continue, with video accounting for about 75% of all mobile data traffic by 2021. Optimizing the performance and energy consumption of video processing on mobile devices is therefore critical to sustaining the growth of the handheld market. In this paper, we propose three complementary techniques, race-to-sleep, content caching, and display caching, to minimize the energy consumption of video processing flows. Unlike the state-of-the-art video decoder, which processes video frame by frame, the first scheme, race-to-sleep, uses two approaches, batching of frames and frequency boosting, to prolong the sleep state and save energy while avoiding frame drops. The second scheme, content caching, exploits content similarity among smaller video blocks, called macroblocks, to design a novel cache organization that reduces memory pressure. The third scheme takes advantage of content similarity at the display controller to enable display caching, further improving energy efficiency. We integrate these three schemes into an end-to-end video processing framework and evaluate our design on a comprehensive mobile system design platform with a variety of video processing workloads. Our evaluations show that the three techniques complement each other, improving performance by avoiding frame drops and reducing the energy consumption of video streaming applications by 21% on average compared to the current baseline design.
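To make the content-similarity idea behind content caching more concrete, the following is a minimal software sketch, not the cache organization proposed in the paper. It assumes 16x16 macroblocks of 8-bit grayscale pixels, frame dimensions that are multiples of the block size, and a SHA-256 digest as the content key. The names `split_macroblocks`, `ContentStore`, and `store_frame` are hypothetical and chosen only for illustration.

```python
import hashlib

MB = 16  # assumed macroblock edge in pixels (illustrative choice, not from the paper)


def split_macroblocks(frame, width, height, mb=MB):
    """Yield the pixel bytes of each mb x mb tile of a row-major 8-bit frame.

    Assumes width and height are multiples of mb.
    """
    for y in range(0, height, mb):
        for x in range(0, width, mb):
            rows = [
                frame[(y + r) * width + x:(y + r) * width + x + mb]
                for r in range(mb)
            ]
            yield b"".join(rows)


class ContentStore:
    """Keeps one copy of each distinct macroblock, keyed by a digest of its pixels."""

    def __init__(self):
        self.blocks = {}  # digest -> pixel bytes
        self.writes = 0   # blocks that had to be stored
        self.hits = 0     # blocks whose content was already present

    def put(self, block):
        # Digest collisions are ignored in this sketch.
        key = hashlib.sha256(block).digest()
        if key in self.blocks:
            self.hits += 1
        else:
            self.blocks[key] = block
            self.writes += 1
        return key


def store_frame(store, frame, width, height):
    """Represent a frame as a list of digests; identical blocks share one stored copy."""
    return [store.put(block) for block in split_macroblocks(frame, width, height)]


if __name__ == "__main__":
    width, height = 64, 32
    frame = bytearray(width * height)  # uniform frame, all pixels zero
    frame[0] = 255                     # make the first macroblock differ
    store = ContentStore()
    refs = store_frame(store, bytes(frame), width, height)
    print(len(refs), store.writes, store.hits)
```

In this toy frame, seven of the eight macroblocks have identical content, so only two distinct blocks are stored and the rest are counted as hits. A hardware design would face constraints that this sketch leaves out, such as digest cost, collision handling, and replacement policy.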


Index Terms

1. Computer systems organization
  1. Architectures
  2. Embedded and cyber-physical systems
    1. Embedded systems
      1. Embedded hardware
2. Human-centered computing
  1. Ubiquitous and mobile computing
    1. Ubiquitous and mobile computing theory, concepts and paradigms
      1. Mobile computing

Published In

MICRO-50 '17: Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture

October 2017

850 pages

ISBN:9781450349529

DOI:10.1145/3123939

General Chairs: Hillery Hunter (IBM Research), Jaime Moreno (IBM Research)
Program Chairs: Joel Emer (NVIDIA and MIT), Daniel Sanchez (MIT)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
IEEE-CS\DATC: IEEE Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

MICRO-50

Sponsor:

SIGMICRO
IEEE-CS\DATC

MICRO-50: The 50th Annual IEEE/ACM International Symposium on Microarchitecture

October 14 - 18, 2017

Cambridge, Massachusetts

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

Liu D, Qian C, Rong H, Zhou S, Xiang C, Jiang H. (2024). Energy and QoE Optimization for Mobile Video Streaming with Adaptive Brightness Scaling. ACM Transactions on Sensor Networks 20:4 (1-24). Online publication date: 8-Jul-2024.
https://dl.acm.org/doi/10.1145/3670999
Lee S, Jeong D, Choi J, Kwak J, Son S, Song J, Shin I. (2024). SERENUS: Alleviating Low-Battery Anxiety Through Real-time, Accurate, and User-Friendly Energy Consumption Prediction of Mobile Applications. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (1-20). Online publication date: 13-Oct-2024.
https://dl.acm.org/doi/10.1145/3654777.3676437
Nguyen T, Nguyen N, Kim C, Dao N. (2023). Intelligent aerial video streaming: Achievements and challenges. Journal of Network and Computer Applications 211 (103564). Online publication date: Feb-2023.
https://doi.org/10.1016/j.jnca.2022.103564
Qian C, Liu D, Jiang H. (2022). Harmonizing Energy Efficiency and QoE for Brightness Scaling-based Mobile Video Streaming. 2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS) (1-10). Online publication date: 10-Jun-2022.
https://doi.org/10.1109/IWQoS54832.2022.9812899
Zhao S, Zhang H, Mishra C, Bhuyan S, Ying Z, Kandemir M, Sivasubramaniam A, Das C. (2021). HoloAR: On-the-fly Optimization of 3D Holographic Processing for Augmented Reality. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (494-506). Online publication date: 18-Oct-2021.
https://dl.acm.org/doi/10.1145/3466752.3480056
Xu Y, Das H, Gong Y, Gong N. (2020). On Mathematical Models of Optimal Video Memory Design. IEEE Transactions on Circuits and Systems for Video Technology 30:1 (256-266). Online publication date: Jan-2020.
https://doi.org/10.1109/TCSVT.2018.2890383
Haj-Yahya J, Alser M, Kim J, Orosa L, Rotem E, Mendelson A, Chattopadhyay A, Mutlu O. (2020). FlexWatts: A Power- and Workload-Aware Hybrid Power Delivery Network for Energy-Efficient Microprocessors. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (1051-1066). Online publication date: Oct-2020.
https://doi.org/10.1109/MICRO50266.2020.00088
Feng Y, Tian B, Xu T, Whatmough P, Zhu Y. (2020). Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (1037-1050). Online publication date: Oct-2020.
https://doi.org/10.1109/MICRO50266.2020.00087
Rengasamy P, Zhang H, Zhao S, Sivasubramaniam A, Kandemir M, Das C. (2020). Selective Event Processing for Energy Efficient Mobile Gaming with SNIP. 2020 IEEE International Symposium on Workload Characterization (IISWC) (288-299). Online publication date: Oct-2020.
https://doi.org/10.1109/IISWC50251.2020.00035
Haj-Yahya J, Sazeides Y, Alser M, Rotem E, Mutlu O. (2020). Techniques for Reducing the Connected-Standby Energy Consumption of Mobile Devices. 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA) (623-636). Online publication date: Feb-2020.
https://doi.org/10.1109/HPCA47549.2020.00057
