research-article
Open access

Significance-Aware Program Execution on Unreliable Hardware

Published: 28 April 2017

Abstract

This article introduces a significance-centric programming model and runtime support that set the supply voltage in a multicore CPU to sub-nominal values to reduce the energy footprint, and that provide mechanisms to control output quality. Developers specify the significance of application tasks according to their contribution to the output quality, and provide check and repair functions for handling faults. On a multicore system, we evaluate five benchmarks using an energy model that quantifies the energy reduction. When the least-significant tasks are executed unreliably, our approach reduces CPU energy by 20% relative to a reliable execution, with minimal quality degradation.
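The idea of tagging tasks with a significance and pairing them with check and repair functions can be illustrated with a minimal Python sketch. This is not the article's API or runtime: the `Task` fields, the 0-to-1 significance scale, the `threshold` parameter, and the fault model (multiplying a result by a large constant) are all assumptions made for illustration, and the "unreliable" execution is simulated in software rather than by lowering a supply voltage.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    significance: float             # hypothetical scale: 0.0 (least) to 1.0 (most)
    body: Callable[[], float]       # the computation itself
    check: Callable[[float], bool]  # returns True if the result looks plausible
    repair: Callable[[], float]     # fallback value used when the check fails


def run_unreliably(body: Callable[[], float], fault_rate: float,
                   rng: random.Random) -> float:
    """Simulate unreliable execution by corrupting the result with probability fault_rate."""
    result = body()
    if rng.random() < fault_rate:
        return result * 1e6  # assumed fault model: a wildly wrong value
    return result


def execute(tasks: List[Task], threshold: float, fault_rate: float,
            rng: random.Random) -> Dict[str, float]:
    """Run significant tasks reliably; run the rest unreliably, then check and repair."""
    outputs: Dict[str, float] = {}
    for task in tasks:
        if task.significance >= threshold:
            outputs[task.name] = task.body()  # reliable path
        else:
            result = run_unreliably(task.body, fault_rate, rng)
            if not task.check(result):        # fault detected by the developer's check
                result = task.repair()        # substitute the developer's fallback
            outputs[task.name] = result
    return outputs


def make_tasks() -> List[Task]:
    """Two toy tasks producing pixel-like values that must stay within 0..255."""
    def in_range(value: float) -> bool:
        return 0.0 <= value <= 255.0

    return [
        Task("edge", 0.9, lambda: 10.0, in_range, lambda: 0.0),
        Task("background", 0.1, lambda: 20.0, in_range, lambda: 0.0),
    ]
```

Calling `execute(make_tasks(), threshold=0.5, fault_rate=1.0, rng=random.Random(0))` runs the `edge` task reliably and forces a fault in the `background` task, whose out-of-range result is then replaced by its repair value. In this sketch the threshold plays the role of the knob that trades output quality against the share of work run unreliably.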

Supplementary Material

TACO1402-12 (taco1402-12.pdf)
Slide deck associated with this paper

Published In

ACM Transactions on Architecture and Code Optimization  Volume 14, Issue 2
June 2017
259 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/3086564
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 April 2017
Accepted: 01 February 2017
Revised: 01 February 2017
Received: 01 August 2016
Published in TACO Volume 14, Issue 2

Author Tags

  1. Significance aware computing
  2. energy efficiency
  3. quality of output
  4. unreliable hardware

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 67
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 25 Dec 2024

Cited By

  • (2024) HPAC-ML: A Programming Model for Embedding ML Surrogates in Scientific Applications. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (1-16). DOI: 10.1109/SC41406.2024.00078. Online publication date: 17-Nov-2024
  • (2023) ARETE: Accurate Error Assessment via Machine Learning-Guided Dynamic-Timing Analysis. IEEE Transactions on Computers 72:4 (1026-1040). DOI: 10.1109/TC.2022.3191966. Online publication date: 1-Apr-2023
  • (2022) Instruction-aware Learning-based Timing Error Models through Significance-driven Approximations. 2022 IEEE 40th International Conference on Computer Design (ICCD) (455-462). DOI: 10.1109/ICCD56317.2022.00074. Online publication date: Oct-2022
  • (2021) HPAC. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-14). DOI: 10.1145/3458817.3476216. Online publication date: 14-Nov-2021
  • (2021) Exploring the potential of context-aware dynamic CPU undervolting. Proceedings of the 18th ACM International Conference on Computing Frontiers (73-82). DOI: 10.1145/3457388.3458658. Online publication date: 11-May-2021
  • (2021) Boosting Microprocessor Efficiency: Circuit- and Workload-Aware Assessment of Timing Errors. 2021 IEEE International Symposium on Workload Characterization (IISWC) (125-137). DOI: 10.1109/IISWC53511.2021.00022. Online publication date: Nov-2021
  • (2018) A Framework for Evaluating Software on Reduced Margins Hardware. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (330-337). DOI: 10.1109/DSN.2018.00043. Online publication date: Jun-2018
