
Workload assignment considering NBTI degradation in multicore systems

Published: 13 January 2014

Abstract

With continuously shrinking technology, reliability issues such as Negative Bias Temperature Instability (NBTI) have caused considerable degradation of device performance and, ultimately, a short mean-time-to-failure (MTTF) for the whole multicore system. This article proposes a new workload balancing scheme, based on a device-level fractional NBTI model, that balances the workload among active cores while relaxing stressed ones. Starting from NBTI-induced threshold voltage degradation, we define the Capacity Rate (CR) as an indication of a core's ability to accept workload. The capacity rate captures a core's performance variability in terms of delay and power metrics under the impact of NBTI aging. The proposed workload balancing framework employs the capacity rates as workload constraints, applies a Dynamic Zoning (DZ) algorithm to group cores into zones to process task flows, and then uses Dynamic Task Scheduling (DTS) to allocate tasks in each zone with balanced workload and minimum communication cost. Experimental results on a 64-core system show that, by allowing a small fraction of the cores to relax over a short time period, the proposed methodology improves multicore system yield (measured by the percentage of core failures) by 20% and extends MTTF by 30%, with insignificant degradation in performance (less than 3%).
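
The idea of using capacity rates as workload constraints can be illustrated with a toy sketch. This is not the article's DZ or DTS algorithm, and it does not use the article's definition of CR: the `Core` class, the `assign_tasks` function, and the treatment of a capacity rate as a simple fraction between 0 and 1 are all assumptions made for illustration. The sketch only shows a greedy placement in which a core with a capacity rate of 0 (a relaxed core) receives no work.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    # Hypothetical stand-in for the capacity rate: the fraction of nominal
    # workload this core may accept (0.0 = fully relaxed, 1.0 = fresh).
    capacity_rate: float
    load: float = 0.0
    tasks: list = field(default_factory=list)

    def headroom(self) -> float:
        """Workload the core can still accept under its capacity constraint."""
        return self.capacity_rate - self.load


def assign_tasks(cores, tasks):
    """Greedy placement: largest task first, onto the core with most headroom.

    `tasks` is a list of (name, cost) pairs. Returns the names of tasks
    that fit on no core without exceeding its capacity rate.
    """
    unplaced = []
    for name, cost in sorted(tasks, key=lambda t: t[1], reverse=True):
        best = max(cores, key=Core.headroom)
        if best.headroom() >= cost:
            best.load += cost
            best.tasks.append(name)
        else:
            unplaced.append(name)
    return unplaced


if __name__ == "__main__":
    # Core 2 is "relaxed" (capacity rate 0), so it should stay idle.
    cores = [Core(0, 1.0), Core(1, 0.5), Core(2, 0.0)]
    leftover = assign_tasks(cores, [("a", 0.5), ("b", 0.25), ("c", 0.25)])
    for core in cores:
        print(core.core_id, core.tasks, core.load)
    print("unplaced:", leftover)
```

A greedy rule is used here only because it is short. The framework described in the abstract additionally groups cores into zones and minimizes communication cost, which this sketch does not attempt.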


Published In

ACM Journal on Emerging Technologies in Computing Systems, Volume 10, Issue 1
Special Issue on Reliability and Device Degradation in Emerging Technologies and Special Issue on WoSAR 2011
January 2014
210 pages
ISSN:1550-4832
EISSN:1550-4840
DOI:10.1145/2543749
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 January 2014
Accepted: 01 November 2012
Revised: 01 July 2012
Received: 01 March 2012
Published in JETC Volume 10, Issue 1

Author Tags

  1. Multicore systems
  2. dynamic task scheduling
  3. dynamic zoning
  4. negative bias temperature instability capacity rate
  5. workload balancing

Qualifiers

  • Research-article
  • Research
  • Refereed
