A Hardware/Software Cooperative Custom Register Binding Approach for Register Spill Elimination in Application-Specific Instruction Set Processors

Published: 01 October 2012

Abstract

Application-Specific Instruction set Processors (ASIPs) have become an important design choice for embedded systems. An ASIP can achieve both the high flexibility offered by the base processor core and the high performance and energy efficiency offered by dedicated hardware extensions. Although much effort has been devoted to computation acceleration, for example, automatic custom instruction identification and synthesis, the limited on-chip data storage elements, including the register file and data cache, have become a potential performance bottleneck. For custom instructions that have more inputs and/or outputs than the generic register file has I/O ports, custom registers are added to ASIPs to supply the additional inputs and outputs; traditionally, these registers are used only by custom instructions. In this article, we propose a hardware/software cooperative approach with a linear scan register allocation algorithm that allows base instructions to use the existing custom registers in ASIPs to eliminate register spills in the program. The data traffic between the base processor and off-chip memory can thus be replaced with energy-efficient on-chip communication between the processor core and the custom hardware extensions. Our experimental results demonstrate that a significant performance gain can be achieved, orthogonal to improvements from other techniques in ASIP design.
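
The idea of letting base instructions fall back on custom registers before spilling to memory can be illustrated with a small sketch. The code below is not the article's algorithm: it is a textbook-style linear scan allocator with one assumed extension, a second pool of "custom" registers that is tried before memory. The names `Interval`, `linear_scan`, `num_base`, and `num_custom`, as well as the register labels `r0`, `c0`, and `mem`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    name: str
    start: int  # first instruction index at which the value is live
    end: int    # last instruction index at which the value is live


def linear_scan(intervals, num_base, num_custom):
    """Map each value to a base register, a custom register, or "mem".

    Assumption of this sketch: a value evicted from a base register is
    moved at the current program point; only its final home is recorded.
    """
    free_base = [f"r{i}" for i in range(num_base)]
    free_custom = [f"c{i}" for i in range(num_custom)]
    active = {}    # name -> (Interval, register) for values held in a register
    location = {}  # name -> final home of the value

    def overflow(iv):
        # Second-level placement: a free custom register if any, else memory.
        if free_custom:
            reg = free_custom.pop(0)
            active[iv.name] = (iv, reg)
            location[iv.name] = reg
        else:
            active.pop(iv.name, None)
            location[iv.name] = "mem"

    for iv in sorted(intervals, key=lambda i: i.start):
        # Expire intervals that ended before this one starts.
        for name, (old, reg) in list(active.items()):
            if old.end < iv.start:
                del active[name]
                (free_base if reg.startswith("r") else free_custom).append(reg)

        if free_base:
            reg = free_base.pop(0)
            active[iv.name] = (iv, reg)
            location[iv.name] = reg
            continue

        # No base register is free: evict the candidate that lives longest,
        # which is the usual linear scan spill heuristic.
        in_base = [(old, reg) for old, reg in active.values()
                   if reg.startswith("r")]
        victim, victim_reg = max(in_base, key=lambda p: p[0].end,
                                 default=(iv, None))
        if victim_reg is not None and victim.end > iv.end:
            active[iv.name] = (iv, victim_reg)
            location[iv.name] = victim_reg
            overflow(victim)
        else:
            overflow(iv)
    return location


# Four values competing for two base registers.
ivs = [Interval("a", 0, 9), Interval("b", 1, 3),
       Interval("c", 2, 8), Interval("d", 4, 6)]
with_custom = linear_scan(ivs, num_base=2, num_custom=1)
without_custom = linear_scan(ivs, num_base=2, num_custom=0)
```

In this example the long-lived value `a` ends up in the custom register `c0` when one is available and in `mem` when none is, while `b`, `c`, and `d` stay in base registers in both runs. A real allocator would also have to model the cost of moving data between the register file and the custom registers, which this sketch ignores.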

    Published In

    ACM Transactions on Design Automation of Electronic Systems  Volume 17, Issue 4
    October 2012
    347 pages
    ISSN:1084-4309
    EISSN:1557-7309
    DOI:10.1145/2348839

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 October 2012
    Accepted: 01 February 2012
    Revised: 01 August 2011
    Received: 01 July 2009
    Published in TODAES Volume 17, Issue 4

    Author Tags

    1. Application-specific instruction set processor
    2. custom registers
    3. memory traffic reduction
    4. register spills

    Qualifiers

    • Research-article
    • Research
    • Refereed
