Improving Code Density with Variable Length Encoding Aware Instruction Scheduling

Published: 01 September 2016

Abstract

Variable length encoding can considerably decrease code size in VLIW processors by reducing the number of bits wasted on encoding No Operations (NOPs). A processor may have different instruction templates in which different execution slots are implicitly NOPs, but the instruction templates may not support every combination of NOPs. The compiler can improve the efficiency of the NOP encoding by placing NOPs in such a way that the usage of implicit NOPs is maximized. Two methods of optimizing the use of the implicit NOP slots are evaluated: (a) prioritizing function units that have fewer implicit NOPs associated with them, and (b) a post-pass to the instruction scheduler that uses the slack of the schedule, rescheduling operations with slack into different instruction words so that the available instruction templates are better utilized. Three methods for selecting the basic blocks to which function unit (FU) prioritization is applied are also analyzed: always, always outside inner loops, and outside inner loops only in those basic blocks where testing showed that it decreased code size. The post-pass optimizer alone saved an average of 2.4 % and a maximum of 10.5 % of instruction memory, without performance loss. Prioritizing function units in only those basic blocks where it helped gave best-case instruction memory savings of 10.7 % and average savings of 3.0 %, in exchange for an average 0.3 % slowdown. Applying both optimizations together gave a best-case code size decrease of 12.2 % and an average of 5.4 %, while performance decreased on average by 0.1 %.
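The post-pass idea in (b) can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: the 4-slot machine, the template set, the bit widths, and the names `TEMPLATES`, `word_bits`, `schedule_bits`, and `post_pass` are all assumptions made for illustration. Each template lists the slots it encodes explicitly, and every other slot is an implicit NOP. The `slack` argument is assumed to come from a dependence analysis that is not modeled here.

```python
# Hypothetical 4-slot VLIW; all sizes and templates are illustrative assumptions.
SLOT_BITS = 16      # assumed bits per explicitly encoded slot
TEMPLATE_BITS = 2   # assumed bits for the template selector

# Each template is the set of slots it encodes; all other slots are implicit NOPs.
TEMPLATES = [
    frozenset(),              # whole word is NOP
    frozenset({0, 1}),
    frozenset({2, 3}),
    frozenset({0, 1, 2, 3}),  # full-width word, always fits
]

def word_bits(used_slots):
    """Bits for one instruction word, using the smallest template that covers it."""
    fitting = [t for t in TEMPLATES if set(used_slots) <= t]
    return TEMPLATE_BITS + SLOT_BITS * min(len(t) for t in fitting)

def schedule_bits(schedule):
    """Total encoded size of a schedule given as a list of used-slot sets."""
    return sum(word_bits(word) for word in schedule)

def post_pass(schedule, slack):
    """Greedy sketch: move each operation with slack to the legal cycle that
    minimizes total size. `slack` maps (cycle, slot) to the other cycles the
    operation may legally occupy (assumed precomputed)."""
    words = [set(w) for w in schedule]
    for (cycle, slot), alternatives in slack.items():
        best_cycle, best_bits = cycle, schedule_bits(words)
        for alt in alternatives:
            if slot in words[alt]:
                continue  # target slot is already occupied in that cycle
            words[cycle].remove(slot)
            words[alt].add(slot)
            bits = schedule_bits(words)
            if bits < best_bits:
                best_cycle, best_bits = alt, bits
            words[alt].remove(slot)   # undo the trial move
            words[cycle].add(slot)
        if best_cycle != cycle:
            words[cycle].remove(slot)
            words[best_cycle].add(slot)
    return words

# The operation in cycle 0, slot 2 is assumed free to move to cycle 2.
schedule = [{0, 2}, {1}, {3}]
optimized = post_pass(schedule, {(0, 2): [2]})
print(schedule_bits(schedule), schedule_bits(optimized))
```

In this toy case the first word initially needs the full-width template because slots 0 and 2 are not covered by any narrower one. Moving the slot 2 operation next to the slot 3 operation lets all three words use half-width templates, and the schedule length stays the same.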


Published In

Journal of Signal Processing Systems, Volume 84, Issue 3
September 2016
148 pages
ISSN: 1939-8018
EISSN: 1939-8115

Publisher

Kluwer Academic Publishers
United States

Author Tags

1. Code density
2. Code optimization
3. Instruction scheduling
4. Instruction templates
5. Variable length instructions
6. TTA
7. VLIW
