DOI: 10.1145/1289816.1289841
Article

Thread warping: a framework for dynamic synthesis of thread accelerators

Published: 30 September 2007

Abstract

We present a dynamic optimization technique, thread warping, that uses a single processor on a multiprocessor system to dynamically synthesize threads into custom accelerator circuits on FPGAs (field-programmable gate arrays). Building on dynamic synthesis for single-processor single-thread systems, known as warp processing, thread warping improves the performance of multiprocessor systems by speeding up individual threads and by allowing more threads to execute concurrently. Furthermore, thread warping maintains the important separation of function from architecture, enabling portability of applications to architectures with different quantities of microprocessors and FPGA, an advantage not shared by static compilation/synthesis approaches. We introduce a framework of architecture, CAD tools, and operating system that together support thread warping. We summarize experiments on an extensive architectural simulation framework we developed, showing application speedups of 4x to 502x, averaging 130x compared to a multiprocessor system having four ARM11 microprocessors, for eight benchmark applications. Even compared to a 64-processor system, thread warping achieves an 11x speedup.


      Published In

      CODES+ISSS '07: Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
      September 2007
      284 pages
      ISBN: 9781595938244
      DOI: 10.1145/1289816

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. FPGA
      2. dynamic synthesis
      3. just-in-time compilation
      4. multi-core
      5. synthesis
      6. thread warping
      7. threads
      8. warp processing

      Qualifiers

      • Article

      Conference

      ESWEEK07: Third Embedded Systems Week
      September 30 - October 3, 2007
      Salzburg, Austria

      Acceptance Rates

      Overall Acceptance Rate 280 of 864 submissions, 32%
