DOI:10.1145/3229631.3229651
research-article

Simplifying HW/SW integration to deploy multiple accelerators for CPU-FPGA heterogeneous platforms

Published: 15 July 2018

Abstract

FPGAs have become an attractive option for developing hardware accelerators thanks to their energy efficiency and recent improvements in CPU-FPGA communication speeds. To shorten the development cycle, high-level synthesis tools for FPGAs have been developed, such as Intel HLS, OpenCL, and OpenSPL. These tools aim to free the designer from having to know all the low-level FPGA details. However, to achieve high-performance processing, the developer should still understand the details of the deeper system layers. Moreover, an OpenCL design usually consumes more resources and compile time than a design developed directly in RTL. In this work, we propose a novel framework that automatically integrates hardware accelerator cores into a final architecture by generating a HW/SW interface for a modern Intel CPU-FPGA platform. Our goal is to simplify the Intel Open Programmable Acceleration Engine (OPAE) by introducing a novel abstraction layer with a simple stream protocol channel. The experimental results for a set of dataflow benchmarks show a performance of up to 131.7 Gops/s and a power efficiency of up to 353.7 Gops/W, even when the memory bandwidth is bounded to 12 GB/s.

Published In

SAMOS '18: Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation
July 2018
263 pages
ISBN:9781450364942
DOI:10.1145/3229631
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SAMOS XVIII
SAMOS XVIII: Architectures, Modeling, and Simulation
July 15 - 19, 2018
Pythagorion, Greece
