DOI: 10.1007/978-3-031-42921-7_4
Article

On the OpenCL Support for Streaming Fixed-Function Accelerators on Embedded SoC FPGAs

Published: 27 September 2023

Abstract

Contemporary FPGA High-Level Synthesis (HLS) design tools use OpenCL for developing the host-side code that controls data transfer between the processing system and the FPGA design. High-performance designs on embedded SoC FPGAs often use data movers with streaming capabilities to transfer data directly between the host’s main memory and the local memory of the FPGA accelerator. However, the OpenCL memory model does not currently support streaming data movement between the host system and the FPGA accelerator. Earlier work has shown up to an 8x improvement in data-transfer latency when streaming data movement is used. To highlight this issue, this work extends the Portable Computing Language (PoCL) OpenCL framework to support direct streaming data movement between the host’s main memory and the accelerator’s local memory. Furthermore, this work uses the CNN-Grinder workflow to map a traffic sign recognition Convolutional Neural Network (CNN) onto the SqueezeJet-3 FPGA accelerator, showcasing how the SqueezeJet-3 streaming accelerator is controlled from a PoCL application. The results show that OpenCL, augmented with direct streaming data transfer between the host and the kernel, can efficiently control a streaming accelerator on an embedded SoC FPGA and achieve high-performance accelerator execution.

Published In

Applied Reconfigurable Computing. Architectures, Tools, and Applications: 19th International Symposium, ARC 2023, Cottbus, Germany, September 27–29, 2023, Proceedings
Sep 2023
379 pages
ISBN:978-3-031-42920-0
DOI:10.1007/978-3-031-42921-7
Editors: Francesca Palumbo, Georgios Keramidas, Nikolaos Voros, Pedro C. Diniz

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. OpenCL
  2. FPGA
  3. CNN Accelerator
  4. High-Level Synthesis

Qualifiers

  • Article
