DOI: 10.1145/3195970.3196088
research-article
Public Access

Extracting data parallelism in non-stencil kernel computing by optimally coloring folded memory conflict graph

Published: 24 June 2018

Abstract

Irregular memory access patterns in non-stencil kernel computing render the well-known hyperplane- [1], lattice- [2], or tessellation-based [3] HLS techniques ineffective. We develop an elegant yet effective technique that synthesizes a memory-optimal architecture from high-level software code in order to maximize application-specific data parallelism. Our basic idea is to exploit the graph structures embedded in the data access pattern and computation structure in order to perform memory banking that maximizes parallel memory accesses while conserving both hardware resources and energy. Specifically, we priority-color a weighted conflict graph, generated by folding the fundamental conflict graph, to maximize memory conflict reduction. Most interestingly, our graph-based methodology enables a straightforward tradeoff between the number of memory banks and the number of remaining memory conflicts.
We empirically test our methodology with Vivado HLx 2015.4 on a standard Kintex-7 device for six benchmark computing kernels by measuring conflict reduction. In particular, our approach requires only 9.56% LUT, 3.2% FF, 2.5% BRAM, and 11.33% DSP of the total available hardware resources to obtain a mapping function that achieves a 90% conflict reduction on a modified forward Gaussian elimination kernel with 4 simultaneous memory accesses.
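
The banking idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the folding rule or the priority function, so the sketch assumes that repeated conflicts between two addresses are simply accumulated into an edge weight (a stand-in for folding), and that vertices are colored greedily in order of decreasing total conflict weight under a fixed bank budget. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_weighted_conflict_graph(access_groups):
    """Assumed stand-in for folding: the weight of edge (a, b) is the number
    of groups in which addresses a and b are accessed simultaneously."""
    weights = defaultdict(int)
    for group in access_groups:
        for a, b in combinations(sorted(set(group)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def priority_color(weights, num_banks):
    """Greedy heuristic (an assumption, not the paper's method): visit
    addresses by decreasing total conflict weight and place each in the bank
    that adds the least conflict weight with already-placed neighbors.
    Addresses that never conflict do not appear and may go in any bank."""
    adjacency = defaultdict(dict)
    for (a, b), w in weights.items():
        adjacency[a][b] = w
        adjacency[b][a] = w
    # Highest total conflict weight first; ties broken by address.
    order = sorted(adjacency, key=lambda v: (-sum(adjacency[v].values()), v))
    bank_of = {}
    for v in order:
        cost = [0] * num_banks
        for u, w in adjacency[v].items():
            if u in bank_of:
                cost[bank_of[u]] += w
        # Cheapest bank wins; ties go to the lowest bank index.
        bank_of[v] = min(range(num_banks), key=lambda b: (cost[b], b))
    return bank_of


def remaining_conflicts(weights, bank_of):
    """Total weight of edges whose endpoints share a bank."""
    return sum(w for (a, b), w in weights.items() if bank_of[a] == bank_of[b])
```

For the access groups `[(0, 1), (0, 1), (1, 2), (0, 2)]`, the total conflict weight is 4. Under this heuristic a single bank leaves all 4 conflicts, two banks leave 1, and three banks leave none, which shows the bank-count versus conflict tradeoff on a toy scale.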

References

[1] Y. Wang, P. Li, and J. Cong, "Theory and algorithm for generalized memory partitioning in high-level synthesis," in Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '14, (New York, NY, USA), pp. 199--208, ACM, 2014.
[2] A. Cilardo and L. Gallo, "Improving multibank memory access parallelism with lattice-based partitioning," ACM Trans. Archit. Code Optim., vol. 11, pp. 45:1--45:25, 2015.
[3] J. Escobedo and M. Lin, "Tessellating memory space for parallel access," in ASP-DAC, 2017.
[4] C. Meng, S. Yin, P. Ouyang, L. Liu, and S. Wei, "Efficient memory partitioning for parallel data access in multidimensional arrays," in 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1--6, June 2015.
[5] Y. Zhou, K. M. Al-Hawaj, and Z. Zhang, "A new approach to automatic memory banking using trace-based address mining," in Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, (New York, NY, USA), pp. 179--188, ACM, 2017.
[6] Y. Wang, P. Li, and J. Cong, "Theory and algorithm for generalized memory partitioning in high-level synthesis," in Proceedings of the 2014 ACM/SIGDA International Symposium on FPGA, pp. 199--208, 2014.
[7] N. Christofides, "An algorithm for the chromatic number of a graph," The Computer Journal, vol. 14 (1), pp. 38--39, 1971.
[8] A. Wigderson, "Improving the performance guarantee for approximate graph coloring," Journal of the Association for Computing Machinery, vol. 30, pp. 729--735, 1983.
[9] E.-K. E and E.-E. A, "Graph folding of some special graphs," Journal of Mathematics and Statistics, vol. 1, 01 2005.
[10] J. Xue, "Solving the minimum weighted integer coloring problem," Comput. Optim. Appl., vol. 11, pp. 53--64, Oct. 1998.

Published In

DAC '18: Proceedings of the 55th Annual Design Automation Conference
June 2018
1089 pages
ISBN:9781450357005
DOI:10.1145/3195970

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph coloring
  2. graph folding
  3. memory conflict reduction

Qualifiers

  • Research-article

Conference

DAC '18: The 55th Annual Design Automation Conference 2018
June 24 - 29, 2018
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
