DOI: 10.1145/2370816.2370879
Poster

Branch and data herding: reducing control and memory divergence for error-tolerant GPU applications

Published: 19 September 2012

Abstract

Control and memory divergence between threads in the same execution bundle, or warp, can significantly throttle the performance of GPU applications. We exploit the observation that many GPU applications exhibit error tolerance to propose branch and data herding. Branch herding eliminates control divergence by forcing all threads in a warp to take the same control path. Data herding eliminates memory divergence by forcing each thread in a warp to load from the same memory block. To safely and efficiently support branch and data herding, we propose a static analysis and compiler framework to prevent exceptions when control and data errors are introduced, a profiling framework that aims to maximize performance while maintaining acceptable output quality, and hardware optimizations to improve the performance benefits of exploiting error tolerance through branch and data herding. Our software implementation of branch herding on NVIDIA GeForce GTX 480 improves performance by up to 34% (13%, on average) for a suite of NVIDIA CUDA SDK and Parboil benchmarks. Our hardware implementation of branch herding improves performance by up to 55% (30%, on average). Data herding improves performance by up to 32% (25%, on average). Observed output quality degradation is minimal for several applications that exhibit error tolerance, especially for visual computing applications.
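
The two herding ideas described above can be illustrated with a small host-side simulation of one warp. This is a minimal sketch, not the authors' implementation: the abstract does not say how the common control path or memory block is chosen, so picking the majority outcome and the most-requested block is an assumption made here. The names `herd_branch`, `herd_addresses`, `count_blocks`, and the 128-byte `BLOCK_BYTES` are hypothetical and exist only for illustration.

```python
from collections import Counter

WARP_SIZE = 32     # threads per warp on NVIDIA GPUs
BLOCK_BYTES = 128  # assumed size of one memory block (illustrative value)


def herd_branch(predicates):
    """Give every thread the warp's majority branch outcome.

    Assumption: the majority decides; ties resolve to "taken".
    Threads in the minority follow the wrong path, which is the
    control error that an error-tolerant application must absorb.
    """
    taken = sum(1 for p in predicates if p)
    majority = taken * 2 >= len(predicates)
    return [majority] * len(predicates)


def herd_addresses(addresses, block_bytes=BLOCK_BYTES):
    """Redirect every load into the warp's most-requested memory block.

    Assumption: a redirected thread keeps its byte offset within the
    block, so it reads a wrong but valid, in-range location.
    """
    blocks = [addr // block_bytes for addr in addresses]
    target = Counter(blocks).most_common(1)[0][0]
    herded = []
    for addr, block in zip(addresses, blocks):
        if block == target:
            herded.append(addr)  # already in the common block
        else:
            herded.append(target * block_bytes + addr % block_bytes)
    return herded


def count_blocks(addresses, block_bytes=BLOCK_BYTES):
    """Number of distinct memory blocks the warp touches."""
    return len({addr // block_bytes for addr in addresses})


if __name__ == "__main__":
    # 20 threads want the branch taken, 12 do not: the warp diverges.
    predicates = [True] * 20 + [False] * 12
    print(set(herd_branch(predicates)))

    # 30 threads read block 0; two stragglers read distant blocks.
    addresses = [4 * i for i in range(30)] + [1000, 5000]
    print(count_blocks(addresses), count_blocks(herd_addresses(addresses)))
```

In this sketch a warp that would have touched three memory blocks touches one after herding, and a warp split 20/12 on a branch executes a single path. The cost is that the minority threads see a wrong branch outcome or wrong data, which is why the abstract restricts the technique to error-tolerant applications and pairs it with static analysis to prevent exceptions.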



    Published In

    PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
    September 2012
    512 pages
    ISBN:9781450311823
    DOI:10.1145/2370816

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. control divergence
    2. error tolerance
    3. gpgpu
    4. high performance
    5. memory divergence

    Qualifiers

    • Poster

    Conference

    PACT '12
    Sponsor:
    • IFIP WG 10.3
    • SIGARCH
    • IEEE CS TCPP
    • IEEE CS TCAA

    Acceptance Rates

    Overall Acceptance Rate 121 of 471 submissions, 26%
