skip to main content
10.1145/3635035.3635051acmotherconferencesArticle/Chapter ViewAbstractPublication PageshpcasiaConference Proceedingsconference-collections
research-article

Efficient GPU-Implementation of H-P Sort Based on Improved Histogram Computation

Published: 19 January 2024 Publication History

Abstract

We present an enhanced GPU implementation of the H-P sort algorithm, which is a widely used method for integer sorting based on histogram computation and prefix sum calculation. This work extends a previous high-performance GPU version of the algorithm, presented at the 2021 ICPP international conference. Through algorithmic refinements and optimizations, we demonstrate significant performance improvements compared to the conventional H-P sort. It achieves a maximum speedup of 3.45x over conventional H-P Sort. Furthermore, in scenarios where the H-P sort had previously lagged behind the fastest available library, our improved implementation now surpasses it in terms of execution speed. This advancement in GPU-based sorting techniques contributes to more efficient data processing in various computational applications, emphasizing the practical significance of this work.

References

[1]
Guy E. Blelloch. 1990. Prefix Sums and Their Applications. Technical Report CMU-CS-90-190. School of Computer Science, Carnegie Mellon University.
[2]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. 2010. Introduction to Algorithms Third Edition. PHI Learning.
[3]
S. C. Eisenstat. 2007. O(log *n) algorithms on a Sum-CRCW PRAM. Computing 79 (2007), 93–97.
[4]
N. Faujdar and S. Ghrera. 2016. Performance evaluation of parallel count sort using GPU computing with CUDA. Indian Journal of Science and Technoogy 9, 15 (2016), 1–12.
[5]
J. Gómez-Luna, J.M. González-Linares, J.I. Benavides, and et al.2013. An optimized approach to histogram computation on GPU. Machine Vision and Applications 24 (2013), 899–908.
[6]
T. Henriksen, S. Hellfritzsch, P. Sadayappan, and C. Oancea. 2020. Compiling Generalized Histograms for GPU. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. 1–14.
[7]
D. E. Knuth. 1998. The art of computer programming, vol.3: searching and sorting (second ed.). Addison-Wesley Professional.
[8]
V. Kolonias, A. G. Voyiatzis, G. Goulas, and E. Housos. 2011. Design and implementation of an efficient integer count sort in CUDA GPUs. Concurrency and Computation: Practice and Experience 23 (2011), 2365–2381.
[9]
S. Koppaka, D. Mudigere, S. Narasimhan, and B. Narayanan. 2010. Fast Histograms using Adaptive CUDA Streams. arXiv:1011.0235 [cs.DC] (2010).
[10]
S. Kozakai, N. Fujimoto, and K. Wada. 2021. Efficient GPU-implementation for integer sorting based on histogram and prefix-sums. In Proceedings of the 50th International Conference on Parallel Processing (ICPP). 1–11.
[11]
V. Podlozhnyuk. 2007. Histogram Calculation in CUDA. (2007).
[12]
R. Shams and R. A. Kennedy. 2007. Efficient histogram algorithms for NVIDIA CUDA compatible devices. In Proc. Int. Conf. on Signal Processing and Communications Systems (ICSPCS). 418–422.
[13]
W. Sum and Z. Ma. 2009. Count sort for GPU computing. In Proceedings of 2009 15th International Conferece on Parallel and Distributed Systems (ICPDS). 919–924.
[14]
J. Svenningsson, B. J. Svensson, and M. Sheeran. 2013. Counting and occurrence sort for GPUs using an embedded language. In Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing, FHPC’13. 37–46.
[15]
M. Ugljesa, G. Isaac, P. Nikola, R. Alex, and T. Milo. 2013. Parallelizing general histogram application for CUDA architectures. In 2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS). 11–18.
[16]
K. K. Yong and S. S. O. Talib. 2016. Histogram optimization with CUDA. In 2016 IEEE Industrial Electronics and Applications Conference (IEACon). 312–318.

Index Terms

  1. Efficient GPU-Implementation of H-P Sort Based on Improved Histogram Computation
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    HPCAsia '24: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region
    January 2024
    185 pages
    ISBN:9798400708893
    DOI:10.1145/3635035
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 19 January 2024

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. GPU
    2. H-P sort
    3. Histogram
    4. Integer sorting
    5. Prefix sum
    6. parallel sorting

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    HPCAsia 2024

    Acceptance Rates

    Overall Acceptance Rate 69 of 143 submissions, 48%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 61
      Total Downloads
    • Downloads (Last 12 months)61
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 07 Nov 2024

    Other Metrics

    Citations

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media