Floating-point data compression at 75 Gb/s on a GPU
MA O'Neil, M Burtscher - Proceedings of the Fourth Workshop on General Purpose Processing on Graphics …, 2011 - dl.acm.org
Numeric simulations often generate large amounts of data that need to be stored or sent to other compute nodes. This paper investigates whether GPUs are powerful enough to make real-time data compression and decompression possible in such environments, that is, whether they can operate at the 32- or 40-Gb/s throughput of emerging network cards. The fastest parallel CPU-based floating-point data compression algorithm operates below 20 Gb/s on eight Xeon cores, which is significantly slower than the network speed and thus insufficient for compression to be practical in high-end networks. As a remedy, we have created the highly parallel GFC compression algorithm for double-precision floating-point data. This algorithm is specifically designed for GPUs. It compresses at a minimum of 75 Gb/s, decompresses at 90 Gb/s and above, and can therefore improve internode communication throughput on current and upcoming networks by fully saturating the interconnection links with compressed data.
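
The saturation argument in the abstract can be checked with simple arithmetic. The following is a minimal sketch, not code from the paper: the function name, the `ratio` parameter, and the example compression ratios are hypothetical, and it assumes the quoted 75 and 90 Gb/s figures are measured on uncompressed data.

```python
def effective_throughput_gbps(link_gbps, ratio, compress_gbps, decompress_gbps):
    """Rate at which original (uncompressed) data arrives end to end, in Gb/s.

    ratio is uncompressed size / compressed size. It is a hypothetical input
    here; the abstract does not state the ratios GFC achieves.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # The link carries compressed bits, so each second it moves
    # link_gbps * ratio gigabits' worth of original data.
    link_limit = link_gbps * ratio
    # Assumption: compressor and decompressor rates refer to uncompressed data.
    # The slowest of the three pipeline stages bounds the end-to-end rate.
    return min(link_limit, compress_gbps, decompress_gbps)


# 40-Gb/s link with the abstract's minimum rates and illustrative ratios.
for ratio in (1.0, 1.5, 2.0):
    rate = effective_throughput_gbps(40, ratio, 75, 90)
    print(f"ratio {ratio}: {rate} Gb/s of original data")
```

Under these assumptions the link stays the bottleneck, and is therefore fully used, whenever the ratio is below 75/40 = 1.875. Above that ratio the compressor becomes the limit.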