research-article

Concurrent analytical query processing with GPUs

Authors:

Xiaodong Zhang

Proceedings of the VLDB Endowment, Volume 7, Issue 11

Pages 1011 - 1022

https://doi.org/10.14778/2732967.2732976

Published: 01 July 2014

Abstract

In current databases, GPUs are used as dedicated accelerators that process one query at a time. Sharing GPUs among concurrent queries is not supported, which causes serious resource underutilization. By profiling an open-source GPU query engine running commonly used single-query data warehousing workloads, we observe that the utilization of the main GPU resources reaches at most 25%. This underutilization leads to low system throughput.

To address the problem, this paper proposes concurrent query execution as an effective solution. To share GPUs efficiently among concurrent queries for high throughput, the major challenge is to provide software support that controls and resolves the resource contention caused by sharing. Our solution relies on GPU query scheduling and device memory swapping policies to address this challenge. We have implemented a prototype system and evaluated it extensively. The experimental results confirm the effectiveness and performance advantage of our approach. By executing multiple GPU queries concurrently, system throughput can be improved by up to 55% compared with dedicated processing.
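The abstract names device memory swapping as one of the two mechanisms but does not describe the policy. As a rough illustration only, the sketch below models device memory shared by several queries: when a new query's working set does not fit, the data of other queries is moved to host memory until it does. The class name `DeviceMemory`, the least-recently-used victim choice, and the sizes in megabytes are all assumptions made for this example; they are not taken from the paper.

```python
from collections import OrderedDict


class DeviceMemory:
    """Toy model of GPU device memory shared by concurrent queries.

    Hypothetical illustration: the LRU eviction rule and every name here
    are assumptions, not the swapping policy described in the paper.
    """

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.resident = OrderedDict()  # query id -> MB on device, oldest first
        self.swapped = {}              # query id -> MB held in host memory

    def used_mb(self):
        return sum(self.resident.values())

    def acquire(self, query_id, size_mb):
        """Make size_mb resident for query_id; return the queries swapped out."""
        if size_mb > self.capacity_mb:
            raise MemoryError(f"{query_id} needs more than the whole device")
        # Forget any earlier placement of this query before re-admitting it.
        self.swapped.pop(query_id, None)
        self.resident.pop(query_id, None)
        evicted = []
        while self.used_mb() + size_mb > self.capacity_mb:
            # Swap out the least recently admitted query's data to the host.
            victim, victim_mb = self.resident.popitem(last=False)
            self.swapped[victim] = victim_mb
            evicted.append(victim)
        self.resident[query_id] = size_mb  # most recently used goes last
        return evicted

    def release(self, query_id):
        """Drop all state for a finished query."""
        self.resident.pop(query_id, None)
        self.swapped.pop(query_id, None)
```

With a 1000 MB device, admitting queries of 600 MB and 300 MB succeeds without eviction, while a third query of 500 MB forces the oldest one out to host memory. A real system would also have to account for transfer cost and for queries whose kernels are still running, which this sketch ignores.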

Concurrent analytical query processing with GPUs
1. Information systems
  1. Data management systems
    1. Database management system engines
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory

Published In

Proceedings of the VLDB Endowment, Volume 7, Issue 11

July 2014

92 pages

ISSN: 2150-8097

Editors: H. V. Jagadish (University of Michigan) and Aoying Zhou (East China Normal University, China)

Publisher

VLDB Endowment

Bibliometrics

Total Citations: 55
Total Downloads: 346
Downloads (last 12 months): 30
Downloads (last 6 weeks): 2

Reflects downloads up to 28 Jan 2025

Citations

Cited By

Wang F, Lee R, Teng D, Zhang X, Saltz J (2024). High-Performance Spatial Data Analytics: Systematic R&D for Scale-Out and Scale-Up Solutions from the Past to Now. Proceedings of the VLDB Endowment 17:12 (4507-4520). DOI: 10.14778/3685800.3685912. Online publication date: 8-Nov-2024
https://dl.acm.org/doi/10.14778/3685800.3685912
Zhang Y, Zhang F, Li H, Zhang S, Guo X, Chen Y, Pan A, Du X (2024). Data-Aware Adaptive Compression for Stream Processing. IEEE Transactions on Knowledge and Data Engineering 36:9 (4531-4549). DOI: 10.1109/TKDE.2024.3377710. Online publication date: 19-Mar-2024
https://dl.acm.org/doi/10.1109/TKDE.2024.3377710
Henneberg J, Schuhknecht F (2023). RTIndeX: Exploiting Hardware-Accelerated GPU Raytracing for Database Indexing. Proceedings of the VLDB Endowment 16:13 (4268-4281). DOI: 10.14778/3625054.3625063. Online publication date: 1-Sep-2023
https://dl.acm.org/doi/10.14778/3625054.3625063
Nie X, Liu Y, Fu F, Xue J, Jiao D, Miao X, Tao Y, Cui B (2023). Angel-PTM: A Scalable and Economical Large-Scale Pre-Training System in Tencent. Proceedings of the VLDB Endowment 16:12 (3781-3794). DOI: 10.14778/3611540.3611564. Online publication date: 1-Aug-2023
https://dl.acm.org/doi/10.14778/3611540.3611564
McCoy H, Hofmeyr S, Yelick K, Pandey P, Dehnavi M, Kulkarni M, Krishnamoorthy S (2023). High-Performance Filters for GPUs. Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (160-173). DOI: 10.1145/3572848.3577507. Online publication date: 25-Feb-2023
https://dl.acm.org/doi/10.1145/3572848.3577507
Chai C, Wang J, Luo Y, Niu Z, Li G (2023). Data Management for Machine Learning: A Survey. IEEE Transactions on Knowledge and Data Engineering 35:5 (4646-4667). DOI: 10.1109/TKDE.2022.3148237. Online publication date: 1-May-2023
https://dl.acm.org/doi/10.1109/TKDE.2022.3148237
Sun Z, Li Z, Weng C (2023). Co-Utilizing SIMD and Scalar to Accelerate the Data Analytics Workloads. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (637-649). DOI: 10.1109/ICDE55515.2023.00387. Online publication date: May-2023
https://doi.org/10.1109/ICDE55515.2023.00387
Saied-Walker J, Gupta P, Yi R, Marchal N, Skubic M, Scott G (2023). GPU-accelerated PostgreSQL for Scalable Management and Processing of Irregular Time-Series Data using SPI. 2023 IEEE International Conference on Big Data (BigData) (2541-2549). DOI: 10.1109/BigData59044.2023.10386390. Online publication date: 15-Dec-2023
https://doi.org/10.1109/BigData59044.2023.10386390
An J, Suh Y, Tak B (2022). Workload-Driven Analysis on the Performance Characteristics of GPU-Accelerated DBMSes. IEICE Transactions on Information and Systems E105.D:11 (1984-1989). DOI: 10.1587/transinf.2022EDL8008. Online publication date: 1-Nov-2022
https://doi.org/10.1587/transinf.2022EDL8008
Yogatama B, Gong W, Yu X (2022). Orchestrating data placement and query execution in heterogeneous CPU-GPU DBMS. Proceedings of the VLDB Endowment 15:11 (2491-2503). DOI: 10.14778/3551793.3551809. Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.14778/3551793.3551809
