research-article

Springald: GPU-Accelerated Window-Based Aggregates Over Out-of-Order Data Streams

Published: 01 September 2024

Abstract

An increasing number of application domains require high-throughput processing to extract insights from massive data streams. The Data Stream Processing (DSP) paradigm provides formal approaches for analyzing structured data streams, which are treated as special, unbounded relations. The most widely used class of stateful operators in DSP performs sliding-window aggregation, which continuously extracts insights from the most recent portion of the stream. This article presents Springald, an efficient sliding-window operator that leverages GPU devices. Springald, incorporated in the WindFlow parallel library, processes out-of-order data streams using watermark propagation. These two features, GPU processing and support for out-of-order streams, make Springald a novel contribution to this research area. This article describes the methodology behind Springald, together with its design and implementation. We also provide an extensive experimental evaluation to understand the behavior of Springald in depth, and we show that it outperforms state-of-the-art competitors.
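
To make the two central notions of the abstract concrete, the following is a minimal CPU-only sketch of time-based sliding-window aggregation over out-of-order tuples, with results released by watermarks. It is not Springald's GPU algorithm and does not use the WindFlow API; the class name `SlidingSum`, its methods, and the policy of dropping contributions to already-emitted windows are assumptions made purely for illustration.

```python
from collections import defaultdict


class SlidingSum:
    """Hypothetical sketch: time-based sliding-window sum, fired by watermarks.

    Window i covers timestamps in [i * slide, i * slide + length).
    A watermark w is taken as a promise that no later tuple has timestamp < w.
    """

    def __init__(self, length, slide):
        self.length = length
        self.slide = slide
        self.partials = defaultdict(int)  # window index -> running sum
        self.next_to_fire = 0             # lowest window index not yet emitted

    def _windows_of(self, ts):
        # Indices of all windows whose time range contains ts.
        last = ts // self.slide
        first = max(0, (ts - self.length) // self.slide + 1)
        return range(first, last + 1)

    def on_tuple(self, ts, value):
        # Tuples may arrive in any timestamp order; each one updates every
        # still-open window it belongs to. Contributions to windows that were
        # already emitted are dropped (an assumed lateness policy).
        for i in self._windows_of(ts):
            if i >= self.next_to_fire:
                self.partials[i] += value

    def on_watermark(self, wm):
        # Emit, in order, every window that ends at or before the watermark.
        # Windows that received no tuples are emitted with sum 0.
        fired = []
        while self.next_to_fire * self.slide + self.length <= wm:
            i = self.next_to_fire
            start = i * self.slide
            fired.append((start, start + self.length, self.partials.pop(i, 0)))
            self.next_to_fire += 1
        return fired


def demo():
    op = SlidingSum(length=10, slide=5)
    for ts, value in [(7, 1), (2, 2), (12, 4)]:  # timestamps out of order
        op.on_tuple(ts, value)
    first = op.on_watermark(10)   # closes window [0, 10)
    op.on_tuple(4, 8)             # late: window [0, 10) was already emitted
    second = op.on_watermark(20)  # closes windows [5, 15) and [10, 20)
    return first, second
```

In this sketch, `demo()` returns `([(0, 10, 3)], [(5, 15, 5), (10, 20, 4)])`: the tuple with timestamp 2 arrives after the one with timestamp 7 yet still counts toward window [0, 10), while the tuple with timestamp 4 arrives after that window was emitted and is ignored.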

Published In

IEEE Transactions on Parallel and Distributed Systems, Volume 35, Issue 9
Sept. 2024
166 pages

Publisher

IEEE Press
