DOI: 10.1145/2588555.2593681
research-article

A software-defined networking based approach for performance management of analytical queries on distributed data stores

Published: 18 June 2014

Abstract

Nowadays, data analytics applications access more and more data from distributed data stores, creating a large amount of data traffic on the network. Distributed analytic queries are therefore prone to poor performance when they encounter network contention, which can be quite common in a shared network. Typical distributed query optimizers have no way to solve this problem because they treat the network as a black box: they are unable to monitor it, let alone control it. With the new era of software-defined networking (SDN), we show how SDN can be effectively exploited for performance management of analytical queries in distributed data store environments. More specifically, we present a group of methods that leverage SDN's visibility into and control of the network's state, enabling distributed query processors to achieve performance improvements and differentiation for analytical queries. We demonstrate the effectiveness of the methods through detailed experimental studies on a system running on a software-defined network with commercial switches. To the best of our knowledge, this is the first work to analyze and show the opportunities of SDN for distributed query optimization. It is our hope that this will open up a rich area of research and technology development in distributed data-intensive computing.

    Published In

    SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
    June 2014
    1645 pages
    ISBN:9781450323765
    DOI:10.1145/2588555
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. analytical queries
    2. distributed data stores
    3. software-defined networking

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'14

    Acceptance Rates

    SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%
