Distributed and Streaming Linear Programming in Low Dimensions

Published: 25 June 2019

Abstract

We study linear programming and general LP-type problems in several big data (streaming and distributed) models. We focus mainly on low-dimensional problems, in which the number of constraints is much larger than the number of variables. Low-dimensional LP-type problems appear frequently in machine learning tasks such as robust regression, support vector machines, and core vector machines. As supporting large-scale machine learning queries in database systems has become an important direction for database research, efficient algorithms for low-dimensional LP-type problems on massive datasets are of great value. In this paper we give both upper and lower bounds for LP-type problems in distributed and streaming models. Our bounds are almost tight when the dimensionality of the problem is a fixed constant.

Published In

PODS '19: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2019
494 pages
ISBN: 9781450362276
DOI: 10.1145/3294052

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. distributed algorithms
  2. linear programming
  3. streaming algorithms

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '19: International Conference on Management of Data
June 30 - July 5, 2019
Amsterdam, Netherlands

Acceptance Rates

PODS '19 Paper Acceptance Rate 29 of 87 submissions, 33%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%
