research-article
Open access

Principled Evaluation of Differentially Private Algorithms using DPBench

Published: 14 June 2016

Editorial Notes

Computationally Replicable. The experimental results of this paper were replicated by a SIGMOD Review Committee and were found to support the central results reported in the paper.

Abstract

Differential privacy has become the dominant standard in the research community for strong privacy protection. There has been a flood of research into query answering algorithms that meet this standard. Algorithms are becoming increasingly complex, and in particular, the performance of many emerging algorithms is data dependent, meaning the distribution of the noise added to query answers may change depending on the input data. Theoretical analysis typically only considers the worst case, making empirical study of average case performance increasingly important. In this paper we propose a set of evaluation principles which we argue are essential for sound evaluation. Based on these principles we propose DPBench, a novel evaluation framework for standardized evaluation of privacy algorithms. We then apply our benchmark to evaluate algorithms for answering 1- and 2-dimensional range queries. The result is a thorough empirical study of 15 published algorithms on a total of 27 datasets that offers new insights into algorithm behavior (in particular the influence of dataset scale and shape) and a more complete characterization of the state of the art. Our methodology is able to resolve inconsistencies in prior empirical studies and place algorithm performance in context through comparison to simple baselines. Finally, we pose open research questions which we hope will guide future algorithm design.
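To make the idea of a simple, data-independent baseline concrete, the sketch below adds Laplace noise to every cell of a 1-dimensional histogram and measures the mean absolute error on a set of range queries, once for a uniform dataset and once for a heavily skewed one of the same scale. It is an illustration only, not DPBench code: the function names, the query workload, the two toy datasets, and the assumption that neighboring datasets differ by one record (so each cell has sensitivity 1) are all hypothetical choices made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng):
    # Assumption: adding or removing one record changes one cell by 1,
    # so the noise scale per cell is 1 / epsilon.
    return [c + laplace_noise(1.0 / epsilon, rng) for c in counts]


def range_query(hist, lo, hi):
    # Sum of the cells lo..hi, inclusive.
    return sum(hist[lo:hi + 1])


def mean_abs_error(counts, queries, epsilon, trials, rng):
    # Average absolute error over all queries and all random trials.
    total = 0.0
    for _ in range(trials):
        noisy = noisy_histogram(counts, epsilon, rng)
        for lo, hi in queries:
            total += abs(range_query(noisy, lo, hi) - range_query(counts, lo, hi))
    return total / (trials * len(queries))


rng = random.Random(0)
n = 64
uniform = [10] * n               # same scale (640 records), flat shape
skewed = [640] + [0] * (n - 1)   # same scale, all mass in one cell
queries = [(lo, min(lo + 15, n - 1)) for lo in range(0, n, 8)]

err_uniform = mean_abs_error(uniform, queries, epsilon=1.0, trials=200, rng=rng)
err_skewed = mean_abs_error(skewed, queries, epsilon=1.0, trials=200, rng=rng)
print(f"uniform: {err_uniform:.2f}  skewed: {err_skewed:.2f}")
```

Because the noise here never looks at the data, the two error estimates should agree up to sampling variation. A data-dependent algorithm evaluated the same way could show very different errors on the two shapes, which is the kind of effect an empirical benchmark is meant to expose.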

Supplementary Material

ReadMe (readme.txt)
Rights information
Reproducibility (reprozip_mod392_release.zip)
Quick Start Script


Published In

SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016
2300 pages
ISBN:9781450335317
DOI:10.1145/2882903
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication Notes

Badge change: Article originally badged under Version 1.0 guidelines https://www.acm.org/publications/policies/artifact-review-badging

Author Tags

  1. algorithm evaluation
  2. differential privacy
  3. privacy

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
