Secure statistical databases with random sample queries

Published: 01 September 1980

Abstract

A new inference control, called random sample queries, is proposed for safeguarding confidential data in on-line statistical databases. The random sample queries control deals directly with the basic principle of compromise by making it impossible for a questioner to control precisely the formation of query sets. Queries for relative frequencies and averages are computed using random samples drawn from the query sets. The sampling strategy permits the release of accurate and timely statistics and can be implemented at very low cost. Analysis shows the relative error in the statistics decreases as the query set size increases; in contrast, the effort required to compromise increases with the query set size due to large absolute errors. Experiments performed on a simulated database support the analysis.
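The idea of answering from a random sample of the query set, rather than from the query set itself, can be illustrated with a minimal sketch. The abstract does not specify how records are selected, so everything here is an assumption made for illustration: the keyed-hash selection rule, the sampling probability `p`, the `secret` value, and the function names `in_sample`, `sampled_average`, and `sampled_relative_frequency`. It should not be read as the paper's actual mechanism.

```python
import hashlib


def in_sample(record_id, query_key, p, secret=b"demo-secret"):
    """Decide whether a record joins the sample for a given query.

    Assumption: the decision is a keyed hash of (query, record), so the
    questioner cannot predict or control which records are kept, and
    repeating the same query yields the same sample.
    """
    message = secret + f"{query_key}|{record_id}".encode()
    digest = hashlib.sha256(message).digest()
    # Use the top 53 bits so the value is an exact float in [0, 1).
    u = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return u < p


def sampled_average(records, predicate, value, query_key, p=0.9):
    """Average of value(r) over a random sample of the query set."""
    query_set = [r for r in records if predicate(r)]
    sample = [r for r in query_set if in_sample(r["id"], query_key, p)]
    if not sample:
        return None  # nothing is released when the sample is empty
    return sum(value(r) for r in sample) / len(sample)


def sampled_relative_frequency(records, predicate, query_key, p=0.9):
    """Estimate |query set| / |database| from the sampled query set."""
    sample_count = sum(
        1 for r in records
        if predicate(r) and in_sample(r["id"], query_key, p)
    )
    # Dividing by p rescales the sample count to the full query set.
    return sample_count / (p * len(records))
```

With `p=1.0` every record is kept and both functions return the exact statistic; lowering `p` introduces the sampling error that the abstract describes as small in relative terms for large query sets.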


Published In

ACM Transactions on Database Systems, Volume 5, Issue 3
Sept. 1980
142 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/320613

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. confidentiality
  2. database security
  3. disclosure controls
  4. sampling
  5. statistical database

Qualifiers

  • Article
