DOI: 10.1145/1390156.1390167

Actively learning level-sets of composite functions

Published: 05 July 2008

Abstract

Scientists frequently have multiple types of experiments and data sets on which they can test the validity of their parameterized models and locate plausible regions for the model parameters. By examining multiple data sets, scientists can obtain inferences that are typically much more informative than the deductions derived from each of the data sources independently. Several standard data combination techniques result in target functions that are a weighted sum of the observed data sources. Thus, computing constraints on the plausible regions of the model parameter space can be formulated as finding a level set of a target function which is the sum of observable functions. We propose an active learning algorithm for this problem which selects both a sample (from the parameter space) and an observable function to evaluate at that sample. Empirical tests on synthetic functions and on real data for an eight-parameter cosmological model show that our algorithm significantly reduces the number of samples required to identify the desired level set.
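
To make the setup concrete, below is a minimal, hypothetical Python sketch of this style of procedure; it is not the paper's algorithm. Each observable function gets its own Gaussian process surrogate, the composite target's mean and variance are taken as sums of the per-observable means and variances (assuming independent surrogates), a straddle-style acquisition score picks the next parameter-space point near the level set, and a simple largest-predictive-variance heuristic, assumed here only for illustration, picks which observable to evaluate there. The observables, kernel, query budget, and level are all placeholders.

# A minimal, hypothetical sketch (not the algorithm from the paper): active
# level-set search for a target that is the sum of two observable functions,
# choosing at each step both a point x and which observable to evaluate there.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Placeholder observables; the target is their sum, and we seek the level set
# {x : f1(x) + f2(x) = level} over a 1-D parameter space.
observables = [lambda x: np.sin(3 * x), lambda x: 0.5 * x ** 2]
level = 1.0
grid = np.linspace(-2.0, 2.0, 401).reshape(-1, 1)   # candidate parameter values

# A few initial evaluations of each observable.
data = []
for f in observables:
    X0 = rng.uniform(-2.0, 2.0, size=(3, 1))
    data.append((X0, f(X0).ravel()))

def fit_gp(X, y):
    # Independent GP surrogate per observable (RBF kernel is an assumption).
    return GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6).fit(X, y)

for _ in range(20):                                   # illustrative query budget
    preds = [fit_gp(X, y).predict(grid, return_std=True) for X, y in data]
    mu = sum(m for m, _ in preds)                     # mean of the composite target
    sigma = np.sqrt(sum(s ** 2 for _, s in preds))    # std of the sum (independence assumed)

    # Straddle-style score: favour points that are uncertain and near the level.
    score = 1.96 * sigma - np.abs(mu - level)
    best = int(np.argmax(score))
    x_star = grid[best:best + 1]

    # Heuristic (assumed, not from the paper): query the observable whose
    # surrogate is most uncertain at the chosen point.
    j = int(np.argmax([preds[k][1][best] for k in range(len(observables))]))
    X, y = data[j]
    data[j] = (np.vstack([X, x_star]), np.append(y, observables[j](x_star).ravel()))

# Report grid points whose estimated composite value is near the level.
mu = sum(fit_gp(X, y).predict(grid) for X, y in data)
print("approximate level-set points:", grid[np.abs(mu - level) < 0.05].ravel())

The paper's contribution is the criterion used to choose the point and the observable; the acquisition score and variance heuristic above merely stand in for it and are meant only to show how per-observable surrogates combine into a surrogate for the summed target.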


Published In

ICML '08: Proceedings of the 25th International Conference on Machine Learning
July 2008, 1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research
  • Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

ICML '08
Sponsors:
  • Microsoft Research
  • Intel
  • IBM

Acceptance Rates

Overall Acceptance Rate: 140 of 548 submissions, 26%
