Bayesian visual analytics: BaVA

Published: 01 February 2015

Abstract

Leman et al. and Endert et al. develop an interactive data visualization framework called visual to parametric interaction (V2PI). With V2PI, experts may explore data visually and assess multiple data visualizations based on their judgments and an underlying data analytic method. Specifically, V2PI offers a deterministic procedure to quantify expert judgments and update analytical parameters to create new data visualizations. In this article, we explain V2PI from a probabilistic perspective and develop Bayesian visual analytics (BaVA). We model data probabilistically, develop parallels between quantifying expert judgments and eliciting prior distributions from experts, and justify how we update parameters using Bayesian sequential updating. We apply BaVA using two linear projection methods to assess simulated and real-world datasets.
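As a generic illustration of Bayesian sequential updating, and not the BaVA model itself, the sketch below uses a textbook conjugate normal model with known noise variance. The class `NormalBelief`, the functions `update` and `update_batch`, and all numeric values are hypothetical and chosen only for this example. It shows that feeding in observations one at a time, with each posterior serving as the next prior, gives the same result as a single update on all observations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalBelief:
    """Normal belief about an unknown mean theta (hypothetical helper)."""
    mean: float
    var: float


def update(belief: NormalBelief, y: float, noise_var: float) -> NormalBelief:
    """One conjugate update for y ~ N(theta, noise_var), theta ~ N(mean, var)."""
    precision = 1.0 / belief.var + 1.0 / noise_var
    mean = (belief.mean / belief.var + y / noise_var) / precision
    return NormalBelief(mean, 1.0 / precision)


def update_batch(belief: NormalBelief, ys: list[float], noise_var: float) -> NormalBelief:
    """Single update on all observations at once, for comparison."""
    precision = 1.0 / belief.var + len(ys) / noise_var
    mean = (belief.mean / belief.var + sum(ys) / noise_var) / precision
    return NormalBelief(mean, 1.0 / precision)


# Illustrative values only (assumptions, not taken from the article).
prior = NormalBelief(mean=0.0, var=4.0)
observations = [1.2, 0.7, 1.9]
noise_var = 1.0

# Sequential updating: each posterior becomes the prior for the next observation.
sequential = prior
for y in observations:
    sequential = update(sequential, y, noise_var)

batch = update_batch(prior, observations, noise_var)
print(sequential)
print(batch)
```

In this toy setting the two posteriors agree up to floating-point error, which is the property that makes it coherent to fold in new information one round at a time.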

References

[1]
S. C.Leman, L.House, D.Maiti, A.Endert, C.North. Visual to parametric interactions (V2PI). PLoS One 2013; Volume 8 Issue 3: pp.1-12.
[2]
A.Endert, C.Han, D.Maiti, L.House, S.Leman, C.North. Observation-level interaction with statistical models for visual analytics. In Visual Analytics Science and Technology (VAST), 2011 IEEE Conference, Providence, RI, 2011, pp.121-130.
[3]
K.Pearson. On lines and planes of closest fit to systems of points in space. Philos Mag 1901; Volume 6 Issue 2: pp.559-572.
[4]
I.Jolliffe 2nd ed. John Wiley and Sons: New York, NY; 2002.
[5]
A.Torokhti, S.Friedland. Towards theory of generic principal component analysis. J Multivar Anal 2009; Volume 100 Issue 4: pp.661-669.
[6]
D. J.Spiegelhalter, S. L.Lauritzen. Sequential updating of conditional probabilities on directed graphical structures. Networks 1990; Volume 20: pp.579-605.
[7]
M.West, J.Harrison. Springer-Verlag Inc: New York, NY; 1997.
[8]
R.Buxton. The interpretation and justification of the subjective Bayesian approach to statistical inference. Br J Philos Sci 1978; Volume 29: pp.25-38.
[9]
M.Goldstein. Subjective Bayesian analysis: principles and practice. Bayesian Anal 2006; Volume 1 Issue 3: pp.403-420.
[10]
P. H.Garthwaite, J. B.Kadane, A.O'Hagan. Statistical methods for eliciting probability distributions. J Am Stat Assoc 2005; Volume 100 Issue 470: pp.680-701.
[11]
J. B.Kadane, L. J.Wolfson. Experiences in elicitation. Statistician 1998; Volume 47 Issue 1: pp.3-19.
[12]
A.Daneshkhah. Psychological Aspects Influencing Elicitation of Subjective Probability, Technical Report, BEEP Report, University of Sheffield UK, 2004.
[13]
M. E.Tipping, C. M.Bishop. Probabilistic principal component analysis. J R Stat Soc Ser B: Stat Methodol 1999; Volume 61: pp.611-622.
[14]
J. D.Carroll, J. J.Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika 1970; Volume 35: pp.238-319.
[15]
S. S.Schiffman, M. L.Reynolds, F. W.Young. Academic Press: New York; 1981.
[16]
L.Breiman, J.Friedman, R.Olshen, C.Stone. Wadsworth: Belmont, CA; 1984.
[17]
J. R.Jensen. Prentice Hall, Inc.: Englewood, NJ; 1996.
[18]
A.Strehl, J.Ghosh. Relationship-based clustering and visualization for high-dimensional data mining. INFORMS J Comput 2003; Volume 15: pp.208-230.
[19]
T.Hastie, R.Tibshirani, J.Friedman 2nd ed. Springer-Verlag: New York, NY; 2008.
[20]
D.Keim, G.Andrienko, J.-D.Fekete, C.Görg, J.Kohlhammer, G.Melancon. 4950, Visual Analytics: Definition, Process, and Challenges. Springer: Berlin, Heidelberg; 2008; pp.1611-3349.
[21]
I.Icke, E.Sklar. Visual Analytics: A Multifaceted Overview, Technical Report, City University of New York, 2009.
[22]
C.Bishop, M.Svensén, C.Williams. GTM: Generative topographic mapping. Neural Comput 1998; Volume 10 Issue 1: pp.215-234.
[23]
E.Tufte. Graphics Press: Cheshire; 1983.
[24]
D. F.Swayne, D. T.Lang, A.Buja, D.Cook. GGobi: Evolving from XGobi into an extensible framework for interactive data visualization. Comput Stat Data Anal 2003; Volume 43 Issue 4: pp.423-444.
[25]
D.Cook, D. F.Swayne. Interactive and dynamic graphics for data analysis with R and GGobi. Amstat News 2007; Volume 364: pp.26-26.
[26]
D.Asimov. The grand tour: a tool for viewing multidimensional data. SIAM J Sci Stat Comput 1985; Volume 6 Issue 1: pp.128-143.
[27]
J. H.Friedman, J. W.Tukey. A projection pursuit algorithm for exploratory data analysis. IEEE Trans Comp 1974; Volume 23 Issue 9: pp.881-890.
[28]
G.Leban, B.Zupan, G.Vidmar, I.Bratko. Vizrank: Data visualization guided by machine learning. Data Mining Knowl Discov 2006; Volume 13: pp.119-136.
[29]
M.Goldstein, D.Woof. John Wiley and Sons Ltd: West Sussex; 2007.
[30]
J.Thomas, K.Cook, eds. National Visualization and Analytics Center: Richland, WA; 2005.
[31]
M. E.Tipping, C. M.Bishop. Mixtures of probabilistic principal component analyzers. Neural Comput 1999; Volume 11: pp.443-482.
[32]
M. B.Eisen, P. T.Spellman, P. O.Brown, D.Botstein. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci 1998; Volume 95: pp.14863-14868.
[33]
M. P. S.Brown, W. N.Grundy, D.Lin, C. W.Sugnet, T. S.Furey, M.Ares, D.Haussler. Knowledge-based analysis of microarray gene expression data by using support vector machines, vol. 97, 2000, pp.262-267.
[34]
O.Chapelle, B.Schölkopf, A.Zien. The MIT Press: Cambridge, MA; 2006.
[35]
J.MacInnes, S.Santosa, W.Wright. Visual classification: expert knowledge guides machine learning. IEEE Comput Graph Appl 2010; Volume 30 Issue 1: pp.8-14.
[36]
C.Meek, B.Thiesson, D.Heckerman. The learning-curve sampling method applied to model-based clustering. J Mach Learn Res 2002; Volume 2: pp.397-418.
[37]
UCI Machine Learning Repository. US census data 1990 data set, 1990. "http://archive.ics.uci.edu/ml/datasets/US+Census+Data+".
[38]
M.-S.Oh, A. E.Raftery. Bayesian multidimensional scaling and choice of dimension. J Am Stat Assoc 2001; Volume 96: pp.1031-1044.
[39]
M.Avrie. Prentice-Hall: Englewood, NJ; 1976.
[40]
A.Endert, P.Fiaux, C.North. Semantic interaction for sensemaking: inferring analytical reasoning for model steering. IEEE Trans Visual Comp Graph 2012; Volume 18 Issue 12: pp.2879-2888.
[41]
X.Hu, L.Bradel, D.Maiti, L.House, C.North, S.Leman. Semantics of directly manipulating spatializations. IEEE Trans Visual Comp Graph 2013; Volume 19 Issue 12: pp.2052-2059.
[42]
C.Han, L.House, S.Leman. Expert-Guided Generative Topographic Mapping with Visual to Parametric Interaction, Technical Report, Virginia Tech, 2014.

Published In

Statistical Analysis and Data Mining  Volume 8, Issue 1
February 2015
63 pages
ISSN:1932-1864
EISSN:1932-1872

Publisher

John Wiley & Sons, Inc.

United States

Author Tags

  1. Bayesian
  2. Sense-making
  3. data mining
  4. elicitation
  5. high-dimensional data
  6. sequential updating
  7. statistical visualization
  8. visual analytics

Qualifiers

  • Article
