Human-in-the-Loop Data Analysis: A Personal Perspective
In the past few years, human-in-the-loop data analysis (HILDA) has received significant and growing attention. Most HILDA works have focused on concrete problems. In this paper I take a step back and discuss several "big picture" questions regarding HILDA. ...
ViDeTTe Interactive Notebooks
Interactive notebooks allow the use of popular languages, such as Python, for composing data analytics projects. The interface they provide enables data scientists to import data, analyze them and compose the results into easily readable report-like ...
Towards a Unified Representation of Insight in Human-in-the-Loop Analytics: A User Study
Understanding what insights people draw from data visualizations is critical for human-in-the-loop analytics systems to facilitate mixed-initiative analysis. In this paper we present results from a large user study on insights extracted from commonly ...
Evaluating Visual Data Analysis Systems: A Discussion Report
- Leilani Battle,
- Marco Angelini,
- Carsten Binnig,
- Tiziana Catarci,
- Philipp Eichmann,
- Jean-Daniel Fekete,
- Giuseppe Santucci,
- Michael Sedlmair,
- Wesley Willett
Visual data analysis is a key tool for helping people to make sense of and interact with massive data sets. However, existing evaluation methods (e.g., database benchmarks, individual user studies) fail to capture the key points that make systems for ...
DIVE: A Mixed-Initiative System Supporting Integrated Data Exploration Workflows
Generating knowledge from data is an increasingly important activity. This process of data exploration consists of multiple tasks: data ingestion, visualization, statistical analysis, and storytelling. Though these tasks are complementary, analysts ...
Querying Videos Using DNN Generated Labels
Massive amounts of videos are generated for entertainment, security, and science, powered by a growing supply of user-produced video hosting services. Unfortunately, searching for videos is difficult due to the lack of content annotations. Recent ...
Optimally Leveraging Density and Locality for Exploratory Browsing and Sampling
Exploratory data analysis often involves repeatedly browsing a small sample of records that satisfy certain predicates. We propose a fast query evaluation engine, called NeedleTail, aimed at letting analysts browse a subset of the query result on large ...
Source Selection Languages: A Usability Evaluation
When looking to obtain insights from data, and given numerous possible data sources, there are certain quality criteria that retrieved data from selected sources should exhibit so as to be most fit-for-purpose. An effective source selection algorithm ...
Provenance for Interactive Visualizations
We highlight the connections between data provenance and interactive visualizations. To do so, we first incrementally add interactions to a visualization and show how these interactions are readily expressible in terms of provenance. We then describe ...
Beaver: Towards a Declarative Schema Mapping
Schema mapping is used to transform data to a desired schema from data sources with different schemas. Manually writing complete schema mapping specifications requires a deep understanding of the source and target schemas, which can be burdensome for ...
SchemaDrill: Interactive Semi-Structured Schema Design
Ad-hoc data models like JSON make it easy to evolve schemas and to multiplex different data-types into a single stream. This flexibility makes JSON great for generating data, but also makes it much harder to query, ingest into a database, and index. In ...
What Type of a Matcher Are You?: Coordination of Human and Algorithmic Matchers
In this work we explore relationships between human and algorithmic schema matchers. We provide a novel approach to similar schema matchers termed coordinated matchers and use it to predict future human matching choices. We show throughout a ...
Draining the Data Swamp: A Similarity-based Approach
While hierarchical namespaces such as filesystems and repositories have long been used to organize data, the rapid increase in data production places increasing strain on users who wish to make use of the data. So-called "data lakes" embrace the storage ...
Index Terms
- Proceedings of the Workshop on Human-In-the-Loop Data Analytics