DOI: 10.1145/2501511.2501519
research-article

A process-centric data mining and visual analytic tool for exploring complex social networks

Published: 11 August 2013

Abstract

Social scientists and observational scientists need to analyze complex network data sets. Typical exploratory tasks include finding communities in the data, comparing results from different graph mining algorithms, identifying regions of similarity or dissimilarity in the data sets, and highlighting nodes with important centrality properties. While many methods, algorithms, and visualizations exist, applying and combining them for ad hoc visual exploration or as part of an analytic workflow remains an open problem, particularly for scientists without extensive programming knowledge. In this paper, we present Invenio-Workflow, a tool that supports exploratory analysis of network data by integrating workflow, querying, data mining, statistics, and visualization to enable scientific inquiry. Invenio-Workflow can be used to create custom exploration tasks in addition to the standard task templates. After describing the features of the system, we illustrate its utility through several use cases based on networks from different domains.


    Published In

    IDEA '13: Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics
    August 2013
    104 pages
    ISBN:9781450323291
    DOI:10.1145/2501511
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Conference

    KDD '13

    Acceptance Rates

    IDEA '13 Paper Acceptance Rate 11 of 25 submissions, 44%;
    Overall Acceptance Rate 11 of 25 submissions, 44%
