DOI: 10.1145/2856767.2856777
research-article

What Belongs Together Comes Together: Activity-centric Document Clustering for Information Work

Published: 07 March 2016

Abstract

Multitasking and interruptions make frequent activity switches necessary in information work. Individuals need to recall and restore earlier states of work, which generally involves retrieving information objects. To reduce the resulting tooling time, an activity-centric organization of information objects has been proposed: for each activity, a collection of related information objects (such as documents and websites) is created to improve information access and serve as a memory aid. Maintaining such collections manually is tedious and becomes an interruption of its own, so automatic maintenance using activity mining is promising. Activity mining uses interaction histories to extract distinct activities from the stream of interactions with information objects. Existing work on activity mining shows varying success in limited study setups. In this paper, we present an activity mining method that generates activity-centric information object collections automatically from interaction histories. The technique is a hybrid approach that considers all information types used in previous work: activity stream information and information related to the accessed content. We evaluate the method on interaction histories collected over several weeks during the real work of eight information workers. On this dataset, our hybrid approach achieves an average adjusted Rand index (ARI) of 0.53, with values up to 0.77, outperforming single-metric approaches.
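
The abstract reports clustering quality as ARI, the adjusted Rand index, which compares a predicted grouping of items against a reference grouping and corrects for chance agreement. The following is a minimal sketch of the standard ARI formula, not the authors' evaluation code. The function name `adjusted_rand_index` and the example labels (documents assigned to activities) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two items: the clusterings agree trivially.
        return 1.0

    # Contingency counts: how many items share (true cluster, predicted cluster).
    joint = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of item pairs placed together in both, in true, and in predicted.
    index = sum(comb(c, 2) for c in joint.values())
    sum_true = sum(comb(c, 2) for c in true_sizes.values())
    sum_pred = sum(comb(c, 2) for c in pred_sizes.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    maximum = (sum_true + sum_pred) / 2           # upper bound on the index

    if maximum == expected:
        # Only happens when both clusterings are the same trivial partition.
        return 1.0
    return (index - expected) / (maximum - expected)


# Hypothetical example: four documents, two reference activities.
reference = ["report", "report", "travel", "travel"]
predicted = ["a", "a", "b", "c"]  # the second activity was split in two
score = adjusted_rand_index(reference, predicted)
print(round(score, 2))
```

An ARI of 1 means the predicted collections match the reference exactly (cluster names do not matter), values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.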


    Published In

    IUI '16: Proceedings of the 21st International Conference on Intelligent User Interfaces
    March 2016
    446 pages
    ISBN:9781450341370
    DOI:10.1145/2856767
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. activity mining
    2. document organization
    3. information work

    Qualifiers

    • Research-article

    Funding Sources

    • HA Hessen Agentur GmbH Land Hessen Deutschland

    Conference

    IUI'16
    Acceptance Rates

    IUI '16 Paper Acceptance Rate 49 of 194 submissions, 25%;
    Overall Acceptance Rate 746 of 2,811 submissions, 27%
