DOI: 10.1145/1840784.1840820

Search log analysis of user stereotypes, information seeking behavior, and contextual evaluation

Published: 18 August 2010

Abstract

Evaluation is needed to benchmark and improve systems. In information retrieval (IR), evaluation is centered on the test collection, i.e., the set of documents that systems should retrieve given matching queries from users. Much of this evaluation is uniform: there is one test collection, and every query is processed in the same way by a system. But does one size fit all? Queries are created by different users in different contexts. This paper presents a method to contextualize IR evaluation using search logs. We study search log files in the archival domain, and in particular the retrieval of archival finding aids encoded in the popular Encoded Archival Description (EAD) standard. We examine various aspects of searching behavior in the logs and use them to define searcher stereotypes. Focusing on two user stereotypes, namely novice and expert users, we automatically derive queries and pseudo-relevance judgments from the interaction data in the log files. We investigate how this can be used for context-sensitive system evaluation tailored to these user stereotypes. Our findings are in line with, and complement, prior user studies of archival users. The results also show that satisfying the demands of expert users is harder than satisfying those of novices, as experts have more challenging information seeking needs; however, the choice of system does not change the relative IR performance between the different user groups.
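
The abstract sketches a three-step pipeline: mine a search log, assign each session to a searcher stereotype, and treat the documents users interacted with as pseudo-relevance judgments so that systems can be scored per stereotype. A minimal illustrative sketch of that idea in Python follows; the log format, the field names, and the novice/expert heuristic (query length and operator use) are all assumptions made for illustration, not the authors' actual log schema or classification criteria.

    from collections import defaultdict

    # Hypothetical log records: (session_id, query, clicked_doc_id or None).
    # The EAD search logs studied in the paper are richer; this format is assumed.
    LOG = [
        ("s1", "anne frank", "ead:doc-017"),
        ("s1", "anne frank diary", None),
        ("s2", 'inventaris AND "VOC" archief', "ead:doc-142"),
        ("s2", 'creator:"Staten-Generaal" NOT kopie', "ead:doc-090"),
    ]

    def stereotype(queries):
        """Crude novice/expert split: call a session 'expert' if it uses
        operators or fielded syntax, or issues long queries on average.
        This heuristic is an assumption, not the paper's definition."""
        ops = ("AND", "OR", "NOT", '"', ":")
        uses_ops = any(any(o in q for o in ops) for q in queries)
        avg_len = sum(len(q.split()) for q in queries) / len(queries)
        return "expert" if uses_ops or avg_len > 3 else "novice"

    def derive_pseudo_qrels(log):
        """Group log lines by session, assign each session a stereotype,
        and record clicked documents as pseudo-relevant for their query."""
        sessions = defaultdict(list)
        for sid, query, click in log:
            sessions[sid].append((query, click))
        # stereotype -> query -> set of pseudo-relevant documents
        qrels = defaultdict(lambda: defaultdict(set))
        for events in sessions.values():
            group = stereotype([q for q, _ in events])
            for query, click in events:
                if click is not None:
                    qrels[group][query].add(click)
        return qrels

    for group, topics in derive_pseudo_qrels(LOG).items():
        for query, docs in topics.items():
            print(group, "|", query, "->", sorted(docs))

With per-stereotype judgments of this kind, the same retrieval runs can be evaluated separately against the novice and expert query sets, which is the contextualized, log-based evaluation the abstract describes.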




Published In

IIiX '10: Proceedings of the third symposium on Information interaction in context
August 2010
408 pages
ISBN:9781450302470
DOI:10.1145/1840784
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. archives
  2. evaluation
  3. search log analysis
  4. user stereotypes

Qualifiers

  • Research-article

Conference

IIiX 2010: Information Interaction in Context Symposium
August 18-21, 2010
New Brunswick, New Jersey, USA

Acceptance Rates

Overall acceptance rate: 21 of 45 submissions (47%)

