DOI: 10.1145/1840784.1840820

Search log analysis of user stereotypes, information seeking behavior, and contextual evaluation

Published: 18 August 2010

Abstract

Evaluation is needed to benchmark and improve systems. In information retrieval (IR), evaluation is centered on the test collection, i.e., the set of documents that systems should retrieve given matching queries from users. Much of this evaluation is uniform: there is one test collection, and every query is processed in the same way by a system. But does one size fit all? Queries are created by different users in different contexts. This paper presents a method to contextualize IR evaluation using search logs. We study search log files in the archival domain, and in particular the retrieval of archival finding aids encoded in the popular Encoded Archival Description (EAD) standard. We examine various aspects of searching behavior in the logs and use them to define searcher stereotypes. Focusing on two user stereotypes, namely novice and expert users, we automatically derive queries and pseudo-relevance judgments from the interaction data in the log files. We investigate how this can be used for context-sensitive system evaluation tailored to these user stereotypes. Our findings are in line with, and complement, prior user studies of archival users. The results also show that satisfying the demands of expert users is harder than satisfying those of novices, as experts have more challenging information seeking needs; however, the choice of system does not change the relative IR performance between the different user groups.
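
The abstract sketches a three-step pipeline: mine a search log, assign each session to a searcher stereotype, and treat the documents users interacted with as pseudo-relevance judgments so that systems can be scored per stereotype. A minimal illustrative sketch of that idea in Python follows; the log format, the field names, and the novice/expert heuristic (query length and operator use) are all assumptions made for illustration, not the authors' actual log schema or classification criteria.

    from collections import defaultdict

    # Hypothetical log records: (session_id, query, clicked_doc_id or None).
    # The EAD search logs studied in the paper are richer; this format is assumed.
    LOG = [
        ("s1", "anne frank", "ead:doc-017"),
        ("s1", "anne frank diary", None),
        ("s2", 'inventaris AND "VOC" archief', "ead:doc-142"),
        ("s2", 'creator:"Staten-Generaal" NOT kopie', "ead:doc-090"),
    ]

    def stereotype(queries):
        """Crude novice/expert split: call a session 'expert' if it uses
        operators or fielded syntax, or issues long queries on average.
        This heuristic is an assumption, not the paper's definition."""
        ops = ("AND", "OR", "NOT", '"', ":")
        uses_ops = any(any(o in q for o in ops) for q in queries)
        avg_len = sum(len(q.split()) for q in queries) / len(queries)
        return "expert" if uses_ops or avg_len > 3 else "novice"

    def derive_pseudo_qrels(log):
        """Group log lines by session, assign each session a stereotype,
        and record clicked documents as pseudo-relevant for their query."""
        sessions = defaultdict(list)
        for sid, query, click in log:
            sessions[sid].append((query, click))
        # stereotype -> query -> set of pseudo-relevant documents
        qrels = defaultdict(lambda: defaultdict(set))
        for events in sessions.values():
            group = stereotype([q for q, _ in events])
            for query, click in events:
                if click is not None:
                    qrels[group][query].add(click)
        return qrels

    for group, topics in derive_pseudo_qrels(LOG).items():
        for query, docs in topics.items():
            print(group, "|", query, "->", sorted(docs))

With per-stereotype judgments of this kind, the same retrieval runs can be evaluated separately against the novice and expert query sets, which is the contextualized, log-based evaluation the abstract describes.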




Published In

IIiX '10: Proceedings of the third symposium on Information interaction in context
August 2010
408 pages
ISBN:9781450302470
DOI:10.1145/1840784
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. archives
  2. evaluation
  3. search log analysis
  4. user stereotypes

Qualifiers

  • Research-article

Conference

IIiX 2010: Information Interaction in Context Symposium
August 18-21, 2010
New Brunswick, New Jersey, USA

Acceptance Rates

Overall acceptance rate: 21 of 45 submissions (47%)

