Lucene4IR: Developing Information Retrieval Evaluation Resources using Lucene

Published: 14 February 2017

Abstract

The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September 2016 at the University of Strathclyde in Glasgow, UK, and was funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking sessions, in which a number of groups created modules and infrastructure for using Lucene to undertake TREC-based evaluations; and (iii) a number of breakout groups discussing the challenges, opportunities and problems in bridging the divide between academia and industry, and how Lucene can be used for teaching and learning Information Retrieval (IR). The event brought together a mix of academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, creating the basis of many useful tools but also raising numerous issues. It was clear that adopting and contributing to the most widely used and supported Open Source IR toolkit offers many benefits for academics, students, researchers, developers and practitioners: a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.

Published In

ACM SIGIR Forum, Volume 50, Issue 2
December 2016
99 pages
ISSN:0163-5840
DOI:10.1145/3053408
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Review-article
