DOI: 10.1145/3459637.3481994
short-paper

CLC-RS: A Chinese Legal Case Retrieval System with Masked Language Ranking

Published: 30 October 2021

Abstract

With the ever-increasing number of legal cases in China, retrieving the cases relevant to a user query has attracted considerable attention. Conventional keyword-based retrieval systems look for cases that contain one or more words specified by the user. However, keyword search matches only the exact terms given in the query, so these systems miss many relevant documents. In addition, it is difficult for new users to identify appropriate keywords for accurate legal case retrieval. In this paper, we develop a novel Chinese legal case retrieval system (called CLC-RS), which improves the quality of semantic search with natural language queries in the legal domain. CLC-RS performs legal case retrieval in two stages. First, we employ a classic token-based ranking method to efficiently narrow the search space, returning a subset of candidate legal cases. Then, we deploy a novel masked language ranking model to re-rank the candidate legal cases. The experimental results show that the proposed system is both efficient and effective, providing a practical information retrieval (IR) system for retrieving Chinese legal cases. The CLC-RS system is available at https://www.delilegal.com/.
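
The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the CLC-RS implementation: the abstract does not name the token-based method or describe the masked language ranking model, so BM25 stands in for the first stage, character-level tokenization stands in for a real Chinese word segmenter, and the re-ranker is a pluggable callable filled here with a simple bigram-overlap placeholder. All names (`BM25`, `bigram_overlap`, `two_stage_search`) and the toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    """Character-level tokens (assumption: stand-in for a Chinese word segmenter)."""
    return [ch for ch in text if not ch.isspace()]


class BM25:
    """First stage (assumed): a classic token-based ranker over the whole collection."""

    def __init__(self, docs: Sequence[str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        tokens = [tokenize(d) for d in docs]
        self.doc_tf = [Counter(t) for t in tokens]
        self.doc_len = [len(t) for t in tokens]
        self.avg_len = (sum(self.doc_len) / len(docs)) if docs else 0.0
        if self.avg_len == 0:
            self.avg_len = 1.0  # avoid division by zero on empty input
        df: Counter = Counter()
        for tf in self.doc_tf:
            df.update(tf.keys())
        n = len(docs)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query: str, idx: int) -> float:
        tf = self.doc_tf[idx]
        norm = self.k1 * (1 - self.b + self.b * self.doc_len[idx] / self.avg_len)
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f:
                total += self.idf[term] * f * (self.k1 + 1) / (f + norm)
        return total

    def top_k(self, query: str, k: int) -> List[int]:
        """Indices of the k best-scoring documents, dropping zero-score ones."""
        scored = [(i, self.score(query, i)) for i in range(len(self.doc_tf))]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return [i for i, s in scored[:k] if s > 0]


def bigram_overlap(query: str, doc: str) -> float:
    """Placeholder re-ranker: Jaccard overlap of character bigrams.

    NOT the paper's masked language ranking model; swap in any
    (query, document) -> score function here.
    """
    def bigrams(s: str) -> set:
        toks = tokenize(s)
        return {a + b for a, b in zip(toks, toks[1:])}

    q, d = bigrams(query), bigrams(doc)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def two_stage_search(
    query: str,
    docs: Sequence[str],
    first_stage: BM25,
    reranker: Callable[[str, str], float],
    k: int = 100,
    n: int = 10,
) -> List[int]:
    """Stage 1 narrows the collection to k candidates; stage 2 re-orders them."""
    candidates = first_stage.top_k(query, k)
    reranked = sorted(candidates, key=lambda i: reranker(query, docs[i]), reverse=True)
    return reranked[:n]


# Toy, made-up case summaries for illustration only.
docs = ["合同纠纷 违约 赔偿", "交通事故 人身损害 赔偿", "劳动合同 解除 经济补偿"]
index = BM25(docs)
results = two_stage_search("合同违约赔偿", docs, index, bigram_overlap, k=2, n=2)
print(results)
```

The design point is that the expensive scorer only ever sees the `k` candidates returned by the cheap first stage, so its cost is bounded regardless of collection size. Replacing `bigram_overlap` with a neural scorer changes nothing else in the pipeline.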



      Published In

      CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
      October 2021
      4966 pages
      ISBN:9781450384469
      DOI:10.1145/3459637
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. legal case retrieval
      2. masked language model
      3. ranking model

      Qualifiers

      • Short-paper

      Conference

      CIKM '21
      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
