research-article

Open access

Legal Statute Identification: A Case Study using State-of-the-Art Datasets and Methods

Authors:

Saptarshi Ghosh

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2231 - 2240

https://doi.org/10.1145/3626772.3657879

Published: 11 July 2024

Abstract

Legal Statute Identification (LSI) involves identifying the relevant statutes (articles of law) given the facts (evidence) of a legal case. LSI poses several key challenges: (i) making use of label (statute) semantics, which can be complicated and confusing; (ii) the input text (i.e., the facts) is very long and noisy; and (iii) the label distribution usually follows a long tail, making predictions for rare labels difficult. Although multiple methods have been proposed to address these challenges, no comprehensive study has established how these factors affect different models and approaches. In this work, we reproduce several LSI models on two popular LSI datasets and study the effect of the above-mentioned challenges. We conduct thorough experiments with transformer-based encoders such as BERT and Longformer. We further combine these encoders with approaches devised specifically for LSI, which use different mechanisms to model the statute texts in order to enhance fact representations. Our experiments yield several insights into how different models address these challenges, how the encoding and statute-text handling choices interact, and how the nature of the LSI datasets affects model performance. Finally, we analyze the explainability of different approaches using human-annotated rationales.
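To make the long-tail challenge concrete, the following is a minimal sketch, not taken from the paper, that treats LSI as multi-label classification and compares micro- and macro-averaged F1 on toy data. The choice of metrics, the statute names `S1` to `S3`, and the helper `micro_macro_f1` are all illustrative assumptions; the abstract does not state which metrics the study reports.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, labels):
    """gold, pred: lists of sets of statute labels, one set per case.

    Hypothetical helper for illustration only.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for s in labels:
            if s in g and s in p:
                tp[s] += 1
            elif s in p:
                fp[s] += 1
            elif s in g:
                fn[s] += 1
    # Micro: pool counts over all labels, so frequent statutes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: average per-label F1, so every rare statute counts equally.
    macro = sum(f1(tp[s], fp[s], fn[s]) for s in labels) / len(labels)
    return micro, macro


# Toy data (assumed): S1 is frequent, S2 and S3 are rare "tail" statutes.
labels = ["S1", "S2", "S3"]
gold = [{"S1"}, {"S1"}, {"S1"}, {"S1", "S2"}, {"S3"}]
# A model that only ever predicts the frequent statute.
pred = [{"S1"}, {"S1"}, {"S1"}, {"S1"}, set()]

micro, macro = micro_macro_f1(gold, pred, labels)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy setting, a model that ignores the rare statutes still scores well under micro-averaging but poorly under macro-averaging, which is one way the effect of a long-tailed label distribution can be made visible.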

Index Terms

1. Applied computing
  1. Law, social and behavioral sciences
    1. Law
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Published In

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2024

3164 pages

ISBN: 9798400704314

DOI: 10.1145/3626772

General Chairs: Grace Hui Yang (Georgetown University, USA), Hongning Wang (Tsinghua University, China), Sam Han (The Washington Post, USA)

Program Chairs: Claudia Hauff (Spotify, Netherlands), Guido Zuccon (The University of Queensland, Australia), Yi Zhang (University of California Santa Cruz, USA)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR 2024

Sponsor:

SIGIR

SIGIR 2024: The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 14 - 18, 2024

Washington DC, USA

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
