Unsupervised approaches for measuring textual similarity between legal court case reports

Arpan Mandal; Kripabandhu Ghosh; Saptarshi Ghosh; Sekhar Mandal

Artificial Intelligence and Law 29 (3):417-451 (2021)

Abstract

In legal information retrieval, an important challenge is computing the similarity between two legal documents. Precedents play an important role in the common law system, where lawyers frequently need to refer to relevant prior cases. Measuring document similarity is one of the most crucial aspects of any document retrieval system, since it determines the speed, scalability and accuracy of the system. Text-based and network-based methods for computing similarity among case reports have been proposed in prior work, but not without pitfalls. Since legal citation networks are generally highly disconnected, network-based metrics are not well suited to them. To date, only a few text-based methods and the predominant embedding-based methods have been employed, for instance TF-IDF, Word2Vec and Doc2Vec based approaches. We investigate the performance of 56 different methodologies for computing textual similarity across court case statements when applied to a dataset of Indian Supreme Court cases. Of these 56 methods, thirty are adaptations of existing methods and twenty-six are our proposed methods. The methods studied include models such as BERT and Law2Vec. We observe that the more traditional methods that rely on a bag-of-words representation perform better than the more advanced context-aware methods for computing document-level similarity. Finally, we nominate, via empirical validation, five of our best-performing methods as appropriate for measuring similarity between case reports. Among these five, two are adaptations of existing methods and the other three are our proposed methods.
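
As a rough illustration of the bag-of-words family of methods the abstract refers to, the sketch below computes TF-IDF vectors and cosine similarity for a few toy sentences using only the Python standard library. It is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`), the whitespace tokenizer, the smoothed IDF formula and the example texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed; real pipelines do more)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant): terms in every document keep weight 1.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy placeholder texts, not taken from the paper's dataset.
docs = [
    "the appellant was convicted of murder under section 302",
    "the accused was convicted of murder and sentenced to life",
    "the tenant disputed the eviction notice issued by the landlord",
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # two criminal-law sentences
sim_02 = cosine(vecs[0], vecs[2])  # criminal-law vs. tenancy sentence
print(sim_01 > sim_02)
```

In this toy setup the two criminal-law sentences share more weighted terms than the unrelated pair, so their cosine score is higher. Whole case reports would need proper tokenization and preprocessing, which this sketch leaves out.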

DOI

10.1007/s10506-020-09280-2

Citations of this work

Using machine learning to create a repository of judgments concerning a new practice area: a case study in animal protection law. Joe Watson, Guy Aglionby & Samuel March - 2023 - Artificial Intelligence and Law 31 (2):293-324.

From PARIS to LE-PARIS: toward patent response automation with recommender systems and collaborative large language models. Jung-Mei Chu, Hao-Cheng Lo, Jieh Hsiang & Chun-Chieh Cho - forthcoming - Artificial Intelligence and Law:1-27.

A novel network-based paragraph filtering technique for legal document similarity analysis. Mayur Makawana & Rupa G. Mehta - forthcoming - Artificial Intelligence and Law:1-23.

References found in this work

Encoded summarization: summarizing documents into continuous vector space for legal case retrieval. Vu Tran, Minh Le Nguyen, Satoshi Tojo & Ken Satoh - 2020 - Artificial Intelligence and Law 28 (4):441-467.

Feature-rich part-of-speech tagging with a cyclic dependency network. Christopher Manning - manuscript
