Abstract
Understanding and quickly processing lengthy legal documents are computationally challenging problems. Efficient automatic summarization techniques could be key to dealing with them. Extractive summarization, which forms a summary by selecting summary-relevant sentences, is one of the most popular approaches for such lengthy documents. Applying it effectively requires appropriate sentence scoring, which helps identify the most informative and essential sentences in the document. In this work, a novel sentence scoring approach, DCESumm, is proposed, which combines supervised sentence-level summary relevance prediction with unsupervised clustering-based document-level score enhancement. Experimental results on two legal document summarization datasets, BillSum and Forum for Information Retrieval Evaluation (FIRE), show that the proposed approach achieves significant improvements over the current state-of-the-art approaches. More specifically, it improves ROUGE F1-scores by 1–6% on the BillSum test set and by 6–12% on the FIRE test set. These results suggest that the proposed approach is useful for capturing the gist of a lengthy legal document, thereby providing crucial assistance to legal practitioners.