# ZASCA-sum: A dataset of the South Africa supreme courts of appeal judgments and media summaries for legal documents summarization research

**Authors:** Idris Abdulmumin, Vukosi Marivate

PMC · DOI: 10.1016/j.dib.2025.111567 · Data in Brief · 2025-04-19

## TL;DR

ZASCA-Sum is a new dataset of South African court judgments and media summaries to help improve legal document summarization.

## Contribution

ZASCA-Sum pairs judgments from the South Africa Supreme Court of Appeal with manually curated media summaries, providing a resource for legal document summarization research.

## Key findings

- The dataset includes 4171 judgments collected from the official website of the South Africa Supreme Court of Appeal, of which 2118 are paired with a media summary.
- The dataset is split into training, validation, and test sets for use in supervised, semi-supervised, and unsupervised summarization tasks.
- Beyond summarization, the data can be used to localize English-centric models to the South African dialect.
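
Since 2118 of the 4171 judgments have a media summary, the remaining 2053 do not. The sketch below shows one way such a collection could be partitioned into a paired subset (usable for supervised training) and an unpaired subset (usable for unsupervised or semi-supervised work). The record layout and the field names `judgment` and `summary` are hypothetical; the summary above does not describe the released file format.

```python
from typing import List, Optional, Tuple, TypedDict


class Record(TypedDict):
    # Hypothetical field names, chosen for illustration only.
    judgment: str
    summary: Optional[str]  # None when no media summary exists


def split_by_pairing(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate records with a summary from records without one."""
    paired = [r for r in records if r["summary"]]
    unpaired = [r for r in records if not r["summary"]]
    return paired, unpaired


# Toy records standing in for real judgments (not actual dataset content).
toy_records: List[Record] = [
    {"judgment": "Text of judgment A.", "summary": "Summary of A."},
    {"judgment": "Text of judgment B.", "summary": None},
    {"judgment": "Text of judgment C.", "summary": "Summary of C."},
]

paired, unpaired = split_by_pairing(toy_records)
print(len(paired), len(unpaired))
```

The paired list would feed a supervised summarizer, while the unpaired judgments remain available as unlabeled in-domain text.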

## Abstract

This paper presents ZASCA-Sum, a novel dataset comprising judgments from the South Africa Supreme Court of Appeal and their manually curated media summaries. The dataset, collected from the court's official website, includes 4171 judgments, of which 2118 have summary pairs. The judgments and summaries have been extracted and prepared to support legal document summarization tasks across supervised, semi-supervised, and unsupervised settings. This paper provides a detailed description of the dataset, covering the data collection process, timeline, processing, and potential applications in the field. We provide the token-count distribution and analysis of the judgments and summaries that can be accommodated off-the-shelf by current summarization models with the largest input token size. The dataset, split into training, validation, and test sets, is made publicly available to encourage research in legal summarization. In addition to document summarization, researchers can use this data to localize English-centric models to support the South African dialect.
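
The token-count analysis mentioned in the abstract can be illustrated with a minimal sketch that measures what share of documents fits within a model's input limit. It assumes whitespace tokenization as a stand-in for a model's real subword tokenizer, and the limit value is a placeholder; neither is taken from the paper.

```python
from typing import Iterable


def whitespace_token_count(text: str) -> int:
    """Approximate token count by splitting on whitespace (an assumption)."""
    return len(text.split())


def fraction_within_limit(texts: Iterable[str], max_tokens: int) -> float:
    """Return the share of texts whose token count is at most max_tokens."""
    texts = list(texts)
    if not texts:
        return 0.0
    fits = sum(1 for t in texts if whitespace_token_count(t) <= max_tokens)
    return fits / len(texts)


# Toy documents, not real judgments; max_tokens=4 is a placeholder limit.
toy_judgments = ["a b c", "a b c d e f", "a"]
share = fraction_within_limit(toy_judgments, max_tokens=4)
print(f"{share:.2f} of the toy documents fit within the limit")
```

Subword tokenizers typically produce more tokens than whitespace splitting, so a check like this would overestimate how many documents fit unless the model's own tokenizer is substituted.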

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12098150/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12098150/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12098150/full.md

---
Source: https://tomesphere.com/paper/PMC12098150