Towards Meaningful Maps of Polish Case Law
Michal Jungiewicz, Michał Łopuszyński

TL;DR
This paper evaluates the effectiveness of PCA and t-SNE for visualizing Polish case law documents, finding t-SNE offers more interpretable maps that reveal hidden topical structures, aiding legal text analysis.
Contribution
It compares PCA and t-SNE for legal document visualization and demonstrates t-SNE's superior interpretability and utility in uncovering topical structures in Polish case law.
Findings
t-SNE produces more interpretable visualizations than PCA.
t-SNE reveals hidden topical structures related to keywords.
t-SNE could enhance legal database search and browsing.
Abstract
In this work, we analyze the utility of two-dimensional document maps for exploratory analysis of Polish case law. We start by comparing two methods of generating such visualizations. The first is based on linear principal component analysis (PCA). The second makes use of the modern nonlinear t-Distributed Stochastic Neighbor Embedding method (t-SNE). We apply both PCA and t-SNE to a corpus of judgments from different courts in Poland. It emerges that t-SNE provides better, more interpretable results than PCA. As a next test, we apply t-SNE to a randomly selected sample of common court judgments corresponding to different keywords. We show that t-SNE, in this case, reveals the hidden topical structure of the documents related to the keyword "pension". In conclusion, we find that the t-SNE method could be a promising tool to facilitate the exploratory analysis of legal texts, e.g., by complementing…
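To make the PCA side of this comparison concrete, the following is a minimal, dependency-free sketch of how a two-dimensional document map can be built. It is not the authors' pipeline: the TF-IDF weighting, the function names `tfidf_matrix` and `pca_2d`, and the four toy English sentences are all assumptions made for illustration. t-SNE is not implemented here.

```python
import math
import re
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [re.findall(r"\w+", doc.lower()) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        # Raw term count times a smoothed inverse document frequency.
        rows.append([
            counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t in vocab
        ])
    return vocab, rows


def _matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def pca_2d(rows, n_iter=200):
    """Project rows onto their first two principal components.

    Works on the n_docs x n_docs Gram matrix of the centered data and finds
    its two leading eigenvectors by power iteration with deflation.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    coords = [[0.0, 0.0] for _ in range(n)]
    for k in range(2):
        # Deterministic, generic starting vector.
        vec = [math.sin(i + 1 + k) for i in range(n)]
        for _ in range(n_iter):
            vec = _matvec(gram, vec)
            norm = math.sqrt(sum(v * v for v in vec))
            if norm == 0.0:
                break  # no variance left along this component
            vec = [v / norm for v in vec]
        # Rayleigh quotient estimates the eigenvalue for this component.
        eigval = sum(v * gv for v, gv in zip(vec, _matvec(gram, vec)))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate: remove the component just found before the next pass.
        gram = [[gram[i][j] - eigval * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return coords


# Hypothetical toy documents, not taken from the judgment corpus.
docs = [
    "pension benefit granted to the retired claimant",
    "claimant appeals the refusal of a disability pension",
    "court orders payment of the contract price with interest",
    "defendant breached the sales contract and owes damages",
]
vocab, rows = tfidf_matrix(docs)
coords = pca_2d(rows)
for doc, (x, y) in zip(docs, coords):
    print(f"{x:8.3f} {y:8.3f}  {doc}")
```

The sketch works on the document-by-document Gram matrix because a corpus map typically has far more terms than documents in a small sample. For realistic corpus sizes, and for the nonlinear t-SNE map, one would normally rely on a library implementation such as scikit-learn's `PCA` and `TSNE` rather than hand-written code.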
Taxonomy
Topics: Artificial Intelligence in Law · Comparative and International Law Studies · Legal Education and Practice Innovations
Methods: Principal Components Analysis
