No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts
Israel Fama, Bárbara Bueno, Alexandre Alcoforado, Thomas Palmeira Ferraz, Arnold Moya, Anna Helena Reali Costa

TL;DR
This paper presents uBERT, a hybrid Transformer-RNN model designed to efficiently analyze arbitrarily long legal texts, addressing the slow processing issues in large-scale judiciary systems.
Contribution
The paper introduces uBERT, a novel hybrid model that combines Transformer and RNN architectures to handle long legal texts more efficiently than existing models.
Findings
uBERT outperforms BERT+LSTM when overlapping input is used.
uBERT is significantly faster than ULMFiT for long documents.
The approach maintains reasonable computational overhead.
Abstract
In a context where the Brazilian judiciary system, the largest in the world, faces a crisis due to the slow processing of millions of cases, it becomes imperative to develop efficient methods for analyzing legal texts. We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to effectively handle long legal texts. Our approach processes the full text regardless of its length while maintaining reasonable computational overhead. Our experiments demonstrate that uBERT achieves superior performance compared to BERT+LSTM when overlapping input is used and is significantly faster than ULMFiT for processing long legal documents.
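The idea of overlapping input named in the title can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `overlapping_chunks` and the default `chunk_size` and `overlap` values are assumptions chosen for illustration, since this summary does not state the settings used in the paper. The sketch splits a long token sequence into fixed-size windows that share tokens with their neighbors, so that no part of the text is dropped. In a hybrid model of the kind described, each window would be encoded separately by the Transformer and the resulting sequence of window representations passed to a recurrent network.

```python
from typing import List, Sequence


def overlapping_chunks(
    tokens: Sequence[int],
    chunk_size: int = 512,  # assumed window length, not taken from the paper
    overlap: int = 128,     # assumed number of tokens shared by neighboring windows
) -> List[List[int]]:
    """Split a token sequence into windows that overlap by `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if len(tokens) == 0:
        return []

    stride = chunk_size - overlap  # how far the window advances each step
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the current window reaches the end of the sequence.
        if start + chunk_size >= len(tokens):
            break
        start += stride
    return chunks


if __name__ == "__main__":
    # Toy example: 11 token ids, windows of 4 tokens overlapping by 2.
    for chunk in overlapping_chunks(list(range(11)), chunk_size=4, overlap=2):
        print(chunk)
```

Every token appears in at least one window, and the last window may be shorter than `chunk_size`. A larger overlap gives each window more shared context at the cost of more windows to encode.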
Taxonomy
Topics: Artificial Intelligence in Law · Comparative and International Law Studies · Legal Education and Practice Innovations
Methods: Attention Is All You Need · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Temporal Activation Regularization · Weight Tying · Slanted Triangular Learning Rates · Dense Connections · Label Smoothing · Byte Pair Encoding
