Large language models as tax attorneys: a case study in legal capabilities emergence

John J. Nay; David Karamardian; Sarah B. Lawsky; Wenting Tao; Meghana Bhat; Raghav Jain; Aaron Travis Lee; Jonathan H. Choi; Jungo Kasai

PMC · DOI: 10.1098/rsta.2023.0159 · February 26, 2024

TL;DR

This paper studies how large language models can perform tax law analysis, showing improved performance with newer models and better accuracy when given legal context and examples.

Contribution

The study introduces a novel approach to evaluating legal reasoning in LLMs using tax law and automated validation pipelines.

Findings

01

LLM performance on tax law tasks improves with each subsequent OpenAI model release.

02

Few-shot prompting and legal context significantly enhance model accuracy.

03

LLMs can perform at high accuracy but still fall short of expert tax lawyer levels.

Abstract

Better understanding of Large Language Models' (LLMs) legal analysis abilities can contribute to improving the efficiency of legal services, governing artificial intelligence and leveraging LLMs to identify inconsistencies in law. This paper explores LLM capabilities in applying tax law. We choose this area of law because it has a structure that allows us to set up automated validation pipelines across thousands of examples, requires logical reasoning and maths skills, and enables us to test LLM capabilities in a manner relevant to real-world economic lives of citizens and companies. Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release. We experiment with retrieving and using the relevant legal authority to assess the impact of providing additional legal context to LLMs. Few-shot prompting, presenting…
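The abstract's point about automated validation can be illustrated with a minimal sketch. It is not the authors' pipeline: the flat-rate rule, the question template, and the names `make_question`, `accuracy`, and `toy_solver` are all hypothetical. The idea is only that when the correct answer is computed programmatically, any number of questions can be generated and graded without manual labelling.

```python
import random
import re


def make_question(rng):
    """Build one synthetic multiple-choice question with a known answer.

    The flat-rate rule is a made-up toy, not a real tax provision.
    """
    income = rng.randrange(20_000, 200_000, 1_000)
    rate_percent = rng.choice([10, 20, 30])
    correct = income * rate_percent // 100
    # Distractors are fixed offsets from the correct value, so options are distinct.
    options = [correct, correct + 500, correct + 1_000, correct + 1_500]
    rng.shuffle(options)
    prompt = (
        f"A taxpayer has taxable income of ${income:,} and owes a flat "
        f"{rate_percent}% tax. How much tax is owed?"
    )
    return {
        "prompt": prompt,
        "options": options,
        "answer_index": options.index(correct),
    }


def accuracy(questions, answer_fn):
    """Fraction of questions where answer_fn returns the correct option index."""
    hits = sum(
        answer_fn(q["prompt"], q["options"]) == q["answer_index"]
        for q in questions
    )
    return hits / len(questions)


def toy_solver(prompt, options):
    """Stand-in for a model call (assumption): parses the prompt and computes the tax."""
    match = re.search(r"\$([\d,]+) and owes a flat (\d+)%", prompt)
    income = int(match.group(1).replace(",", ""))
    rate_percent = int(match.group(2))
    return options.index(income * rate_percent // 100)


if __name__ == "__main__":
    rng = random.Random(0)
    questions = [make_question(rng) for _ in range(1000)]
    print("toy solver accuracy:", accuracy(questions, toy_solver))
```

In a real evaluation, `toy_solver` would be replaced by a function that sends the prompt and options to a model and maps its reply to an option index; the grading code would stay the same.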




Taxonomy

Topics: Artificial Intelligence in Law · Ethics and Social Impacts of AI · Law, AI, and Intellectual Property