Scaling Open-Weight Large Language Models for Hydropower Regulatory Information Extraction: A Systematic Analysis

Hong-Jun Yoon; Faisal Ashraf; Thomas A. Ruggles; and Debjani Singh

arXiv:2511.11821·cs.CL·November 18, 2025

TL;DR

This paper systematically evaluates open-weight large language models for hydropower regulatory information extraction, revealing key performance thresholds and resource trade-offs to guide practical deployment.

Contribution

It provides the first comprehensive analysis of open-weight model performance in regulatory information extraction, identifying a critical 14B-parameter threshold and systematic hallucination patterns.

Findings

01

Models above 14B parameters significantly improve validation success.

02

Consumer-deployable models reach 64% F1 with proper validation.

03

Large models approach 77% F1 but need enterprise infrastructure.

Abstract

Information extraction from regulatory documents using large language models presents critical trade-offs between performance and computational resources. We evaluated seven open-weight models (0.6B-70B parameters) on hydropower licensing documentation to provide empirical deployment guidance. Our analysis identified a pronounced 14B parameter threshold where validation methods transition from ineffective (F1 < 0.15) to viable (F1 = 0.64). Consumer-deployable models achieve 64% F1 through appropriate validation, while smaller models plateau at 51%. Large-scale models approach 77% F1 but require enterprise infrastructure. We identified systematic hallucination patterns where perfect recall indicates extraction failure rather than success in smaller models. Our findings establish the first comprehensive resource-performance mapping for open-weight information extraction in…
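
The abstract's remark that perfect recall can signal failure is easier to see with a small worked example. The sketch below is not the authors' evaluation code: it assumes extractions are scored by exact set match against a gold list, and the field names and the `precision_recall_f1` helper are hypothetical. It shows how a model that emits many spurious values can cover every gold item (recall of 1.0) while precision, and therefore F1, stays low.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 (an assumed scoring scheme)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold fields for one licensing document.
gold = {"capacity_mw", "river", "license_expiry"}

# Hypothetical over-generating model: every gold field plus 27 invented ones.
over_generated = gold | {f"spurious_{i}" for i in range(27)}

precision, recall, f1 = precision_recall_f1(over_generated, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here 3 of the 30 predictions are correct, so recall is perfect while precision is 0.10. This is why recall alone is a poor success signal and why F1 is the headline metric in the findings above.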

Taxonomy

Topics: Artificial Intelligence in Law · Topic Modeling · Computational and Text Analysis Methods