AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns
Evgeny Markhasin

TL;DR
This paper develops and evaluates structured prompts for large language models to analyze scholarly texts, focusing on detecting unsubstantiated claims and ambiguous pronouns, revealing model-specific strengths and limitations.
Contribution
It introduces a novel structured prompting approach for complex semantic and linguistic analysis in academic texts and systematically evaluates model performance across different contexts.
Findings
Both models identified an unsubstantiated head of a noun phrase (95% success)
ChatGPT failed to detect unsubstantiated adjectival modifiers (0%)
Both models performed well in linguistic analysis with full context (80-90%)
Abstract
We present and evaluate a suite of proof-of-concept (PoC), structured workflow prompts designed to elicit human-like hierarchical reasoning while guiding Large Language Models (LLMs) in the high-level semantic and linguistic analysis of scholarly manuscripts. The prompts target two non-trivial analytical tasks within academic summaries (abstracts and conclusions): identifying unsubstantiated claims (informational integrity) and flagging semantically confusing ambiguous pronoun references (linguistic clarity). We conducted a systematic, multi-run evaluation on two frontier models (Gemini Pro 2.5 Pro and ChatGPT Plus o3) under varied context conditions. Our results for the informational integrity task reveal a significant divergence in model performance: while both models successfully identified an unsubstantiated head of a noun phrase (95% success), ChatGPT consistently failed (0%…
Taxonomy
Topics: Artificial Intelligence in Law
