AI-Assisted Authoring for Transparent, Data-Driven Documents
Alfonso Piscitelli, Cristina David, Mattia De Rosa, Ali Mohammed, Federico Nanni, Jacob Pake, Roly Perera, Jessy Sodimu, Chenyiqiu Zheng

TL;DR
This paper presents a novel system for creating transparent, data-driven scholarly articles that allow interactive exploration of underlying data, using an LLM-based tool to generate data queries embedded in the text.
Contribution
It introduces a new authoring tool that leverages language models and data provenance to automatically generate data queries for transparent scientific documents.
Findings
GPT-4 can synthesize compatible data queries for text fragments.
The system effectively identifies data-driven text segments.
Evaluation on SciGen shows promising results.
Abstract
We introduce _transparent documents_, interactive web-based scholarly articles that allow readers to explore the relationship between the text and the underlying data by hovering over fragments of text. We also present an LLM-based tool for authoring transparent documents, building on recent developments in data provenance for general-purpose programming languages. As a target platform, our implementation uses Fluid, an open source programming language with a provenance-tracking runtime. Our agent-based tool supports a human author during the creation of transparent documents. It identifies fragments of text which can be computed from data, such as numerical values selected from records or computed by aggregations like sum and mean, comparatives and superlatives like _better than_ and _largest_, trend-adjectives like _growing_, and similar quantitative or semi-quantitative phrases, and then attempts to…
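As a rough illustration of the idea in the abstract, the sketch below pairs a text fragment with a query that recomputes it from data and reports which records the query read. It is written in Python rather than Fluid, and every name in it (`RECORDS`, `Fragment`, `mean_score`, `best_model`, `check`) as well as the data itself is hypothetical; it is not the authors' implementation, and the "sources" list is only a crude stand-in for real provenance tracking.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical records, invented for this sketch (not taken from the paper or SciGen).
RECORDS: List[Dict] = [
    {"model": "A", "score": 71.0},
    {"model": "B", "score": 78.5},
    {"model": "C", "score": 74.5},
]

# A query returns the computed value as text plus the record keys it consulted.
Query = Callable[[List[Dict]], Tuple[str, List[str]]]


@dataclass
class Fragment:
    text: str     # the phrase as it appears in the article
    query: Query  # how that phrase can be recomputed from the data


def mean_score(records: List[Dict]) -> Tuple[str, List[str]]:
    """Aggregation: the mean score, formatted to one decimal place."""
    value = mean(r["score"] for r in records)
    return f"{value:.1f}", [r["model"] for r in records]


def best_model(records: List[Dict]) -> Tuple[str, List[str]]:
    """Superlative: the model with the largest score (reads every record)."""
    top = max(records, key=lambda r: r["score"])
    return top["model"], [r["model"] for r in records]


def check(fragment: Fragment, records: List[Dict]) -> Dict:
    """Recompute a fragment and compare it with the text the author wrote."""
    computed, sources = fragment.query(records)
    return {
        "text": fragment.text,
        "computed": computed,
        "consistent": fragment.text == computed,
        "sources": sources,
    }


if __name__ == "__main__":
    for frag in (Fragment("74.7", mean_score), Fragment("B", best_model)):
        print(check(frag, RECORDS))
```

In this toy version a reader-facing tool could show `sources` when the reader hovers over the fragment, and flag any fragment whose `consistent` field is false.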
Peer Reviews
Decision: Submitted to ICLR 2026
The proposed system addresses an important problem in science, in particular with increasing publication numbers in many fields. The proof of concept shows that this could potentially help readers understand scientific papers better and check claims more easily.
The proposed system still seems to be in its early stages with respect to what can be verified, which limits its usefulness in practice. First, as far as I understand, it is limited to single papers and does not allow checking claims that a paper makes about results presented in another paper. This is where a system like the one proposed would be most useful -- while directly linking to evidence in the same paper is useful, manually checking this information is not nearly as laborious as checking something i
- This framing connects two currently active research directions: LLM-based code synthesis and data provenance systems. By automatically translating natural language descriptions into executable queries, the authors aim to make scientific communication more transparent and auditable. The idea is conceptually novel and resonates with the broader goal of improving reproducibility in AI-assisted research writing.
- In terms of clarity and presentation, the paper is exceptionally well structured an
- The main limitation of this work lies in its restricted and preliminary evaluation. The experiments are conducted on a small subset of the SciGen dataset, without a clear description of selection criteria. Although the paper provides useful category-level statistics, it does not analyze statistical variance beyond reporting standard deviation or assess generalization to unseen writing styles or datasets.
- This paper describes a “manual validation step” but lacks a true user-centered evalua
+ This paper primarily visualizes the data credibility of a research article through the collaborative interaction between two agents, forming a complete interactive system for "data credibility" that transparently reveals the authenticity of the paper's conclusions.
+ The issues addressed by the paper include multiple categories, such as the reliability of data related to proportions, maximum and minimum values, and it even enables transparency analysis for ranking comparisons and generalized
+ The system presented in this paper is more engineering-oriented, with relatively limited academic research value.
+ The experimental design lacks a comprehensive justification. Without comparisons to other baseline methods, it is difficult to determine whether this system represents the optimal solution (though, as a pioneering work, more in-depth ablation studies could be considered). Overall, it can be regarded as a well-executed engineering paper.
Taxonomy
Topics: Scientific Computing and Data Management · Data Visualization and Analytics · Mathematics, Computing, and Information Processing
