Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs

Satya Sri Rajiteswari Nimmagadda; Ethan Young; Niladri Sengupta; Ananya Jana; Aniruddha Maiti

arXiv:2603.23532·cs.CL·March 26, 2026

Open Access

TL;DR

This paper explores the use of fine-tuned lightweight LLMs with a novel structural loss to generate hierarchical JSON representations of scientific sentences, effectively preserving their meaning for reconstruction.

Contribution

It introduces a new structural loss function for fine-tuning LLMs to produce hierarchical JSON structures from scientific sentences, enhancing information retention.

Findings

01. Hierarchical JSON formats effectively retain scientific sentence information.

02. Reconstructed sentences show high semantic and lexical similarity to originals.

03. The approach demonstrates potential for structured scientific text representation.

Abstract

This paper investigates whether structured representations can preserve the meaning of scientific sentences. To test this, a lightweight LLM is fine-tuned using a novel structural loss function to generate hierarchical JSON structures from sentences collected from scientific articles. These JSON structures are then passed to a generative model, which reconstructs the original text. Comparing the original and reconstructed sentences using semantic and lexical similarity, we show that hierarchical formats can effectively retain the information in scientific texts.
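The round trip described in the abstract (sentence, hierarchical JSON, reconstruction, similarity check) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the JSON schema, the example sentence, and the helper names `leaf_strings`, `naive_reconstruct`, and `token_f1` are hypothetical and do not come from the paper. The paper reconstructs text with a generative model; the sketch substitutes a naive concatenation of JSON leaves, and it uses token-overlap F1 as one possible lexical similarity measure. Semantic similarity would need an embedding model and is left out.

```python
import re
from collections import Counter

# Hypothetical example sentence and hierarchical JSON structure.
# The schema (subject / predicate / object / condition) is an assumption,
# not the format used in the paper.
sentence = "The catalyst increases the reaction rate at low temperatures."
structure = {
    "subject": {"text": "The catalyst"},
    "predicate": {
        "verb": "increases",
        "object": {"text": "the reaction rate"},
        "condition": {"text": "at low temperatures"},
    },
}


def leaf_strings(node):
    """Yield every string leaf of a nested dict/list in depth-first order."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from leaf_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_strings(item)


def naive_reconstruct(node):
    """Stand-in for the generative model: join the leaves into one sentence."""
    return " ".join(leaf_strings(node)) + "."


def tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def token_f1(reference, candidate):
    """Token-overlap F1 between two strings (a simple lexical similarity)."""
    ref, cand = Counter(tokens(reference)), Counter(tokens(candidate))
    overlap = sum((ref & cand).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reconstructed = naive_reconstruct(structure)
score = token_f1(sentence, reconstructed)
print(reconstructed)
print(f"lexical token F1: {score:.2f}")
```

In this toy case the leaves happen to appear in sentence order, so the naive reconstruction matches the original exactly. A real evaluation would score the generative model's output, where word order and wording can differ, which is why the paper pairs a lexical measure with a semantic one.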

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Natural Language Processing Techniques