Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs

Shreeya Verma Kathuria; Nitin Mayande; Sharookh Daruwalla; Nitin Joglekar; Charles Weber

arXiv:2604.12049 · cs.CL · April 15, 2026

TL;DR

This paper introduces wSSAS, a deterministic framework that enhances LLM-based text categorization by organizing data hierarchically and prioritizing semantic features, leading to improved accuracy and reproducibility.

Contribution

The paper presents a novel deterministic method, wSSAS, that combines hierarchical organization with feature prioritization based on a Signal-to-Noise Ratio (SNR) to improve large-scale text categorization with LLMs.

Findings

01 wSSAS improves clustering integrity across datasets

02 It significantly reduces categorization entropy

03 The framework enhances reproducibility of LLM summaries
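
As an illustration of what "categorization entropy" could measure, here is a minimal Python sketch. It assumes the term refers to the Shannon entropy of the category labels an LLM assigns to the same text across repeated runs; this listing does not give the paper's definition, so the function `label_entropy` and the sample labels are hypothetical.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy, in bits, of a list of category labels.

    Assumption: lower entropy means more consistent categorization.
    The paper's own metric may be defined differently.
    """
    counts = Counter(labels)
    total = len(labels)
    # Sum p * log2(1/p) over the observed label frequencies.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical labels from five repeated runs on the same document.
unstable_runs = ["billing", "support", "billing", "shipping", "support"]
stable_runs = ["billing"] * 5

print(round(label_entropy(unstable_runs), 3))
print(label_entropy(stable_runs))
```

Under this reading, a fully reproducible categorizer assigns one label every time and scores zero, while labels that vary between runs push the value up.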

Abstract

The use of Large Language Models (LLMs) for reliable, enterprise-grade analytics such as text categorization is often hindered by the stochastic nature of attention mechanisms and sensitivity to noise that compromise their analytical precision and reproducibility. To address these technical frictions, this paper introduces the Weighted Syntactic and Semantic Context Assessment Summary (wSSAS), a deterministic framework designed to enforce data integrity on large-scale, chaotic datasets. We propose a two-phased validation framework that first organizes raw text into a hierarchical classification structure containing Themes, Stories, and Clusters. It then leverages a Signal-to-Noise Ratio (SNR) to prioritize high-value semantic features, ensuring the model's attention remains focused on the most representative data points. By incorporating this scoring mechanism into a…
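
The two phases described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the nesting order (Theme, then Story, then Cluster), the per-text `scores` field, and the SNR formula (mean divided by population standard deviation) are all assumptions made for illustration, since the excerpt does not specify them.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class Cluster:
    name: str
    scores: list[float]  # hypothetical per-text relevance scores

    def snr(self) -> float:
        # Assumed definition: mean / population std dev.
        # The paper's actual SNR formula may differ.
        sigma = pstdev(self.scores)
        return float("inf") if sigma == 0 else mean(self.scores) / sigma


@dataclass
class Story:
    name: str
    clusters: list[Cluster] = field(default_factory=list)


@dataclass
class Theme:
    name: str
    stories: list[Story] = field(default_factory=list)


def rank_clusters(theme: Theme) -> list[Cluster]:
    """Flatten a theme and order its clusters by descending SNR.

    Ties are broken by cluster name so the ordering is deterministic.
    """
    clusters = [c for s in theme.stories for c in s.clusters]
    return sorted(clusters, key=lambda c: (-c.snr(), c.name))


# Hypothetical example data.
theme = Theme(
    "Customer feedback",
    stories=[
        Story("Payments", [Cluster("refund delays", [0.9, 0.8, 0.85])]),
        Story("Delivery", [Cluster("mixed complaints", [0.9, 0.1, 0.5])]),
    ],
)

for cluster in rank_clusters(theme):
    print(cluster.name, round(cluster.snr(), 2))
```

In this sketch, a cluster whose texts score consistently high ranks above one with scattered scores, which is one plausible way to keep a model's input focused on the most representative data points.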
