Cognitive Load-Aware Inference: A Neuro-Symbolic Framework for Optimizing the Token Economy of Large Language Models

Yilun Zhang

arXiv:2507.00653·cs.LG·July 2, 2025

Open Access

TL;DR

This paper introduces Cognitive Load-Aware Inference (CLAI), a neuro-symbolic framework that applies Cognitive Load Theory to large language model inference, reducing token usage by up to 45% while maintaining accuracy.

Contribution

It formalizes cognitive load metrics for LLMs and proposes two methods, CLAI-Prompt and CLAI-Tune, to improve inference efficiency based on cognitive principles.

Findings

01. Up to 45% reduction in token consumption

02. Maintains accuracy across complex reasoning tasks

03. Emergent problem decomposition ability in CLAI-Tune

Abstract

The escalating computational costs of Large Language Model (LLM) inference have become a critical barrier to their widespread and sustainable deployment. While existing optimization strategies are effective, they are predominantly based on statistical heuristics or architectural modifications, lacking a guiding cognitive theory to manage the inference process itself. This paper aims to bridge this gap by introducing a novel paradigm: the Cognitive Load-Aware Inference (CLAI) framework, which operationalizes principles from Cognitive Load Theory (CLT) and neuroscience for LLM inference. We formalize the concepts of Intrinsic Cognitive Load, Extraneous Cognitive Load, and Germane Cognitive Load into quantifiable LLM metrics ($ICL_{LLM}$, $ECL_{LLM}$, and $GCL_{LLM}$), thereby reframing the inference process as a cognitive economics optimization problem: based on the intrinsic complexity…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Multimodal Machine Learning Applications