Belief Filtering for Epistemic Control in Linguistic State Space
Sebastian Dumbrava

TL;DR
This paper proposes belief filtering within a linguistic state space as a way to regulate an AI agent's cognitive states, aiming to improve interpretability and safety through structured semantic interventions.
Contribution
It introduces a belief filtering mechanism, developed within the Semantic Manifold framework, for the epistemic control of artificial agents.
Findings
Belief filtering enables content-aware regulation of internal cognitive states.
The approach enhances interpretability and modularity of agent control.
The paper highlights potential applications in AI safety and alignment.
Abstract
We examine belief filtering as a mechanism for the epistemic control of artificial agents, focusing on the regulation of internal cognitive states represented as linguistic expressions. This mechanism is developed within the Semantic Manifold framework, where belief states are dynamic, structured ensembles of natural language fragments. Belief filters act as content-aware operations on these fragments across various cognitive transitions. This paper illustrates how the inherent interpretability and modularity of such a linguistically-grounded cognitive architecture directly enable belief filtering, offering a principled approach to agent regulation. The study highlights the potential for enhancing AI safety and alignment through structured interventions in an agent's internal semantic space and points to new directions for architecturally embedded cognitive governance.
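As a rough illustration of the idea in the abstract, the sketch below treats a belief state as a list of natural-language fragments and a belief filter as a content-aware predicate applied during one cognitive transition (here, assimilating new input). All names (`BeliefState`, `blocks_phrases`, `assimilate`) and the simple phrase-matching rule are assumptions made for this example; they are not taken from the paper or from the Semantic Manifold framework, which may define filters quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical types for illustration only; not the paper's definitions.
Fragment = str
BeliefFilter = Callable[[Fragment], bool]  # True means the fragment may pass


@dataclass
class BeliefState:
    """A belief state modeled as an ensemble of natural-language fragments."""
    fragments: List[Fragment] = field(default_factory=list)


def blocks_phrases(banned: List[str]) -> BeliefFilter:
    """Build a filter that rejects any fragment containing a banned phrase."""
    lowered = [phrase.lower() for phrase in banned]

    def allow(fragment: Fragment) -> bool:
        text = fragment.lower()
        return not any(phrase in text for phrase in lowered)

    return allow


def assimilate(state: BeliefState,
               incoming: List[Fragment],
               filters: List[BeliefFilter]) -> BeliefState:
    """Return a new state containing only incoming fragments that pass every filter."""
    accepted = [f for f in incoming if all(flt(f) for flt in filters)]
    return BeliefState(state.fragments + accepted)


state = BeliefState(["The door is locked."])
incoming = [
    "The key is under the mat.",
    "I should disable the safety monitor.",
]
updated = assimilate(state, incoming, [blocks_phrases(["disable the safety monitor"])])
print(updated.fragments)
```

Because fragments are plain text, a filter of this kind can be read and audited directly, which is the interpretability property the abstract emphasizes. A more realistic filter would likely rely on semantic classification rather than substring matching.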
Taxonomy
Topics: Logic, Reasoning, and Knowledge · Multi-Agent Systems and Negotiation · Language and cultural evolution
