DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation
Chao Li

TL;DR
DALM is a three-phase generation framework for language models that resolves domain, then relation, then concept uncertainty under explicit algebraic constraints, confining generation to a single domain fiber to prevent cross-domain contamination.
Contribution
The paper proposes a three-phase generation process over a domain lattice that replaces unconstrained token generation with structured denoising, so that knowledge stays localized to domain-specific fibers.
Findings
DALM confines generation within domain fibers, preventing cross-domain contamination.
The framework enables multi-perspective answers within a domain-indexed space.
DALM is instantiated with the CDC knowledge system for crystal library evaluation.
Abstract
Large language models compress heterogeneous knowledge into a single parameter space, allowing facts from different domains to interfere during generation. We propose DALM, a Domain-Algebraic Language Model that replaces unconstrained token generation with structured denoising over a domain lattice. DALM follows a three-phase generation path: it first resolves domain uncertainty, then relation uncertainty, and finally concept uncertainty, so each stage operates under explicit algebraic constraints. The framework requires only three ingredients: a lattice of domains with computable meet, join, and implication; a typing function over relations that controls inheritance across domains; and a fiber partition that localizes knowledge to domain-specific subsets. Given these ingredients, DALM yields a three-phase encoder-decoder architecture in which generation is confined to a domain fiber,…
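The three ingredients named in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes the domain lattice is a powerset (Boolean) lattice over three made-up atomic domains, it assumes a relation typed as inheritable makes facts from a smaller domain visible in any larger one, and it replaces learned denoising with plain lookup. All names (`ATOMS`, `FIBERS`, `INHERITABLE`, `visible_facts`, `generate`) and all facts are hypothetical.

```python
# Toy sketch only: a powerset lattice stands in for the domain lattice.
# Every name and fact below is a hypothetical illustration.

ATOMS = frozenset({"chemistry", "physics", "law"})  # assumed atomic domains

CHEM = frozenset({"chemistry"})
PHYS = frozenset({"physics"})
LAW = frozenset({"law"})


def meet(a, b):
    """Greatest lower bound: the domains shared by a and b."""
    return a & b


def join(a, b):
    """Least upper bound: the smallest domain containing a and b."""
    return a | b


def implies(a, b):
    """Boolean-lattice implication: the largest c with meet(a, c) <= b."""
    return (ATOMS - a) | b


# Fiber partition: each (subject, relation, object) fact lives in one fiber.
FIBERS = {
    CHEM: {("quartz", "has_formula", "SiO2")},
    PHYS: {("quartz", "exhibits", "piezoelectricity")},
    LAW: {("contract", "requires", "consideration")},
}

# Typing function over relations (assumed semantics): True means the
# relation is inherited by every domain strictly above the fiber's domain.
INHERITABLE = {"has_formula": True, "exhibits": False, "requires": False}


def visible_facts(domain):
    """Facts in the domain's own fiber plus inheritable facts from below."""
    facts = set(FIBERS.get(domain, set()))
    for other, fiber in FIBERS.items():
        if other < domain:  # strict subset: other lies below domain
            facts |= {f for f in fiber if INHERITABLE.get(f[1], False)}
    return facts


def generate(domain, subject):
    """Resolve domain, then relation, then concept, in that order."""
    facts = visible_facts(domain)                                   # phase 1
    relations = sorted({r for (s, r, _) in facts if s == subject})  # phase 2
    return [                                                        # phase 3
        (r, o)
        for r in relations
        for (s, r2, o) in sorted(facts)
        if s == subject and r2 == r
    ]


if __name__ == "__main__":
    print(generate(CHEM, "quartz"))
    print(generate(PHYS, "quartz"))
    print(generate(join(CHEM, PHYS), "quartz"))
    print(generate(LAW, "quartz"))
```

Because the domain is fixed before any relation or concept is chosen, a query about quartz in the law domain returns nothing instead of borrowing chemistry facts. Under the same assumptions, the joined chemistry-and-physics domain sees only the fact whose relation is typed as inheritable.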
