Sigma: The Key for Vision-Language-Action Models toward Telepathic Alignment

Libo Wang

arXiv:2512.00783 · cs.LG · January 23, 2026


Open Access

TL;DR

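This paper introduces Sigma, a vision-language-action (VLA) model built on the open-source pi0.5_base backbone. It targets telepathic-style alignment between perception and action by combining semantic understanding with associative reasoning, using LoRA-based fine-tuning and inference-stage adapters rather than retraining the base model.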

Contribution

The work presents a novel VLA architecture and training methodology enabling semantic alignment and intention-driven control in vision-language-action models.

Findings

01 Sigma reduces control MSE across multiple scales

02 Maintains stability of telepathy norm and semantic-text alignment

03 Demonstrates reproducible semantic alignment without retraining base model

Abstract

To address a fundamental limitation in cognitive systems, namely the absence of a time-updatable mediating thought space between semantics and continuous control, this work constructs and trains a vision-language-action model termed Sigma, deployed on a single RTX 4090. The model is built upon the open-source pi0.5_base backbone, with the svla_so101_pickplace dataset preprocessed into a structured training corpus. An independently designed VLA architecture is introduced to integrate deep semantic understanding with associative reasoning, enabling telepathic-style alignment between perception and action. Training proceeds through iterative optimization of data preprocessing, LoRA-based fine-tuning, and inference-stage adapter design. Evaluation is conducted using offline closed-loop replay, comparing Sigma against the untuned pi0.5_base under identical data conditions. Experimental…
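The comparison described in the abstract, scoring a tuned and an untuned model on the same recorded data, can be illustrated with a minimal sketch. The abstract does not give the replay protocol or the exact metric definition, so everything here is an assumption for illustration: the `control_mse` helper, the toy episode, and the two stand-in policies are hypothetical and are not the paper's code or data.

```python
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical policy interface: maps an observation to a predicted action.
Policy = Callable[[Vector], Vector]


def control_mse(policy: Policy, episode: Sequence[Tuple[Vector, Vector]]) -> float:
    """Mean squared error between predicted and recorded actions over one episode."""
    total, count = 0.0, 0
    for observation, recorded_action in episode:
        predicted = policy(observation)
        for p, r in zip(predicted, recorded_action):
            total += (p - r) ** 2
            count += 1
    return total / count


# Toy episode (made up): each step pairs an observation with the recorded action.
episode = [([0.0, 1.0], [0.0, 2.0]), ([1.0, 1.0], [2.0, 2.0])]


def baseline(obs: Vector) -> Vector:
    # Stand-in for an untuned model: always predicts a zero action.
    return [0.0 for _ in obs]


def tuned(obs: Vector) -> Vector:
    # Stand-in for a fine-tuned model: happens to match the toy data exactly.
    return [2.0 * x for x in obs]


# Both policies are scored on the same recorded steps ("identical data conditions").
baseline_mse = control_mse(baseline, episode)
tuned_mse = control_mse(tuned, episode)
print(f"baseline MSE = {baseline_mse}, tuned MSE = {tuned_mse}")
```

Replaying the same recorded steps through both policies keeps the data fixed, so any difference in the score comes from the models alone.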


Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Ferroelectric and Negative Capacitance Devices