Neural-Driven Image Editing

Pengfei Zhou; Jie Xia; Xiaopeng Peng; Wangbo Zhao; Zilong Ye; Zekai Li; Suorong Yang; Jiadong Pan; Yuanxiang Chen; Ziqiao Wang; Kai Wang; Qian Zheng; Hao Jin; Xiaojun Chang; Gang Pan; Shurong Dong; Kaipeng Zhang; Yang You

arXiv:2507.05397 · cs.CV · January 12, 2026

TL;DR

LoongX is a hands-free image editing system driven by multimodal neurophysiological signals. It pairs diffusion models with neural signals to make creative control accessible and intuitive.

Contribution

The paper presents what it describes as the first comprehensive neural-driven image editing framework, combining EEG, fNIRS, PPG, and head motion signals with diffusion models and contrastive learning.

Findings

01 LoongX achieves performance comparable to text-driven methods.

02 It outperforms text-driven methods when neural signals are combined with speech.

03 The results demonstrate the potential of accessible, neural-driven image editing.

Abstract

Traditional image editing typically relies on manual prompting, making it labor-intensive and inaccessible to individuals with limited motor control or language abilities. Leveraging recent advances in brain-computer interfaces (BCIs) and generative models, we propose LoongX, a hands-free image editing approach driven by multimodal neurophysiological signals. LoongX utilizes state-of-the-art diffusion models trained on a comprehensive dataset of 23,928 image editing pairs, each paired with synchronized electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), photoplethysmography (PPG), and head motion signals that capture user intent. To effectively address the heterogeneity of these signals, LoongX integrates two key modules. The cross-scale state space (CS3) module encodes informative modality-specific features. The dynamic gated fusion (DGF) module further…
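
As a rough illustration of what gated fusion of heterogeneous signals can look like in general, the sketch below weights one feature vector per modality with a softmax gate and sums them. It is not the LoongX DGF module, whose details are not given in the text above. The function names, the per-modality gate vectors, the feature dimension, and the random inputs are all assumptions made for this example.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def gated_fusion(features, gate_weights):
    """Fuse per-modality feature vectors with softmax gates.

    features:     dict mapping modality name -> (d,) feature vector
    gate_weights: dict mapping modality name -> (d,) gate vector
                  (hypothetical learned parameters; random or zero here)
    Returns the fused (d,) vector and the gate value for each modality.
    """
    names = sorted(features)
    stacked = np.stack([features[n] for n in names])            # (m, d)
    # One scalar score per modality, computed from its own features.
    scores = np.array([features[n] @ gate_weights[n] for n in names])
    gates = softmax(scores)                                      # sums to 1
    fused = (gates[:, None] * stacked).sum(axis=0)               # (d,)
    return fused, dict(zip(names, gates))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8  # assumed feature size, chosen only for the demo
    modalities = ["eeg", "fnirs", "ppg", "motion"]
    feats = {m: rng.normal(size=dim) for m in modalities}
    weights = {m: rng.normal(size=dim) for m in modalities}
    fused, gates = gated_fusion(feats, weights)
    print("fused shape:", fused.shape)
    print("gates:", {m: round(float(g), 3) for m, g in gates.items()})
```

With all gate vectors set to zero, every modality receives the same gate and the fused vector reduces to a plain average of the inputs, which is a convenient sanity check for this kind of layer.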

Code & Models

Repositories

LanceZPF/loongx (PyTorch)

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Multimodal Machine Learning Applications · Face Recognition and Perception

Methods: Contrastive Learning · ALIGN · Diffusion