Enhancing Linguistic Generalization of VLA: Fine-Tuning OpenVLA via Synthetic Instruction Augmentation

Dongik Shin

arXiv:2603.16044·cs.AI·March 18, 2026

TL;DR

This paper improves the linguistic generalization of the OpenVLA model for embodied AI by using synthetic instruction augmentation and fine-tuning with LoRA, leading to more robust understanding of diverse commands in new environments.

Contribution

It introduces a parameter-efficient fine-tuning method using synthetic instructions generated by an LLM to enhance OpenVLA's linguistic generalization capabilities.
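The augmentation step can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's pipeline: `Episode`, `template_paraphraser`, and `augment` are hypothetical names, and the template function stands in for the LLM so that the example runs offline. Each trajectory keeps its original instruction and gains several reworded ones that all point at the same episode.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Episode:
    """Hypothetical record: one trajectory id plus its original instruction."""
    episode_id: str
    instruction: str


def template_paraphraser(instruction: str) -> List[str]:
    """Stand-in for an LLM call (assumption): returns fixed rewordings."""
    core = instruction.strip().rstrip(".")
    if not core:
        return []
    lowered = core[0].lower() + core[1:]
    return [
        f"Please {lowered}.",
        f"Could you {lowered}?",
        f"Your task is to {lowered}.",
    ]


def augment(
    episodes: List[Episode],
    paraphrase: Callable[[str], List[str]],
) -> List[Tuple[str, str]]:
    """Pair every episode with its original and paraphrased instructions."""
    pairs: List[Tuple[str, str]] = []
    for ep in episodes:
        seen = set()
        for text in [ep.instruction, *paraphrase(ep.instruction)]:
            cleaned = text.strip()
            key = cleaned.lower()
            # Skip empty strings and case-insensitive duplicates.
            if key and key not in seen:
                seen.add(key)
                pairs.append((ep.episode_id, cleaned))
    return pairs


episodes = [
    Episode("ep0", "Put the spoon in the pot"),
    Episode("ep1", "Open the drawer"),
]
pairs = augment(episodes, template_paraphraser)
for episode_id, text in pairs:
    print(episode_id, text)
```

Swapping `template_paraphraser` for a function that queries an LLM would leave `augment` unchanged, since it only relies on the callable returning a list of strings.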

Findings

01 LoRA fine-tuning improves model robustness

02 Synthetic instruction augmentation enriches linguistic diversity

03 Enhanced model performance in unseen environments

Abstract

Generalization remains a core challenge in embodied AI, as robots must adapt to diverse environments. While OpenVLA represents the state of the art (SOTA) in Vision-Language-Action (VLA) models by leveraging large-scale pre-training, its zero-shot performance can be limited in completely new environments. This paper proposes a parameter-efficient fine-tuning strategy that enhances the linguistic generalization of OpenVLA by synthesizing a general instruction set for the Bridge Dataset V2. A Large Language Model (LLM) generates a rich variety of semantically equivalent but structurally diverse commands for existing trajectories. Low-Rank Adaptation (LoRA) is then used to fine-tune OpenVLA on the augmented instruction-trajectory pairs, allowing the model to bridge the gap between complex natural language intent and robotic actions. Results demonstrate that the…
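For readers unfamiliar with LoRA, the following NumPy sketch shows the core idea: the pretrained weight stays frozen and only a low-rank update is trained. It is a minimal illustration, not the paper's configuration. The class name `LoRALinear`, the layer sizes, the rank, and the scaling value `alpha` are all assumptions chosen for the example; the excerpt above does not state which values or which OpenVLA modules were used.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (illustrative)."""

    def __init__(self, weight: np.ndarray, rank: int = 4,
                 alpha: float = 8.0, seed: int = 0) -> None:
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # pretrained weight, kept frozen
        # A starts small and random, B starts at zero, so the adapted layer
        # initially reproduces the frozen layer exactly.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))
        self.scale = alpha / rank

    def forward(self, x: np.ndarray) -> np.ndarray:
        # y = x W^T + scale * x A^T B^T, i.e. W is replaced by W + scale * B A.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self) -> int:
        # Only A and B would receive gradient updates during fine-tuning.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
weight = rng.normal(size=(64, 128))  # assumed layer shape (out_dim, in_dim)
layer = LoRALinear(weight, rank=4, alpha=8.0)
x = rng.normal(size=(2, 128))
y = layer.forward(x)

print(y.shape)
print(layer.trainable_params(), "trainable vs", weight.size, "frozen")
```

With these assumed sizes the adapter holds 768 trainable values next to 8192 frozen ones, which is the sense in which the fine-tuning is parameter-efficient. In practice an adapter library would typically be used rather than a hand-written layer.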

Taxonomy

Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Domain Adaptation and Few-Shot Learning