# Toward Realistic Autonomous Driving Dataset Augmentation: A Real–Virtual Fusion Approach with Inconsistency Mitigation

**Authors:** Sukwoo Jung, Myeongseop Kim, Jean Oh, Jonghwa Kim, Kyung-Taek Lee

PMC · DOI: 10.3390/s26030987 · Sensors (Basel, Switzerland) · 2026-02-03

## TL;DR

This paper introduces a method for building realistic autonomous driving datasets by injecting virtual objects into real driving images, reducing the cost and risk of collecting rare or hazardous scenarios.

## Contribution

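A real–virtual fusion framework that synchronizes real K-City driving data with a Morai Sim digital twin, injects virtual objects into the real image backgrounds, and applies inconsistency mitigation (such as illumination matching) to improve dataset realism and model generalization.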

## Key findings

- The authors report that the real–virtual fusion approach significantly narrows the reality gap in autonomous driving datasets.
- Incorporating virtual objects with illumination matching enhances visual consistency in augmented images.
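
To make the idea of illumination matching concrete, here is a minimal sketch that rescales a virtual object's pixels so their luminance mean and spread match a patch of the real background. This is an assumed, simplified statistic-transfer scheme, not the paper's actual method; the function names `luminance` and `match_illumination` and the pixel format (RGB tuples in the 0 to 255 range) are hypothetical.

```python
from statistics import mean, pstdev


def luminance(pixel):
    """Rec. 709 luma of an (r, g, b) pixel."""
    r, g, b = pixel
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def match_illumination(object_pixels, background_pixels):
    """Rescale object pixels so their luminance mean and standard
    deviation match those of the background patch (assumed scheme)."""
    obj_l = [luminance(p) for p in object_pixels]
    bg_l = [luminance(p) for p in background_pixels]
    obj_mean, obj_std = mean(obj_l), pstdev(obj_l)
    bg_mean, bg_std = mean(bg_l), pstdev(bg_l)

    # Contrast gain; fall back to 1.0 for a flat (zero-variance) object.
    gain = bg_std / obj_std if obj_std > 1e-6 else 1.0

    matched = []
    for (r, g, b), lum in zip(object_pixels, obj_l):
        # Target luminance after aligning mean and spread.
        target = (lum - obj_mean) * gain + bg_mean
        # Scale all channels equally so hue is preserved.
        scale = target / lum if lum > 1e-6 else 1.0
        matched.append(tuple(min(255.0, max(0.0, c * scale)) for c in (r, g, b)))
    return matched


# A bright rendered object placed against a dim real background patch.
rendered = [(200, 200, 200), (100, 100, 100)]
background = [(50, 50, 50), (30, 30, 30)]
adjusted = match_illumination(rendered, background)
```

Scaling all three channels by the same factor keeps the object's hue while changing its brightness; a fuller treatment would also consider color temperature and shadows, which this sketch ignores.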

## Abstract

Autonomous driving systems rely on vast and diverse datasets for robust object recognition. However, acquiring real-world data, especially for rare and hazardous scenarios, is prohibitively expensive and risky. While purely synthetic data offers flexibility, it often suffers from a significant reality gap due to discrepancies in visual fidelity and physics. To address these challenges, this paper proposes a novel real–virtual fusion framework for efficiently generating highly realistic augmented image datasets for autonomous driving. Our methodology leverages real-world driving data from South Korea’s K-City, synchronizing it with a digital twin environment in Morai Sim (v24.R2) through a robust look-up table and fine-tuned localization approach. We then seamlessly inject diverse virtual objects (e.g., pedestrians, vehicles, traffic lights) into real image backgrounds. A critical contribution is our focus on inconsistency mitigation, employing advanced techniques such as illumination matching during virtual object injection to minimize visual discrepancies. We evaluate the proposed approach through experiments. Our results show that this real–virtual fusion strategy significantly bridges the reality gap, providing a cost-effective and safe solution for enriching autonomous driving datasets and improving the generalization capabilities of perception models.
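
The abstract does not detail how the look-up table links real poses to the digital twin, so the following is only an illustrative sketch under stated assumptions: each table entry pairs a real-world position with a simulator pose, and a query returns the pose of the nearest entry within a distance threshold. `PoseEntry`, `nearest_sim_pose`, the coordinate frames, and the 2 m threshold are all hypothetical and do not reflect the Morai Sim API or the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PoseEntry:
    real_xy: tuple   # (x, y) in a real-world map frame, metres (assumed)
    sim_pose: tuple  # (x, y, yaw_deg) in the simulator frame (assumed)


def nearest_sim_pose(table, query_xy, max_dist=2.0):
    """Return the simulator pose whose real-world position is closest
    to query_xy, or None if nothing lies within max_dist metres."""
    best = min(table, key=lambda e: math.dist(e.real_xy, query_xy), default=None)
    if best is None or math.dist(best.real_xy, query_xy) > max_dist:
        return None
    return best.sim_pose


# Hypothetical table with two calibrated correspondences.
table = [
    PoseEntry(real_xy=(0.0, 0.0), sim_pose=(100.0, 200.0, 90.0)),
    PoseEntry(real_xy=(10.0, 0.0), sim_pose=(110.0, 200.0, 90.0)),
]
pose = nearest_sim_pose(table, (0.4, 0.3))
```

A coarse lookup like this would only give an initial pose; the abstract indicates that a fine-tuned localization step follows, which is not modeled here.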

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12899562/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12899562/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC12899562/full.md

---
Source: https://tomesphere.com/paper/PMC12899562