# Impact Analysis of the Market Penetration Rate of Connected Vehicles and the Failure Rate of Roadside Equipment on Data Accuracy

**Authors:** Fengping Zhan

PMC · DOI: 10.3390/s26020686 · Sensors (Basel, Switzerland) · 2026-01-20

## TL;DR

This paper studies how the market penetration rate (MPR) of connected vehicles (CVs) and the failure rate of roadside equipment (RSE) affect traffic data accuracy, and shows that multi-source data fusion improves accuracy more than missing-data imputation.

## Contribution

Proposes an RSE deployment method solved by the Simulated Annealing Genetic Algorithm (SAGA) and compares data fusion with missing-data imputation for traffic data accuracy.
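
As a rough illustration of how a simulated-annealing/genetic hybrid can pick sensor locations, the sketch below evolves a population of candidate site sets and accepts offspring with a Metropolis rule under a cooling temperature. It is not the paper's formulation: the function `saga_select`, its parameters, the crossover and mutation operators, and the toy `site_value` objective are all assumptions made for this example.

```python
import math
import random


def saga_select(n_sites, k, fitness, generations=200, pop_size=30,
                t0=1.0, cooling=0.97, seed=0):
    """Pick k of n_sites candidate locations, maximizing fitness(frozenset).

    Hypothetical hybrid: genetic crossover/mutation produce children, and a
    simulated-annealing (Metropolis) test decides whether a child replaces
    its parent.
    """
    rng = random.Random(seed)
    sites = list(range(n_sites))
    pop = [frozenset(rng.sample(sites, k)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    temp = t0
    for _ in range(generations):
        next_pop = []
        for parent in pop:
            mate = rng.choice(pop)
            # Crossover: draw k sites from the union of both parents.
            child = set(rng.sample(sorted(parent | mate), k))
            # Mutation: swap one chosen site for an unchosen one.
            if rng.random() < 0.3:
                unused = [s for s in sites if s not in child]
                if unused:
                    child.remove(rng.choice(sorted(child)))
                    child.add(rng.choice(unused))
            child = frozenset(child)
            # Metropolis rule: always keep improvements; keep a worse child
            # with a probability that shrinks as the temperature drops.
            delta = fitness(child) - fitness(parent)
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                next_pop.append(child)
            else:
                next_pop.append(parent)
        pop = next_pop
        best = max(pop + [best], key=fitness)
        temp = max(temp * cooling, 1e-9)
    return best


# Toy objective (assumed): each candidate segment has an integer "value",
# and a deployment is scored by the total value of its chosen segments.
site_value = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]


def toy_fitness(chosen):
    return sum(site_value[i] for i in chosen)


chosen_sites = saga_select(len(site_value), 3, toy_fitness)
print(sorted(chosen_sites), toy_fitness(chosen_sites))
```

A real deployment objective would score data accuracy on the road network rather than a fixed per-site value; the toy objective only keeps the example self-contained.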

## Key findings

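- SAGA outperforms the SA algorithm and GA, which in turn outperform the uniform and hotspot methods, in optimizing RSE locations for data accuracy.
- Multi-source data fusion improves data accuracy much more than missing-data imputation; fusion is recommended over repairing missing data when the MPR exceeds 15% or the failure rate exceeds 40%.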
- Fused data is less affected by sensor failure rates compared to single-source data.

## Abstract

What are the main findings?

The optimal deployment method solved by the Simulated Annealing Genetic Algorithm (SAGA) outperforms the SA algorithm and GA, which are superior to the uniform method and the hotspot method in optimizing RSE locations for improving data accuracy.

The accuracy of single-source data improves as the CV MPR increases but decreases as the sensor failure rate increases. Fused data is less affected by failure rates. Multi-source data fusion is much more effective in improving data accuracy than missing data imputation. When the MPR is higher than 15% or the failure rate exceeds 40%, data fusion is recommended over repairing missing data.

What are the implications of the main findings?

The study provides a method that is better suited to equipment optimization at the road segment level and that better balances spatial fairness in the selection of sensor locations.

The findings shed light on the trade-offs and benefits associated with improving RSE deployment and promoting CV development.

Data quality, involving the accuracy, completeness and reliability of data, is of great significance for the operation and management of road traffic. Two significant factors that affect data accuracy, the market penetration rate (MPR) of CVs and the failure rate of roadside equipment (RSE), were considered in heterogeneous traffic flow comprising human-driven vehicles and CVs. An optimal deployment method solved by SAGA was proposed to optimize the locations of RSE. A rigid nearest neighbor (RNN) algorithm and a soft nearest neighbor (SNN) algorithm were applied to handle the missing data caused by sensor failure. Additionally, a back-propagation neural network (BPNN) algorithm was adopted to fuse RSE data and CV data. Case analysis results show that the proposed optimal deployment method is superior to the uniform and the hotspot methods. Data accuracy can reach 95% and 98% when the MPR is 15% and 60%, respectively. Accuracy decreases as the sensor failure rate increases for single-source data, but not for the fused data. The SNN algorithm performs better than the RNN algorithm in repairing single-source missing data. However, multi-source data fusion, especially with high-precision data, is much more effective in improving data accuracy than missing data imputation.
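
The abstract does not define the RNN and SNN algorithms, so the sketch below only illustrates one plausible reading of the contrast between a "rigid" and a "soft" nearest-neighbor repair. Here "rigid" copies the reading of the single closest working sensor, and "soft" takes an inverse-distance-weighted mean of the closest few. The function names, the one-dimensional sensor positions, and the sample speeds are assumptions for this example, not the paper's definitions or data.

```python
def _working_neighbors(positions, readings, target):
    """Return (distance, reading) pairs for working sensors, closest first."""
    pairs = [
        (abs(pos - positions[target]), value)
        for i, (pos, value) in enumerate(zip(positions, readings))
        if i != target and value is not None
    ]
    if not pairs:
        raise ValueError("no working sensor to borrow from")
    return sorted(pairs, key=lambda pair: pair[0])


def impute_rigid(positions, readings, target):
    """Assumed 'rigid' repair: copy the single closest working reading."""
    return _working_neighbors(positions, readings, target)[0][1]


def impute_soft(positions, readings, target, k=2):
    """Assumed 'soft' repair: inverse-distance-weighted mean of k neighbors."""
    nearest = _working_neighbors(positions, readings, target)[:k]
    # Guard against a zero distance if two sensors share a position.
    weights = [1.0 / max(dist, 1e-9) for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical corridor: sensor positions in metres, mean speeds in km/h.
# The sensor at index 1 has failed, so its reading is None.
positions = [0.0, 120.0, 200.0, 300.0]
speeds = [40.0, None, 50.0, 60.0]

rigid_value = impute_rigid(positions, speeds, target=1)
soft_value = impute_soft(positions, speeds, target=1, k=2)
print(rigid_value, soft_value)
```

In this toy case the rigid repair returns the speed at the sensor 80 m away, while the soft repair blends it with the sensor 120 m away, weighting the closer one more heavily.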

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12846086/full.md

## Figures

26 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12846086/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC12846086/full.md

---
Source: https://tomesphere.com/paper/PMC12846086