Robust Localization and Tracking of VRUs with Radar and Ultra-Wideband Sensors for Traffic Safety
Mouhamed Aghiad Raslan, Martin Schmidhammer, Ibrahim Rashdan, Fabian de Ponte Müller, Tobias Uhlich, Andreas Becker

TL;DR
This paper introduces a system that combines radar and UWB sensors to accurately track vulnerable road users in urban areas, even in poor visibility or blind spots.
Contribution
The novel integration of radar and UWB sensors with a hybrid Kalman filter improves VRU tracking in complex urban environments.
Findings
Sensor fusion of radar and UWB reduces tracking uncertainties in urban traffic scenarios.
The system performs reliably in adverse weather and occluded areas where optical sensors fail.
The hybrid Kalman filter approach enables continuous and accurate tracking of VRUs and vehicles.
Abstract
The paper presents a novel approach to enhancing Vulnerable Road User (VRU) protection by integrating radar sensors and a widespread network of Ultra-Wideband (UWB) nodes through sensor fusion, in order to detect and track VRUs in urban environments. The experimental results demonstrate that the fusion of radar and UWB measurements reduces tracking uncertainties and improves the accuracy of VRU tracking, particularly in areas with blind spots or occlusions.
German Federal Ministry for Digital and Transport (BMDV)
Taxonomy
Topics: Indoor and Outdoor Localization Technologies · Traffic Prediction and Management Techniques · Target Tracking and Data Fusion in Sensor Networks
1. Introduction
Vulnerable Road Users (VRUs), including pedestrians, cyclists, and motorcyclists, comprise a significant portion of traffic fatalities worldwide. In 2021, VRUs accounted for nearly half of all road fatalities within the European Union, underscoring the critical need for targeted interventions to increase their safety at intersections and other high-risk areas [1]. As urbanization and traffic density increase, the risk to VRUs, particularly at intersections, where vehicle paths frequently cross with those of pedestrians and cyclists, has become a pressing concern [2]. While essential, traditional road safety measures like traffic signals and road markings often fall short in complex scenarios, such as when vehicles perform right turning maneuvers or when VRUs enter a driver’s blind spot.
While Vehicle-to-VRU (V2VRU) communication can provide 360° awareness, it requires VRUs to carry active communication devices [3,4]. To avoid this dependency, modern vehicles are increasingly equipped with advanced onboard perception systems utilizing LiDAR, cameras, and radar. However, vehicle-centric systems are fundamentally limited by line-of-sight (LoS) occlusions as VRUs may be obscured by parked cars or buildings. Alternatively, infrastructure-based perception systems offer a more comprehensive approach to VRU protection by leveraging fixed sensors positioned at intersections and other high-risk areas [5]. These systems typically employ cameras mounted on infrastructure, providing a broader view of the environment from multiple angles. The data collected is processed to detect VRUs and potential hazards, with critical information transmitted to nearby vehicles via Cooperative Intelligent Transport Systems (C-ITSs) in the form of Cooperative Perception Messages (CPMs). While infrastructure-side cameras can mitigate the limitations of vehicle-based systems, they face challenges such as sensitivity to lighting and weather conditions, as well as privacy concerns. The need to comply with the General Data Protection Regulation (GDPR) and other privacy regulations often necessitates the anonymization of video streams and other data, which can complicate the implementation and operation of these systems.
To overcome both optical and privacy limitations, recent work has highlighted the role of Radio Frequency (RF)-based roadside sensing systems for VRU protection [6], including calibration, fusion strategies, synchronization, and trajectory-level safety assessment at intersections. RF technologies offer distinct advantages in environmental robustness and inherent data privacy. Radar sensors, for instance, are immune to poor visibility and adverse weather [7,8]. The VIDETEC-2 project took the initiative in RF-based VRU protection [9] by installing static radars at fixed infrastructure positions, which enables continuous VRU monitoring even when obstacles occlude the vehicle’s view [5]. Furthermore, radar’s ability to capture detailed kinematic features enables advanced VRU tracking and classification using machine learning [10,11,12,13]. To maximize coverage, roadside radars are typically installed at elevated positions, which inherently creates near-field blind spots directly beneath the mounting location.
In parallel, Ultra-Wideband (UWB) technology, defined by IEEE 802.15.4, has emerged as a powerful tool to improve robustness in occluded or non-line-of-sight situations [14,15]. By transmitting ultra-short pulses over a wide frequency band, UWB achieves sub-decimeter accuracy for time-of-flight ranging and provides a detailed environmental fingerprint [16,17,18]. For sensing applications, UWB can be deployed as a multi-static radar network to detect moving reflections [19], or device-free localization (DFL) and multipath-enhanced DFL (MDFL) can be utilized to detect intersecting road users via link attenuation [17,20,21]. Given the previously demonstrated sub-meter indoor user localization [22], it is evident that UWB’s dual capability for communication and precise sensing makes it an ideal candidate for Joint Communication and Sensing (JCS). Note, however, that relying solely on UWB for environmental sensing necessitates a densely deployed network of nodes to maintain reliable localization accuracy across the observation area.
Building on the complementary strengths of radar and UWB, this paper presents a novel RF-based sensor fusion framework designed for the continuous tracking of both vehicles and VRUs at urban intersections, while overcoming the limitations of each individual sensor. The proposed system leverages a hybrid Kalman filter architecture, combining an Extended Kalman Filter (EKF) to process radar detections with an Unscented Kalman Filter (UKF) to integrate information from power changes measured in a widely distributed UWB network. Fusing these modalities provides redundant measurements that help reduce positional uncertainty and compensate for near-field radar blind spots, supporting continuous trajectory estimation in occluded traffic scenarios.
In order to evaluate the localization and tracking performance of the proposed radar–UWB fusion algorithm, the sensing results are compared against ground-truth trajectories from an RTK-GNSS high-precision localization system. The experimental evaluation focuses on a single GNSS-equipped VRU per run. Restricting each trial to a single target isolates the behavior of the fusion pipeline, particularly its handling of blind spots, and provides a clear error metric without the additional complexity of multi-target data association. Please note that although the evaluation considers a single trajectory in isolation to ensure precise error measurement, the implemented fusion architecture inherently supports the simultaneous detection and tracking of multiple road users.
The remainder of this paper is organized as follows. Section 2 details the sensor setup, data acquisition, and the proposed hybrid Kalman filter fusion methodology. Section 3 describes the experimental setup and the practical challenges encountered during field testing. Section 4 presents the results and evaluates the performance of the tracking system, demonstrating its viability as a reliable safety mechanism for complex traffic scenarios. Finally, Section 5 concludes the paper.
2. RF-Based Vehicle and VRU Tracking System
This work addresses the detection and safety of VRUs at signalized intersections, specifically focusing on scenarios where right-turning vehicles and crossing VRUs share a concurrent green-signal phase. Due to the lateral offset of the pedestrian/cyclist lane from the main carriageway, the driver’s side-mirror field of view is often occluded (see Figure 1). To mitigate the occluded field of view, we propose a multimodal sensor fusion framework that integrates detections from infrastructure-based radar units with measurements from a distributed network of UWB nodes. These fused data streams are processed by a joint-state tracking algorithm that simultaneously estimates the trajectories of both vehicles and VRUs. When the tracker identifies a predicted trajectory intersection within a predefined safety margin, a targeted warning can be issued to the vehicle to prevent imminent conflict. In the following, we first describe the processing of the individual sensors, i.e., radar and UWB, and then we outline the sensor-fusion algorithm used for vehicle and VRU tracking.
2.1. Radar Processing
Measurements obtained from the radar undergo several pre-processing steps before entering the tracking algorithm. This subsection explains these steps and the resulting form and features of the radar measurements.
2.1.1. Radar’s Signal Model and Detections
The Frequency Modulated Continuous Wave (FMCW) radar modules from IMST GmbH operate in the 77/79 GHz frequency bands [23], leveraging sophisticated signal processing to detect and track objects with high precision.
The transmitted chirp signal is generated through frequency modulation and is represented as follows:

$$s_t(t) = A_t \cos\!\left(2\pi f_c t + \pi \frac{B}{T_c} t^2\right), \quad 0 \le t \le T_c, \tag{1}$$

where $A_t$ denotes the real-valued amplitude of the transmitted signal, $f_c$ denotes the starting frequency, $B$ is the bandwidth, and $T_c$ represents the duration of the chirp. The received signal from a target reflection is a time-delayed and phase-shifted version of the transmitted signal:

$$s_r(t) = A_r \cos\!\left(2\pi f_c (t-\tau) + \pi \frac{B}{T_c} (t-\tau)^2\right), \tag{2}$$

where $A_r$ denotes the real-valued amplitude of the received signal and $\tau$ is the round-trip delay. Mixing the received and transmitted signals and low-pass filtering the product yields the intermediate frequency signal $s_{\mathrm{IF}}(t)$:

$$s_{\mathrm{IF}}(t) = \frac{A_t A_r}{2} \cos\!\big(\phi_t(t) - \phi_t(t-\tau)\big), \tag{3}$$

where $\phi_t(t) = 2\pi f_c t + \pi (B/T_c)\, t^2$ represents the phase of the transmitted signal. The difference in frequency between the transmitted and the received signals is the beat frequency $f_b$. It serves as a vital factor in computing the range $d$ of the target, and it can be mathematically expressed as

$$f_b = S\tau = \frac{2 S d}{c}, \tag{4}$$

where $S = B/T_c$ is the slope of the frequency change over the chirp duration and $c$ is the speed of light. We can obtain the beat frequency by performing a Fast Fourier Transform (FFT) on the signal $s_{\mathrm{IF}}(t)$. Hence, the range $d$ to the target can then be calculated:

$$d = \frac{c\, f_b}{2S}. \tag{5}$$

In order to obtain the velocity measurement, the phase shift generated by the small displacement of the moving target should be calculated using

$$\Delta\phi = \frac{4\pi\, \Delta d}{\lambda}, \tag{6}$$

where $\Delta d$ represents the change in the target’s position, and $\lambda$ denotes the wavelength of the signal.

Assuming the target is moving with a radial speed $v$ within the chirp duration $T_c$, the change in its position can be expressed as

$$\Delta d = v\, T_c. \tag{7}$$

Substituting (7) in Equation (6) yields

$$\Delta\phi = \frac{4\pi\, v\, T_c}{\lambda}. \tag{8}$$

The maximum measurable velocity is determined by the condition that the phase shift remains within the range $|\Delta\phi| < \pi$ for unambiguous velocity measurement. Thus, the maximum measurable velocity can be calculated as

$$v_{\max} = \frac{\lambda}{4\, T_c}. \tag{9}$$
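The range and velocity relations above can be checked numerically with a few lines of Python. This is only a minimal sketch: the function names and the chirp parameters (1 GHz bandwidth, 50 µs chirp, 2 MHz beat frequency) are illustrative assumptions and are not the configuration of the radar used in the experiment.

```python
C = 299_792_458.0  # speed of light in m/s


def chirp_slope(bandwidth_hz, chirp_duration_s):
    """Slope S = B / T_c of the linear frequency ramp."""
    return bandwidth_hz / chirp_duration_s


def range_from_beat(beat_hz, slope_hz_per_s):
    """Target range d = c * f_b / (2 S)."""
    return C * beat_hz / (2.0 * slope_hz_per_s)


def max_unambiguous_velocity(carrier_hz, chirp_duration_s):
    """v_max = lambda / (4 T_c), from the condition |delta_phi| < pi."""
    wavelength = C / carrier_hz
    return wavelength / (4.0 * chirp_duration_s)


# Hypothetical chirp parameters, chosen only for illustration.
slope = chirp_slope(bandwidth_hz=1.0e9, chirp_duration_s=50e-6)
d = range_from_beat(beat_hz=2.0e6, slope_hz_per_s=slope)
v_max = max_unambiguous_velocity(carrier_hz=77e9, chirp_duration_s=50e-6)
print(f"range = {d:.2f} m, v_max = {v_max:.2f} m/s")
```

With these assumed values, a 2 MHz beat frequency corresponds to a target at roughly 15 m, and shorter chirps raise the unambiguous velocity limit.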
According to the manufacturer’s manual [24], the radar module’s firmware performs a series of signal processing steps. First, the Analog/Digital Converter (ADC) samples and digitizes the received signals, which are then stored in the processor’s memory. Range and Doppler FFTs are applied along the time domain and the chirp sequence, respectively, and the data from all receiver channels are combined for noise suppression to create a Range-Doppler Map (RDM). Potential targets are identified using Constant False Alarm Rate (CFAR) detection, and additional parameters such as magnitude, elevation, and azimuth angles are calculated.
During the experiment, the radar was configured for continuous measurement with an interval of 60 ms, allowing for real-time data acquisition and processing. The radar processing mode was set to obtain detections, with a range resolution of 0.16 m and a velocity resolution of 0.665 m/s.
2.1.2. Linearization
In radar signal processing, linearization is a crucial step for transforming the raw radar detections from their native polar coordinate system (range, Doppler velocity, azimuth, and elevation) into a Cartesian coordinate system. This transformation facilitates the application of tracking algorithms, such as the Kalman filter, which operate more effectively in a linear, Cartesian space [25]. The necessity of linearization arises from the nature of the Kalman filter, which assumes linear state variable models [26]. The Kalman filter predicts the future state of an object based on its current state and updates this prediction using new measurements. In a Cartesian coordinate system, the motion dynamics of an object can be described using linear equations. Conversely, measurements in a polar coordinate system introduce nonlinearities that complicate the tracking process [27].
Nonlinear Measurement Model
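The radar system provides measurements in terms of range ($r$), azimuth ($\theta$), elevation ($\varphi$), and range rate ($\dot{r}$). The nonlinear measurement model can be expressed as

$$\mathbf{z}_k = h(\mathbf{x}_k) + \mathbf{w}_k, \tag{10}$$

where $\mathbf{z}_k$ is the measurement vector, $h(\cdot)$ is the nonlinear function relating the state vector $\mathbf{x}_k$ to the measurements and $\mathbf{w}_k$ is the measurement noise vector, assumed to be Gaussian with zero mean and covariance matrix $\mathbf{R}$. The state vector typically includes the position and velocity components in Cartesian coordinates:

$$\mathbf{x} = [x,\; y,\; z,\; \dot{x},\; \dot{y},\; \dot{z}]^{\top}. \tag{11}$$

The nonlinear measurement function is defined as

$$h(\mathbf{x}) = \begin{bmatrix} \theta \\ \varphi \\ r \\ \dot{r} \end{bmatrix} = \begin{bmatrix} \operatorname{atan2}(y, x) \\ \operatorname{atan2}\!\big(z, \sqrt{x^2 + y^2}\big) \\ \sqrt{x^2 + y^2 + z^2} \\ \dfrac{x\dot{x} + y\dot{y} + z\dot{z}}{\sqrt{x^2 + y^2 + z^2}} \end{bmatrix}. \tag{12}$$

The measurement noise $\mathbf{w}_k$ is Gaussian distributed with measurement noise covariance $\mathbf{R}$:

$$\mathbf{w}_k \sim \mathcal{N}(\mathbf{0}, \mathbf{R}). \tag{13}$$

The measurement noise covariance matrix is given by

$$\mathbf{R} = \operatorname{diag}\!\big(\sigma_{\theta}^2,\; \sigma_{\varphi}^2,\; \sigma_{r}^2,\; \sigma_{\dot{r}}^2\big), \tag{14}$$

where $\sigma_{\theta}^2$ is the variance of the azimuth measurement noise, $\sigma_{\varphi}^2$ is the variance of the elevation measurement noise, $\sigma_{r}^2$ is the variance of the range measurement noise and $\sigma_{\dot{r}}^2$ is the variance of the range rate measurement noise.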
Linearization of the Measurement Model
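To apply the Extended Kalman Filter (EKF), we need to linearize the nonlinear measurement function $h(\cdot)$ around the current state estimate $\hat{\mathbf{x}}_{k|k-1}$. This is done by computing the Jacobian matrix of $h(\cdot)$ with respect to the state vector [25].

The Jacobian matrix is defined as

$$\mathbf{H}_k = \left. \frac{\partial h(\mathbf{x})}{\partial \mathbf{x}} \right|_{\mathbf{x} = \hat{\mathbf{x}}_{k|k-1}}, \tag{15}$$

where $\hat{\mathbf{x}}_{k|k-1}$ is the current state estimate at the time instance $k$ [28].

Given the measurement function in (12), we compute the partial derivatives of each component of $h(\mathbf{x})$ with respect to each state variable in (11) [29].

The linearized measurement model is then

$$\mathbf{z}_k \approx h(\hat{\mathbf{x}}_{k|k-1}) + \mathbf{H}_k \big(\mathbf{x}_k - \hat{\mathbf{x}}_{k|k-1}\big) + \mathbf{w}_k. \tag{16}$$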
Linearizing Position and Velocity Components
The process of linearizing radar data into position components involves converting polar coordinates ($r, \theta, \varphi$) into Cartesian coordinates ($x, y, z$):

$$x = r \cos\varphi \cos\theta, \qquad y = r \cos\varphi \sin\theta, \qquad z = r \sin\varphi. \tag{17}$$

In a general scenario, linearizing the velocity components involves complex calculations due to the dependency on the derivatives of the radar measurements. However, for our application, we utilize the knowledge of the environment to simplify the problem. The locations of the static radars, the path of the vehicles, and the targets’ heading are known and relatively constant, which makes it possible to estimate the true velocity of the target. Considering the targets are moving in a straight line on the x-axis towards the radar, as in the scene shown in Figure 1, the range rate is the projection of the target’s velocity onto the line of sight, so the estimated true velocity can be calculated directly from the range rate and the radar angle measurements [30,31] as follows:

$$\hat{v}_x = \frac{\dot{r}}{\cos\theta \cos\varphi}. \tag{18}$$
Please note that highly tangential motion relative to the radar line-of-sight, abrupt stop-go, or strong turning can affect the accuracy of the Doppler-to-Cartesian velocity mapping.
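The conversion of a single detection can be sketched as follows. This is a hedged illustration rather than the authors’ implementation: it assumes the radar sits at the origin, angles are in radians, and motion is purely along the x-axis. The `min_projection` guard is a hypothetical addition that rejects detections whose line of sight is almost perpendicular to the direction of travel, where the Doppler-to-Cartesian mapping becomes ill-conditioned.

```python
import math


def spherical_to_cartesian(r, azimuth, elevation):
    """Convert range/azimuth/elevation (radians) to x, y, z in the radar frame."""
    x = r * math.cos(elevation) * math.cos(azimuth)
    y = r * math.cos(elevation) * math.sin(azimuth)
    z = r * math.sin(elevation)
    return x, y, z


def velocity_along_x(range_rate, azimuth, elevation, min_projection=0.1):
    """Recover the speed along the x-axis from the radial speed.

    Assumes the target moves only along x, so that
    range_rate = v_x * cos(azimuth) * cos(elevation).
    Returns None when the projection factor is too small to invert reliably
    (min_projection is an assumed, illustrative threshold).
    """
    projection = math.cos(azimuth) * math.cos(elevation)
    if abs(projection) < min_projection:
        return None
    return range_rate / projection


# A target approaching at 8 m/s, seen 60 degrees off boresight, shows only
# -4 m/s of radial speed; the mapping recovers the full speed.
v_x = velocity_along_x(-4.0, math.radians(60.0), 0.0)
```

The guard makes the limitation noted above explicit: as the motion becomes tangential, the projection factor approaches zero and small range-rate errors are strongly amplified.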
2.1.3. Clustering
In this application, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was applied to radar data to cluster several radar detections reflected from a single target. The DBSCAN algorithm operates based on two primary parameters: the radius $\varepsilon$ and the minimum number of points minPts. The algorithm categorizes points into core points, border points, and noise based on these parameters. For a point $p$:
1. Epsilon neighborhood: the set of points within a distance $\varepsilon$ from $p$:

$$N_{\varepsilon}(p) = \{\, q \in D \mid \operatorname{dist}(p, q) \le \varepsilon \,\}, \tag{19}$$

where $D$ is the dataset and $\operatorname{dist}(p, q)$ is the distance between points $p$ and $q$.
2. Core point: a point $p$ is a core point if

$$|N_{\varepsilon}(p)| \ge \text{minPts}. \tag{20}$$
Based on these definitions, DBSCAN identifies clusters as maximal sets of density-connected points and labels points that do not belong to any cluster as noise [32,33]. However, targets whose Radar Cross-Section (RCS) is only large enough to produce a single detection can be treated as noise or outliers by DBSCAN. Hence, such outliers (possible detections) are still passed as input to the tracking algorithm by setting $\text{minPts} = 1$. The track initiator/deleter, in turn, handles such detections based on the results from the data association algorithm in case they are clutter.
Prior to applying the clustering algorithm, normalizing the data is essential because the features are represented on different axes, i.e., position axes and velocity axes, and all features should contribute equally to the distance calculations. This step is crucial because DBSCAN relies on distance measures to define neighborhoods and identify core points, and features with different scales can bias these calculations. We employed Min-Max scaling for normalization, which transforms each feature value $s$ to a value within the range [0, 1] using the formula

$$s' = \frac{s - s_{\min}}{s_{\max} - s_{\min}}, \tag{21}$$

where $s_{\min}$ and $s_{\max}$ are the minimum and maximum values of the feature, respectively. Normalizing the data helps mitigate the dominance of features with larger scales and facilitates the selection of an appropriate $\varepsilon$ value for the DBSCAN algorithm on the normalized data [34,35].
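The normalization and clustering steps can be sketched in plain Python. This is a compact illustrative re-implementation of Min-Max scaling and DBSCAN, not the code used in the paper. The detections (x, y, speed) and the radius `eps=0.1` are made-up values, since the radius used in the experiment is not reproduced here.

```python
import math


def min_max_scale(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    # Epsilon neighbourhoods; each point is its own neighbour.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # noise for now; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbours[j])
    return labels


# Hypothetical detections: (x in m, y in m, speed in m/s).
detections = [(10.0, 1.0, 4.0), (10.2, 1.1, 4.1), (10.1, 0.9, 3.9),
              (30.0, 5.0, 1.2), (20.0, 3.0, 8.0)]
scaled = min_max_scale(detections)
labels_strict = dbscan(scaled, eps=0.1, min_pts=2)    # [0, 0, 0, -1, -1]
labels_keep_all = dbscan(scaled, eps=0.1, min_pts=1)  # [0, 0, 0, 1, 2]
```

The second call shows the effect of lowering the minimum number of points to one: isolated detections are no longer discarded as noise but become single-point clusters that are passed on to the tracker.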
2.2. Ultra Wideband Measurements
Complementing the radar system described above, we consider a network of UWB nodes supporting the localization of VRUs. The transceiving UWB nodes are spatially distributed along the pedestrian/cyclist path, and the locations are precisely measured in advance. For each network link $l$ of the UWB system, i.e., the link between a transmitting node and a receiving node, we observe the received power $P_l$. Since we want to obtain information about the target’s location from the induced fading, we need to observe the changes in the received power in particular. To establish a reference power level, we therefore collect received power data during an initialization period and calculate the average, i.e., $\bar{P}_l$, as detailed in [21]. By subtracting the reference power level from the measured power, we can define the power changes in logarithmic domain as

$$z_l = P_l - \bar{P}_l \tag{22}$$

for all considered network links $l \in \{1, \dots, L\}$, which are stacked into the UWB measurement vector $\mathbf{z}^{\mathrm{u}} = [z_1, \dots, z_L]^{\top}$.
2.2.1. Measurement Model
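Given the measurement vector $\mathbf{z}^{\mathrm{u}}$ of the UWB system in (22), we can define the measurement model as

$$\mathbf{z}^{\mathrm{u}}_k = h^{\mathrm{u}}(\mathbf{x}_k) + \mathbf{w}^{\mathrm{u}}_k, \tag{23}$$

where $h^{\mathrm{u}}(\cdot)$ is the nonlinear function relating the state vector $\mathbf{x}_k$ to the measurements and $\mathbf{w}^{\mathrm{u}}_k$ is the measurement noise vector of the UWB system, assumed to be Gaussian with zero mean and covariance matrix

$$\mathbf{R}^{\mathrm{u}} = \operatorname{diag}\!\big(\sigma_1^2, \dots, \sigma_L^2\big), \tag{24}$$

where $\sigma_l^2$ is the variance of the $l$-th network link.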
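The nonlinear measurement function for an individual network link $l$ is defined as

$$h^{\mathrm{u}}_l(\mathbf{x}) = \phi_l \exp\!\left(-\frac{\delta_l(\mathbf{x})}{\kappa_l}\right), \tag{25}$$

with $\phi_l$ as the maximum modeled power change in dB and $\kappa_l$ as the spatial decay rate. The state-dependent excess path length is defined as

$$\delta_l(\mathbf{x}) = \lVert \mathbf{p}_{\mathrm{Tx},l} - \mathbf{p} \rVert + \lVert \mathbf{p}_{\mathrm{Rx},l} - \mathbf{p} \rVert - \lVert \mathbf{p}_{\mathrm{Tx},l} - \mathbf{p}_{\mathrm{Rx},l} \rVert, \tag{26}$$

where $\mathbf{p}_{\mathrm{Rx},l}$ and $\mathbf{p}_{\mathrm{Tx},l}$ refer to the known positions of the receiving and transmitting node, respectively, and $\mathbf{p}$ refers to the target position being part of the state vector $\mathbf{x}$.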
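To make the link model concrete, the sketch below evaluates the excess path length and an expected power change for one link. It assumes an exponential decay of the power change with excess path length, i.e., the maximum modeled change scaled by `exp(-delta / kappa)`. The node positions and the parameter values (`phi_db=-6.0`, `kappa_m=0.5`) are hypothetical and serve only as an illustration.

```python
import math


def excess_path_length(target, tx, rx):
    """Extra distance of the path tx -> target -> rx over the direct path tx -> rx."""
    return math.dist(tx, target) + math.dist(target, rx) - math.dist(tx, rx)


def expected_power_change(target, tx, rx, phi_db, kappa_m):
    """Modeled power change in dB, assuming exponential decay with excess path length."""
    return phi_db * math.exp(-excess_path_length(target, tx, rx) / kappa_m)


tx, rx = (0.0, 0.0), (4.0, 0.0)  # hypothetical node positions in metres
on_link = expected_power_change((2.0, 0.0), tx, rx, phi_db=-6.0, kappa_m=0.5)
off_link = expected_power_change((2.0, 1.5), tx, rx, phi_db=-6.0, kappa_m=0.5)
print(on_link, off_link)
```

A target standing on the direct path has zero excess path length and produces the full modeled attenuation, while a target 1.5 m to the side (excess path length of 1 m) produces only a small fraction of it. This locality is what makes the measurement informative about position.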
2.2.2. Link Selection
The purpose of the UWB network is to complement the radar system by providing additional location information from target induced fading measurements. As the network is widely distributed along the bicycle lane, it is essential to preselect the most relevant network links for the update process, improving the computational efficiency and the accuracy of the tracking system.
The pre-selection process is guided by the proximity of the network links to the target detections provided by the radar system or the tracking system, respectively. This proximity is quantified by the excess path length $\delta_l(\hat{\mathbf{x}}_k)$ of the current target state $\hat{\mathbf{x}}_k$ at time instance $k$. A link is selected for update if the excess path length is less than or equal to a threshold $\delta_{\mathrm{th}}$. That means, using (26), the selection can be expressed by the boolean operation

$$s_{l,k} = \big[\, \delta_l(\hat{\mathbf{x}}_k) \le \delta_{\mathrm{th}} \,\big], \tag{27}$$

where $s_{l,k}$ is set to 1 if the condition is true, which indicates that the target is expected to be in the proximity of the link, and 0 otherwise. In this work, the threshold $\delta_{\mathrm{th}}$ is chosen to account for the physical dimensions of the considered targets, i.e., pedestrians and cyclists. Finally, the overall selection vector is

$$\mathbf{s}_k = [s_{1,k}, \dots, s_{L,k}]^{\top}. \tag{28}$$

In order to select the relevant network links, the selection vector is applied to the measurement vector in (22) by

$$\tilde{\mathbf{z}}^{\mathrm{u}}_k = \mathbf{s}_k \circ \mathbf{z}^{\mathrm{u}}_k, \tag{29}$$

where ∘ denotes the element-wise product. Similarly, we can select the relevant elements of the measurement noise covariance matrix in (24) as

$$\tilde{\mathbf{R}}^{\mathrm{u}}_k = \mathbf{S}_k \mathbf{R}^{\mathrm{u}} \mathbf{S}_k^{\top}, \tag{30}$$

where $\mathbf{S}_k = \operatorname{diag}(\mathbf{s}_k)$ is the diagonal selection matrix constructed from the selection vector of (28).
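The link selection can be sketched as a simple gating step. The snippet is illustrative only: the link geometry, the threshold of 0.5 m, and the measurement values are assumptions, and the covariance is represented by its diagonal (one variance per link) for brevity.

```python
import math


def excess_path_length(target, tx, rx):
    """Extra distance of the path tx -> target -> rx over the direct path."""
    return math.dist(tx, target) + math.dist(target, rx) - math.dist(tx, rx)


def select_links(target, links, threshold_m):
    """Selection vector: 1 for links whose excess path length is within the threshold."""
    return [
        1 if excess_path_length(target, tx, rx) <= threshold_m else 0
        for tx, rx in links
    ]


def apply_selection(selection, power_changes_db, variances_db2):
    """Zero out measurements and variances of links that were not selected."""
    z = [p if s else 0.0 for s, p in zip(selection, power_changes_db)]
    r = [v if s else 0.0 for s, v in zip(selection, variances_db2)]
    return z, r


# Two hypothetical links: one close to the predicted target position, one 10 m away.
links = [((0.0, 0.0), (4.0, 0.0)), ((0.0, 10.0), (4.0, 10.0))]
selection = select_links((2.0, 0.5), links, threshold_m=0.5)
z_sel, r_sel = apply_selection(selection, [-5.0, -0.3], [1.0, 1.2])
```

In a practical implementation one would likely drop the unselected rows entirely instead of zeroing them, so that the innovation covariance in the subsequent update stays well-conditioned. That variant is a design choice outside what the text specifies.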
2.3. Sensor Fusion
After processing the measurements from the radar sensors, the class of the target, based on the single-frame target classification model developed in [12], triggers the tracking algorithm. A Kalman-Filter-based tracking algorithm is utilized to fuse the measurements obtained from radar and UWB sensors. However, due to the difference in measurement models between the two types of sensors, two types of updaters are integrated in this algorithm, an EKF updater and a UKF updater. The EKF is always in operation; however, the UKF updater is triggered only upon the availability of the UWB measurements as illustrated in Figure 2. The main tracking algorithm is a multi-target tracking system designed to accurately estimate the trajectories of multiple objects over time. The primary objective of this tracking algorithm is to manage the association between measurements (detections) and tracked objects (tracks), predict future states, and update the tracks based on new measurements. The EKF component follows standard radar tracking practice (prediction and update on radar detections). The contribution of this work is the fusion architecture that augments radar tracking with UWB measurements through an additional UKF update when UWB measurements are available, enabling continuous VRU tracking and reduced uncertainty during radar blind-spot and outage intervals.
In this application, the UWB network nodes are distributed along the sides of the VRU lane to measure the attenuations induced by a passing VRU. Since the network is focused on measuring the position of a passing VRU, the UKF updater associated with the measurements obtained from the network is triggered only when a VRU is present in the scene.
2.3.1. Extended Kalman Filter (EKF)
The tracking algorithm begins with the processed measurements, which are in a 6-dimensional state vector format as in (11). These detections are then processed by the Probabilistic Data Association (PDA) hypothesizer which generates multiple hypotheses regarding the association between detections and existing tracks.
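For a given track and $m_k$ detections, the probability that detection $i$ is associated with the track at time $k$ is given by

$$\beta_i(k) = \begin{cases} \dfrac{\mathcal{L}_i(k)}{1 - P_D P_G + \sum_{j=1}^{m_k} \mathcal{L}_j(k)}, & i = 1, \dots, m_k, \\[2ex] \dfrac{1 - P_D P_G}{1 - P_D P_G + \sum_{j=1}^{m_k} \mathcal{L}_j(k)}, & i = 0 \text{ (missed detection)}, \end{cases} \tag{31}$$

where $P_D$ is the detection probability, $P_G$ is the gate probability, $m_k$ is the number of detections at time $k$ and $\mathcal{L}_i(k)$ is the likelihood of the $i$-th detection at time $k$. The likelihood is calculated by

$$\mathcal{L}_i(k) = \frac{P_D \, \mathcal{N}\big(\mathbf{z}_i(k);\, \hat{\mathbf{z}}(k|k-1),\, \mathbf{S}(k)\big)}{\lambda_c}, \tag{32}$$

where $\lambda_c$ is the clutter density, $\mathcal{L}_i(k)$ is the likelihood ratio of the measurement originating from the tracked target rather than clutter, $\hat{\mathbf{z}}(k|k-1)$ is the predicted measurement, and $\mathbf{S}(k)$ is the innovation covariance matrix [36]. This formulation ensures that the association events are mutually exclusive and that their probabilities sum to one, maintaining the integrity of the tracking process by accounting for both actual detections and missed detection scenarios.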
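A minimal numerical sketch of these association weights is given below for two-dimensional measurements. It follows the standard probabilistic data association form and is not taken from the authors’ code. The detection probability, gate probability, clutter density, and the sample detections are arbitrary illustrative values.

```python
import math


def gaussian_2d(z, mean, cov):
    """Density of a 2-D Gaussian, using the closed-form inverse of a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))


def pda_weights(detections, predicted, innovation_cov, p_d, p_g, clutter_density):
    """Return (beta_0, [beta_1, ..., beta_m]); beta_0 is the missed-detection weight."""
    ratios = [
        p_d * gaussian_2d(z, predicted, innovation_cov) / clutter_density
        for z in detections
    ]
    norm = 1.0 - p_d * p_g + sum(ratios)
    return (1.0 - p_d * p_g) / norm, [r / norm for r in ratios]


# Hypothetical gate contents: one detection near the prediction, one farther away.
beta_0, betas = pda_weights(
    detections=[(1.0, 0.0), (3.0, 0.0)],
    predicted=(1.2, 0.1),
    innovation_cov=((0.5, 0.0), (0.0, 0.5)),
    p_d=0.9, p_g=0.99, clutter_density=0.01,
)
```

The detection closest to the predicted measurement receives the largest weight, and with an empty gate the entire probability mass falls on the missed-detection hypothesis.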
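The EKF predictor uses these hypotheses together with a constant velocity model to predict the future state of each track, which can be described as

$$\mathbf{x}_{k} = \mathbf{F}\, \mathbf{x}_{k-1} + \mathbf{q}_k, \tag{33}$$

where, for a state ordered as positions followed by velocities and a sampling interval $\Delta t$,

$$\mathbf{F} = \begin{bmatrix} \mathbf{I}_3 & \Delta t\, \mathbf{I}_3 \\ \mathbf{0}_3 & \mathbf{I}_3 \end{bmatrix}$$

and $\mathbf{q}_k$ is the process noise assumed to be Gaussian with zero mean and covariance $\mathbf{Q}$ [37].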
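The Joint Probabilistic Data Association (JPDA) calculates the joint probabilities of all possible associations between tracks and measurements. For each track $i$ and measurement $j$, the association probability is

$$\beta_{ij} = \frac{\sum_{\vartheta \in \Theta_{ij}} P(\vartheta)\, p(\mathbf{Z}_k \mid \vartheta)}{\sum_{\vartheta \in \Theta} P(\vartheta)\, p(\mathbf{Z}_k \mid \vartheta)},$$

where $\Theta$ is the set of feasible joint association hypotheses, $\Theta_{ij} \subseteq \Theta$ contains the hypotheses that assign measurement $j$ to track $i$, $P(\vartheta)$ is the prior probability of hypothesis $\vartheta$, and $p(\mathbf{Z}_k \mid \vartheta)$ is the likelihood of the measurements given the hypothesis. The association probabilities are used to update the weights of the hypotheses [38]. Once associations are determined, the EKF updater refines the state estimates of the tracks based on the new measurements. The EKF updater uses the linearized Gaussian measurement model, as in (16), to update the predicted state with new measurements. The update process is

$$\begin{aligned} \mathbf{K}_k &= \mathbf{P}_{k|k-1} \mathbf{H}_k^{\top} \big(\mathbf{H}_k \mathbf{P}_{k|k-1} \mathbf{H}_k^{\top} + \mathbf{R}\big)^{-1}, \\ \hat{\mathbf{x}}_{k|k} &= \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \big(\mathbf{z}_k - h(\hat{\mathbf{x}}_{k|k-1})\big), \\ \mathbf{P}_{k|k} &= (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k)\, \mathbf{P}_{k|k-1}, \end{aligned}$$

where $\mathbf{K}_k$ is the Kalman gain, $\mathbf{P}$ is the error covariance matrix, and $\mathbf{R}$ is the measurement noise covariance matrix [29].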
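The prediction and update cycle can be illustrated with a deliberately reduced example. The sketch below is not the tracker described here: it assumes a two-element state (position and speed along the lane), a range-only measurement from a sensor mounted 8.5 m above the lane origin, and a continuous white-noise-acceleration process noise with intensity `q`. Only the 60 ms interval and the 8.5 m height are taken from the text; all other values are hypothetical.

```python
import numpy as np


def predict_cv(x, P, dt, q):
    """Constant-velocity prediction for the state [position, speed]."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    # Assumed process noise: continuous white-noise acceleration with intensity q.
    Q = q * np.array([[dt**3 / 3.0, dt**2 / 2.0], [dt**2 / 2.0, dt]])
    return F @ x, F @ P @ F.T + Q


def ekf_update_range(x_pred, P_pred, z_range, sensor_height, var_range):
    """EKF update with the nonlinear measurement r = sqrt(position^2 + height^2)."""
    r_pred = np.hypot(x_pred[0], sensor_height)
    H = np.array([[x_pred[0] / r_pred, 0.0]])  # Jacobian of r w.r.t. the state
    S = H @ P_pred @ H.T + var_range           # innovation covariance (1x1)
    K = P_pred @ H.T / S                       # Kalman gain (2x1)
    x_new = x_pred + (K * (z_range - r_pred)).ravel()
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new


x0 = np.array([20.0, -4.0])   # 20 m from the gantry, approaching at 4 m/s
P0 = np.diag([4.0, 1.0])
x_pred, P_pred = predict_cv(x0, P0, dt=0.06, q=0.5)
z = np.hypot(19.0, 8.5)       # hypothetical range reading for a target at 19 m
x_new, P_new = ekf_update_range(x_pred, P_pred, z, sensor_height=8.5, var_range=0.16**2)
```

The Jacobian entry `x / r` shrinks as the target approaches the point below the sensor, so a range reading carries less and less information about the along-lane position there. This is one way to see why an elevated radar benefits from a complementary sensor in its near field.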
2.3.2. Unscented Kalman Filter (UKF) Update
After performing the EKF prediction step, as in Figure 2, using the constant velocity model described in (33), the predicted state and the associated covariance matrix serve as the prior for the UKF update. The UKF is utilized to incorporate the measurements of changes in the received power of the UWB system.
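For the UKF update, we first generate a set of sigma points that capture the uncertainty in the state estimate. These sigma points are deterministic samples chosen to represent the mean and covariance of the state distribution. The sigma points are computed as follows

$$\begin{aligned} \boldsymbol{\chi}_0 &= \hat{\mathbf{x}}_{k|k-1}, \\ \boldsymbol{\chi}_i &= \hat{\mathbf{x}}_{k|k-1} + \Big(\sqrt{(n+\lambda)\, \mathbf{P}_{k|k-1}}\Big)_i, & i &= 1, \dots, n, \\ \boldsymbol{\chi}_{i} &= \hat{\mathbf{x}}_{k|k-1} - \Big(\sqrt{(n+\lambda)\, \mathbf{P}_{k|k-1}}\Big)_{i-n}, & i &= n+1, \dots, 2n, \end{aligned}$$

where $n$ is the dimension of the state vector, $(\cdot)_i$ denotes the $i$-th column of the matrix square root, and $\lambda$ is a scaling parameter defined by $\lambda = \alpha^2 (n + \kappa) - n$, with $\alpha$ and $\kappa$ as tuning parameters that control the spread of the sigma points [39]. For the experimental evaluation presented in Section 4, the tuning parameters were set to standard literature values, with $\alpha$ chosen to minimize non-local sampling effects and $\kappa$ chosen to guarantee positive-definite covariance matrices [39]. Each sigma point is then propagated through the nonlinear measurement model defined in (25) as

$$\boldsymbol{\mathcal{Z}}_i = h^{\mathrm{u}}(\boldsymbol{\chi}_i), \qquad i = 0, \dots, 2n,$$

considering the relevant network links according to the link selection vector defined in (28). This yields a set of transformed sigma points in the measurement space. The predicted measurement mean and the innovation covariance are then computed as the weighted average of these transformed sigma points as

$$\hat{\mathbf{z}}_k = \sum_{i=0}^{2n} W_i^{(m)} \boldsymbol{\mathcal{Z}}_i$$

and

$$\mathbf{S}_k = \sum_{i=0}^{2n} W_i^{(c)} \big(\boldsymbol{\mathcal{Z}}_i - \hat{\mathbf{z}}_k\big)\big(\boldsymbol{\mathcal{Z}}_i - \hat{\mathbf{z}}_k\big)^{\top} + \tilde{\mathbf{R}}^{\mathrm{u}}_k,$$

where $W_i^{(m)}$ and $W_i^{(c)}$ are the weights associated with the mean and covariance calculations, and $\tilde{\mathbf{R}}^{\mathrm{u}}_k$ is the measurement noise covariance of (30) according to the link selection.

The Kalman gain is calculated as

$$\mathbf{K}_k = \mathbf{P}_{xz}\, \mathbf{S}_k^{-1},$$

where $\mathbf{P}_{xz}$ denotes the cross-covariance matrix between state and measurement, which is defined as

$$\mathbf{P}_{xz} = \sum_{i=0}^{2n} W_i^{(c)} \big(\boldsymbol{\chi}_i - \hat{\mathbf{x}}_{k|k-1}\big)\big(\boldsymbol{\mathcal{Z}}_i - \hat{\mathbf{z}}_k\big)^{\top}.$$

Finally, the state estimate and the covariance matrix are updated using the Kalman gain and the power change measurements of (29) as

$$\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \big(\tilde{\mathbf{z}}^{\mathrm{u}}_k - \hat{\mathbf{z}}_k\big)$$

and

$$\mathbf{P}_{k|k} = \mathbf{P}_{k|k-1} - \mathbf{K}_k \mathbf{S}_k \mathbf{K}_k^{\top}.$$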
The updated state and covariance reflect the integration of the nonlinear power change measurements of the UWB system. As shown in Figure 2, the process now proceeds with the prediction step defined in (33), using the updated state and covariance as input, maintaining the cyclic nature of the hybrid tracking system.
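A compact numerical sketch of such a UKF measurement update is given below. It is a simplified stand-in, not the implementation evaluated in this paper: the state contains only a 2-D position, a single link is used, the link response is assumed to decay exponentially with excess path length, and all numbers, including `alpha=0.5`, `beta=2.0`, and `kappa=0.0`, are hypothetical choices for illustration.

```python
import numpy as np


def sigma_points(x, P, alpha, beta, kappa):
    """Scaled sigma points with their mean and covariance weights."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    root = np.linalg.cholesky((n + lam) * P)   # columns are the point offsets
    pts = np.vstack([x, x + root.T, x - root.T])
    w_mean = np.full(2 * n + 1, 0.5 / (n + lam))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n + lam)
    w_cov[0] = w_mean[0] + (1.0 - alpha**2 + beta)
    return pts, w_mean, w_cov


def ukf_update(x, P, z, h, R, alpha=0.5, beta=2.0, kappa=0.0):
    """Unscented measurement update for a generic measurement function h."""
    pts, w_mean, w_cov = sigma_points(x, P, alpha, beta, kappa)
    Z = np.array([h(p) for p in pts])          # propagated sigma points
    z_hat = w_mean @ Z                         # predicted measurement mean
    dZ, dX = Z - z_hat, pts - x
    S = dZ.T @ (w_cov[:, None] * dZ) + R       # innovation covariance
    P_xz = dX.T @ (w_cov[:, None] * dZ)        # state-measurement cross-covariance
    K = P_xz @ np.linalg.inv(S)
    return x + K @ (z - z_hat), P - K @ S @ K.T


def power_change(p, tx, rx, phi_db, kappa_m):
    """Assumed link model: phi * exp(-excess path length / kappa)."""
    delta = np.linalg.norm(p - tx) + np.linalg.norm(p - rx) - np.linalg.norm(tx - rx)
    return np.array([phi_db * np.exp(-delta / kappa_m)])


# Hypothetical link crossing the lane at x = 0; the prior places the target at x = 0.5 m.
tx, rx = np.array([0.0, -2.0]), np.array([0.0, 2.0])
x_prior, P_prior = np.array([0.5, 0.0]), np.diag([1.0, 0.04])
h = lambda p: power_change(p, tx, rx, phi_db=-6.0, kappa_m=0.3)
x_post, P_post = ukf_update(x_prior, P_prior, np.array([-5.0]), h, np.array([[1.0]]))
```

In this example, a strong measured attenuation of −5 dB pulls the position estimate toward the link and shrinks the variance along the lane, which mirrors the role of the UWB update while the target is not visible to the radar.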
3. Measurement Setup
Two static radar sensors are installed on a gantry located on one of the incoming roads to the intersection, one on each side of the gantry: one faces east (the pedestrian/cyclist path), and the other faces west (the intersection). The radar facing east monitors the trajectories of VRUs and vehicles that approach the right-turning lane. Conversely, the radar facing west observes VRUs that arrive from the opposite direction, i.e., VRUs approaching the intersection from the west, as well as vehicles that travel westward, from the east, in the same right-turning lane. To maximize the observable range for both VRUs and vehicles, each radar was yaw-adjusted so that the main lobe of its antenna pattern points toward the farthest detectable point of an approaching target. This configuration inevitably creates a blind spot directly beneath each sensor. The respective areas of radar coverage are highlighted in red in Figure 1 and Figure 3. In general, mutual interference from approaching vehicles equipped with active radar sensors may occasionally cause missed detections or increased clutter. In our observations, however, these effects were brief, likely because the gantry-mounted radar operated at a higher elevation than the vehicle-mounted sensors and because the manufacturer’s optimized CFAR algorithm reduced clutter in such situations. In addition to the two radar sensors, 13 UWB nodes are installed alongside the pedestrian lane, as indicated by the blue dots in Figure 3.
The height of the gantry is 8.5 meters. Since the radars are installed on top of the gantry, a significant blind spot below the radars is created, which leads to a loss of radar detections and, hence, to an increasing uncertainty in the VRU’s location. To avoid this, detections from the UWB nodes are incorporated into the tracker while the VRU is in the radars’ blind spot area.
The UWB sensing network comprises 13 fixed nodes that are mounted at different heights and positioned at various points around the intersection. The locations of the UWB nodes are depicted in Figure 3. Each UWB node consists of a Qorvo DWM1000 UWB transceiver [40] and a Raspberry Pi computer. The DWM1000 complies with the IEEE 802.15.4-2011 UWB standard [41].
The Raspberry Pi runs driver software that controls the transceiver, generates and schedules the transmission of sensing messages according to a Time-Division Multiple Access (TDMA) scheme, and stores the TX and RX timestamps for every received message. In addition to the timestamps, the DWM1000 stores the Channel Impulse Response (CIR) as complex 16-bit raw samples in an internal accumulator memory that can be accessed by the Raspberry Pi. Each UWB node broadcast a sensing message every 120 ms using a bandwidth of 500 MHz. The TX power was set to −8 dBm/MHz. For further technical information on UWB nodes, the TDMA scheme, timing and ranging, please refer to [42]. The individual parameters required for the measurement model in (25), i.e., the maximum modeled power changes and the decay rates of the links, as well as the elements of the covariance matrix in (24), are determined empirically.
The aim of this experiment was to detect, locate and track a VRU in a real-life traffic situation using the sensor setup described above. The left image in Figure 4 shows an example of a cyclist riding toward the intersection in the VRU lane, while the right image displays the corresponding radar detections in magenta. Please note the blind spot beneath the radars, highlighted in red in Figure 4, appearing as a conspicuous gap in the magenta points. In total, we considered seven runs in this experiment: three runs with a cyclist, two runs with an e-scooter, and two runs with a pedestrian.
In order to evaluate the proposed EKF/UKF radar–UWB sensor fusion for VRU tracking, an independent ground truth system is required. Therefore, the VRU was equipped with an RTK-based GNSS system that records its position at a rate of 10 Hz during each run. In addition, these experiments were performed at the Providentia++ test field, which provides a camera-based high-accuracy traffic detection and tracking system, in order to take advantage of the test field’s camera-based ground truth system [43].
During the experimental runs a number of ground-truth samples were lost or corrupted. Missing points originate from temporary occlusions of the VRU in the camera-based system and from temporary loss of the RTK-GNSS fix. To maintain continuous reference tracks for the evaluation, missing segments were reconstructed via linear interpolation between the nearest valid, high-accuracy RTK points [44,45]. To ensure the integrity of the evaluation data, we quantified the spatial and temporal gaps introduced by these outages. Histogram analysis of the tracking data revealed that interpolation events were infrequent, with the absolute worst-case spatial gap measuring only . Given the physical dimensions of a typical VRU and their constrained kinematic mobility profiles, i.e., low velocities and bounded accelerations, a maximum interpolation distance of falls well within the spatial footprint and predictable movement bounds of the target. Consequently, linear interpolation over these minor gaps introduces negligible error and is strongly justified.
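The gap reconstruction and the worst-case gap check described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the track layout (a time-sorted list of `(t, x, y)` tuples with `None` marking lost samples) and the function name `fill_gaps` are assumptions made for the example.

```python
import math

def fill_gaps(track):
    """Linearly interpolate lost ground-truth samples.

    track: time-sorted list of (t, x, y); x and y are None for lost samples
    (hypothetical layout). Returns the filled track and the largest spatial
    distance, in metres, bridged between two valid points.
    """
    valid = [i for i, (_, x, y) in enumerate(track)
             if x is not None and y is not None]
    filled = list(track)
    max_gap = 0.0
    for a, b in zip(valid, valid[1:]):
        if b - a == 1:
            continue  # neighbouring valid samples, nothing to reconstruct
        t0, x0, y0 = track[a]
        t1, x1, y1 = track[b]
        max_gap = max(max_gap, math.hypot(x1 - x0, y1 - y0))
        for i in range(a + 1, b):
            w = (track[i][0] - t0) / (t1 - t0)  # time-based interpolation weight
            filled[i] = (track[i][0], x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return filled, max_gap

# Two lost samples between valid fixes that are 0.3 m apart.
demo = [(0.0, 0.0, 0.0), (0.1, None, None), (0.2, None, None), (0.3, 0.3, 0.0)]
filled_demo, worst_gap = fill_gaps(demo)
```

Samples lost before the first or after the last valid fix are left untouched, since there is no second anchor point to interpolate toward.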
4. Results
The experimental results were evaluated using multiple metrics to capture different performance aspects of the proposed method. This multi-metric approach reduces reliance on the interpolated ground truth segments and makes the evaluation more robust. Before presenting the quantitative analysis, Figure 5 provides a visual representation of two sample runs. The figure shows how the radar's blind spot degrades the raw tracks and how the fused solution restores continuity.
The primary metric used to evaluate the results is the root-mean-square error (RMSE), defined as
$$\mathrm{RMSE}_k = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert \hat{\mathbf{p}}_{i,k} - \mathbf{p}_{i,k} \right\rVert^{2}}$$
where $i$ is the track number at the timestep $k$, $N$ is the total number of tracks at the timestep, and $\hat{\mathbf{p}}_{i,k}$ and $\mathbf{p}_{i,k}$ denote the estimated and ground-truth positions of track $i$. The RMSE results for two runs are shown in Figure 6. In the radar-only configuration, both the pedestrian and cyclist runs exhibit a significant increase in RMSE in the radar blind spot. In contrast, the radar–UWB sensor fusion keeps the RMSE close to around the blind spot and prevents it from exceeding 1 m. At some time steps, particularly when the target is within radar coverage, the radar-only case is slightly more stable, whereas the fusion output exhibits jitter of about .
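As an illustration of the per-timestep RMSE described above (the position error averaged over the N tracks present at one timestep), a minimal sketch could look like this. It assumes planar positions given as `(x, y)` tuples in metres; the function name is hypothetical.

```python
import math

def rmse_at_timestep(estimates, truths):
    """RMSE over all tracks at one timestep.

    estimates, truths: equally long lists of (x, y) positions in metres,
    one entry per track (assumed layout).
    """
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("need the same non-zero number of estimates and truths")
    squared = sum((ex - tx) ** 2 + (ey - ty) ** 2
                  for (ex, ey), (tx, ty) in zip(estimates, truths))
    return math.sqrt(squared / len(estimates))

# A single track that is 3 m off in x and 4 m off in y has an error of 5 m.
single_track_rmse = rmse_at_timestep([(3.0, 4.0)], [(0.0, 0.0)])
```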
This jitter is a direct result of the asynchronous data acquisition between the radar, UWB sensors, and the ground-truth system. To address clock alignment, all incoming sensor measurements are timestamped upon arrival at the central processing unit using a shared system clock. Because the sensors operate at different rates without hardware-level synchronization, the EKF/UKF handles the asynchronous updates by using the exact time difference between consecutive measurement timestamps in the prediction step. For evaluation, each tracker output is temporally paired with the nearest GNSS sample. If the time difference exceeds a 50 ms threshold, which corresponds to half the sampling interval of the filter's 10 Hz update rate, the sample is discarded. While spatial interpolation was used to reconstruct missing GNSS segments, no temporal resampling or artificial alignment was applied to the incoming sensor streams. Consequently, the natural end-to-end timing and minor processing latencies are preserved, which manifests as the observed jitter during higher-speed cyclist runs.
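The nearest-sample pairing with the 50 ms rejection threshold can be sketched as below. This is an illustrative sketch, not the evaluation code used in the paper; timestamps are assumed to be in seconds on the shared clock and sorted in ascending order, and the function name is hypothetical.

```python
from bisect import bisect_left

def pair_with_ground_truth(tracker_times, gnss_times, max_dt=0.050):
    """Pair each tracker output with its nearest GNSS sample.

    Returns a list of (tracker_index, gnss_index). Tracker outputs whose
    nearest GNSS sample is more than max_dt seconds away are discarded.
    gnss_times must be sorted in ascending order.
    """
    pairs = []
    for i, t in enumerate(tracker_times):
        j = bisect_left(gnss_times, t)
        # The nearest sample is either just before or at/after the insertion point.
        candidates = [c for c in (j - 1, j) if 0 <= c < len(gnss_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(gnss_times[c] - t))
        if abs(gnss_times[best] - t) <= max_dt:
            pairs.append((i, best))
    return pairs

# The third tracker output falls into a GNSS outage and is dropped.
demo_pairs = pair_with_ground_truth([0.02, 0.13, 0.33, 0.46], [0.0, 0.1, 0.2, 0.5])
```

Discarding unmatched samples rather than resampling keeps the timing of the sensor streams untouched, which is consistent with the statement above that no temporal alignment was applied.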
For the pedestrian run in Figure 6, the spike at the beginning is due to the EKF and UKF still adapting their parameters while missed radar measurements occur. In this phase, the filter output relies mainly on the UKF, where the motion model has not yet fully converged early in the track. The missing measurements can be confirmed through the elevated variance trace of the position in Figure 7 during these time steps.
In safety-critical applications, evaluating the statistical confidence of the localization estimate is as crucial as measuring the absolute tracking error. To quantify this internal certainty, we evaluated the square root of the spatial covariance trace for both the radar-only and sensor fusion configurations, providing an estimate of the positional uncertainty in meters. Specifically, the trace was computed by summing the variances of the planar Cartesian coordinates at each time step, formulated as follows:
$$\sqrt{\operatorname{tr}\left(\mathbf{P}_{xy}\right)} = \sqrt{\sigma_x^2 + \sigma_y^2}$$
where $\sigma_x^2$ and $\sigma_y^2$ denote the variances of the x and y position states, respectively, in the covariance matrix associated with the state vector in (11), and $\mathbf{P}_{xy}$ denotes the corresponding position block of that matrix. Figure 7 shows the position-uncertainty trace for the pedestrian and cyclist runs. In the radar-only configuration, the covariance grows rapidly with time because, while the target is inside the radar's blind spot, the filter performs successive prediction steps without any measurement updates. By contrast, the fusion with UWB attenuation measurements constrains the uncertainty, keeping the square root of the covariance trace below 1 m throughout the blind-spot region.
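To illustrate why the uncertainty grows in the blind spot, the following sketch runs prediction steps without measurement updates and records the square root of the position covariance trace. It assumes a constant-velocity model with white-noise acceleration per axis, which stands in for the motion model of the paper and is not taken from it; the intensity `q`, the initial covariance, and all function names are hypothetical.

```python
import math

def predict_axis(P, dt, q):
    """Covariance prediction for one axis of an assumed constant-velocity model.

    P = (pp, pv, vv): position variance, position-velocity covariance,
    velocity variance. q: assumed white-noise acceleration intensity.
    Implements P' = F P F^T + Q with F = [[1, dt], [0, 1]].
    """
    pp, pv, vv = P
    return (
        pp + 2.0 * dt * pv + dt * dt * vv + q * dt ** 3 / 3.0,
        pv + dt * vv + q * dt * dt / 2.0,
        vv + q * dt,
    )

def sqrt_position_trace(Px, Py):
    """Square root of the sum of the x and y position variances, in metres."""
    return math.sqrt(Px[0] + Py[0])

def blind_spot_uncertainty(P0, dt, q, steps):
    """Uncertainty after each of `steps` predictions with no measurement update."""
    Px = Py = P0
    trace = [sqrt_position_trace(Px, Py)]
    for _ in range(steps):
        Px = predict_axis(Px, dt, q)
        Py = predict_axis(Py, dt, q)
        trace.append(sqrt_position_trace(Px, Py))
    return trace

# Two seconds of prediction-only operation at 10 Hz (illustrative values).
growth = blind_spot_uncertainty((0.04, 0.0, 0.25), dt=0.1, q=0.5, steps=20)
```

In this sketch every element of `growth` is larger than the one before it; a measurement update, such as the UWB update in the fused configuration, is what would pull the value back down.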
The overall RMSE for the seven VRU runs with radar-only tracking and radar–UWB fusion is shown as a box-and-whisker plot in Figure 8. The RMSE of the radar–UWB fusion remains below 1 m, with very few outliers exceeding the 1 m threshold, and only by negligible amounts. In contrast, the radar-only configuration shows a higher overall RMSE: not only does its distribution exceed the 1 m threshold, but its outliers lie even above 2 m.
Figure 9 shows the empirical cumulative distribution function (ECDF) and the outage duration distribution of the RMSE for the two approaches. The ECDF in Figure 9a compares the RMSE distribution of radar-only localization with that of the radar–UWB fusion approach. The ECDF curve gives the probability that the error is smaller than a specific error value (on the x-axis). We compared the two approaches in terms of the percentile-based accuracy metrics CEP68 (equivalent to ) and CEP95. The results show that both approaches achieve similar accuracy around – at 68%. However, radar–UWB fusion outperforms the radar-only approach at 95%, achieving an RMSE below compared to . Hence, these experiments demonstrate that the fusion of radar and UWB data reduces the error tail and improves robustness.
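The ECDF and the percentile-based metrics can be computed from a list of RMSE samples as sketched below. The nearest-rank percentile rule is an assumption chosen for simplicity, since the text does not state which rule was used, and the function names and sample values are hypothetical.

```python
import math

def ecdf(errors):
    """Return (error, cumulative probability) pairs of the empirical CDF."""
    xs = sorted(errors)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def error_at_percentile(errors, p):
    """Nearest-rank percentile: the smallest sample e such that at least
    a fraction p of all samples is <= e (0 < p <= 1)."""
    xs = sorted(errors)
    k = max(1, math.ceil(p * len(xs)))
    return xs[k - 1]

# Illustrative values only: one large error dominates the 95% metric
# while leaving the 68% metric unchanged.
samples = [0.2, 0.4, 0.1, 0.3, 2.0]
cep68 = error_at_percentile(samples, 0.68)
cep95 = error_at_percentile(samples, 0.95)
```

This mirrors the observation above: two error distributions can agree at 68% and still differ strongly at 95% when one of them has a heavier tail.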
Figure 9b shows the probability distribution of the outage duration. An outage is defined as a time interval within which the RMSE stays above a specific threshold, where the threshold is given by the considered application. For a threshold of , the outage at 68% with radar–UWB fusion is , while the radar-only approach yields a value of . At the 95th percentile, the outage is also reduced from to about . The results show that when the localization becomes unreliable, the radar–UWB fusion recovers faster, while the radar-only approach suffers from longer error bursts.
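Outage durations as defined above can be extracted from an RMSE time series with a simple scan. In this sketch an outage is measured from the first sample above the threshold to the first sample back at or below it; that boundary convention, the function name, and the sample values are assumptions.

```python
def outage_durations(times, rmse, threshold):
    """Durations of all intervals in which rmse stays above threshold.

    times and rmse are equally long sequences; times is ascending.
    An outage still open at the end of the run is closed at the last sample.
    """
    durations = []
    start = None
    for t, e in zip(times, rmse):
        if e > threshold and start is None:
            start = t  # outage begins
        elif e <= threshold and start is not None:
            durations.append(t - start)  # outage ends
            start = None
    if start is not None:
        durations.append(times[-1] - start)
    return durations

# Integer timestamps in seconds and a 1 m threshold (illustrative values).
demo_outages = outage_durations(
    [0, 1, 2, 3, 4, 5, 6],
    [0.2, 1.2, 1.5, 0.3, 0.4, 1.1, 0.2],
    threshold=1.0,
)
# demo_outages == [2, 1]
```

Percentiles of the resulting list then give outage statistics of the kind reported at 68% and 95%.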
5. Conclusions
This paper presented a novel radar and Ultra-Wideband (UWB) sensor fusion approach designed to enhance the safety of Vulnerable Road Users (VRUs) at intersections. By addressing the limitations of standalone sensors, this implementation serves as a highly reliable proof of concept for continuous VRU tracking in complex, mixed-traffic scenarios.
Managed by a hybrid Kalman filter architecture, the proposed method significantly reduced VRU tracking uncertainty. While the Extended Kalman Filter (EKF) maintained real-time tracking of both the VRU and vehicles using radar data, the Unscented Kalman Filter (UKF) integrated UWB measurements to bridge critical tracking gaps, most notably within radar blind spots. The experimental results demonstrate that fusing these modalities mitigates severe trajectory deviations, resulting in precise and continuous detection and tracking performance.
Ultimately, these promising results validate the proposed fusion concept and establish a robust foundation for real-world traffic safety applications. By effectively resolving sensor-specific vulnerabilities, the system provides a redundant concept for robust road monitoring required to elevate VRU protection mechanisms and sets the basis for safer urban environments.
References
- 1. European Commission. Annual Statistical Report on Road Safety in the EU 2021; Technical Report; Directorate General for Transport: Brussels, Belgium, 2022.
- 2. European Commission. Road Safety Thematic Report—Cyclists; Technical Report; Directorate General for Transport: Brussels, Belgium, 2022.
- 3. Rashdan, I. Vehicle-to-Vulnerable Road Users Channel Modeling in Critical Scenarios. Ph.D. Thesis, Technische Universität Berlin, Berlin, Germany, 2023.
- 4. Rashdan, I.; Sand, S. Link-Level Performance of Vehicle-to-Vulnerable Road Users Communication Using Realistic Channel Models. In Proceedings of the 2024 18th European Conference on Antennas and Propagation (EuCAP), Glasgow, UK, 17–22 March 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–5.
- 5. Buchholz, M.; Müller, J.; Herrmann, M.; Strohbeck, J.; Völz, B.; Maier, M.; Paczia, J.; Stein, O.; Rehborn, H.; Henn, R.W. Handling Occlusions in Automated Driving Using a Multiaccess Edge Computing Server-Based Environment Model From Infrastructure Sensors. IEEE Intell. Transp. Syst. Mag. 2022, 14, 106–120. doi:10.1109/MITS.2021.3089743
- 6. Zhang, T.; Cheng, L.; Bang, T.; Guo, L.; Hajij, M.; Cao, S.; Harris, A.; Sartipi, M. Roadside Sensor Systems for Vulnerable Road User Protection: A Review of Methods and Applications. IEEE Access 2025, 13, 62717–62738. doi:10.1109/ACCESS.2025.3558174
- 7. Reyes-Muñoz, A.; Guerrero-Ibáñez, J. Vulnerable Road Users and Connected Autonomous Vehicles Interaction: A Survey. Sensors 2022, 22, 4614. doi:10.3390/s22124614. PMID: 35746397; PMCID: PMC9229412
- 8. Waldschmidt, C.; Hasch, J.; Menzel, W. Automotive Radar—From First Efforts to Future Systems. IEEE J. Microwaves 2021, 1, 135–148. doi:10.1109/JMW.2020.3033616
