Personalized trajectory inference framework integrating driving behavior recognition and temporal dependency learning
Jinhao Yang, Junwen Cao, Mingyu Fang

TL;DR
This paper introduces a new framework that improves vehicle trajectory predictions by recognizing driving styles and learning temporal dependencies, enhancing driving safety.
Contribution
The novel DS-TCTM framework integrates driving style recognition and personalized trajectory prediction using a multi-level neural architecture.
Findings
DS-TCTM achieves a mean RMSE of 4.46 and NLL of 3.89 with significant error reduction after hyperparameter optimization.
The model outperforms baseline models in long-term trajectory predictions.
Driving style classification into conservative, moderate, and radical categories improves prediction accuracy.
Abstract
This study proposes a Driving style-Tri Channel Trajectory Model (DS-TCTM) to enhance vehicle trajectory prediction accuracy and driving safety. The framework operates through three rigorously designed stages: (1) Data preprocessing involving kinematic feature extraction, (2) Driving style recognition utilizing acceleration variation rate and average time headway combined with K-Means++ traffic density clustering and K-neighbor Gaussian mixture model (K-GMM) analysis to classify driving behaviors into conservative, moderate, and radical categories, and (3) Personalized trajectory prediction employing a multi-level neural architecture with dedicated sub-networks for distinct driving styles. Experimental evaluations demonstrate DS-TCTM’s superior performance across multiple dimensions. The model achieves a mean RMSE of 4.46 and NLL of 3.89 across varying prediction horizons, with 35.8%…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications
Introduction
Traffic safety remains a critical global issue. According to World Health Organization statistics [1], approximately 1.35 million fatalities and tens of millions of injuries occur annually due to traffic accidents, resulting in economic losses accounting for nearly 3% of global GDP. The advancement of autonomous driving technology offers new prospects for safety improvements, where vehicle trajectory prediction plays a pivotal role in enhancing system reliability.
In trajectory prediction research, physics-based approaches initially dominated the field. Qiao et al. [2] and Wiest et al. [3] integrated Gaussian mixture models with kinematic characteristics, achieving precise predictions in simple scenarios but demonstrating limited adaptability in complex environments. Gao et al. [4] enhanced motion trend capture through time-series analysis, yet struggled with abrupt behavior prediction. Xu [5] and Xie et al. [6] combined physical models with maneuvering behaviors to improve environmental adaptability at the cost of computational complexity. Li et al.’s [7] real-time planner improved dynamic environment responsiveness but required trade-offs between real-time performance and accuracy. While physics-based methods excel in controlled scenarios, their heavy reliance on precise physical inputs limits effectiveness in dynamic traffic environments. This constraint has driven researchers toward probabilistic modeling to better handle uncertainties. Zong et al. [8] developed a dual-layer hidden Markov model for driver intention recognition, improving prediction accuracy through intention variation analysis, albeit with compromised real-time performance. Xie et al. [9] employed dynamic Bayesian networks with distributed genetic algorithms to construct driving behavior perception models, requiring substantial computational resources for training. Hu et al. [10] applied decision trees to predict lane-change avoidance behaviors but faced limitations in complex behavioral scenarios. Schlechtriemen et al. [11] proposed probabilistic regression for rare event prediction (e.g., lane-change timing), though data scarcity impacts model training.
Recent advancements leverage data-driven approaches to overcome traditional limitations. Chandra et al.’s [12] Traphic model captures traffic interactions through weighted deep learning mechanisms. Li et al. [13] introduced Grip for graph-structured interaction modeling in multi-agent environments. Amirian et al. [14] utilized GANs to learn multimodal pedestrian trajectory distributions. Narayanan et al. [15] implemented Divide-and-Conquer strategies for lane-adaptive predictions. Fang et al.’s [16] TPNet enhances diversity through trajectory proposal networks.
Despite the notable advancements of current data-driven vehicle trajectory prediction methods, most neural network models rely predominantly on numerical training data such as speed, movement direction, and position, whereas abstract factors like driving style can produce different trajectories under identical external conditions. Guo et al. [17] comprehensively reviewed the identification and evaluation of driving characteristics and their applications in intelligent vehicles, emphasizing that driver behavior significantly impacts vehicle trajectory. Kim et al. [18] utilized the DeepConvLstm network and a generative-based model for driving style recognition and trajectory prediction. Chen et al. [19] proposed a fusion algorithm that incorporates driving style and addresses uncertainties in vehicle dynamics for lane-change trajectory prediction. Shao et al. [20] utilized K-means clustering to classify driving styles, but did not consider the degree of traffic congestion. Yuan et al. [21] developed a CNN-LSTM model for traffic conflict prediction that considers the risk factors in driver merging behavior. Therefore, incorporating drivers’ behavioral characteristics and driving styles is crucial for improving prediction accuracy.
Given the aforementioned context, this study aims to develop a vehicle trajectory prediction approach that incorporates various driving styles, with the objective of filling the research gap regarding the influence of driving style discrepancies on trajectory predictions.
The main contributions of this study are as follows:
(1) The DS-TCTM model innovatively integrates driving style recognition with trajectory prediction, effectively capturing the impact of driving styles on trajectory patterns, thereby providing more accurate and personalized prediction results.
(2) The model employs a K-GMM clustering algorithm based on K-Means++, overcoming the limitations of traditional GMM models and achieving more accurate driving style recognition.
(3) The DS-TCTM model utilizes a GRU-BiGRU-BiLSTM hybrid network to effectively capture long and short-term dependencies, and employs a vertical network layer to generate personalized trajectory predictions for different driving styles.
Architecture
To enable precise short-term vehicle trajectory prediction, this study presents a driving style-based personalized prediction model. The framework incorporates three essential components: (1) Data Preprocessing Module for extracting dynamic kinematic features and traffic density metrics from raw sensor inputs; (2) Driving Style Recognition Module that transforms feature vectors into categorical style labels through machine learning techniques; (3) Personalized Trajectory Prediction Module generating individualized future trajectories conditioned on the identified driving styles. The architectural workflow is illustrated in Fig 1.
Fig 1: Model architecture diagram.
Feature extraction module
To effectively classify driving styles, the data preprocessing module first categorizes traffic density based on vehicle speed within the road segment. Subsequently, it extracts two critical kinematic features: acceleration change rate and average time headway. Finally, the module integrates traffic density with these driving characteristics to comprehensively categorize driving styles.
Traffic density classification.
Traffic density significantly impacts driving styles. In high-density scenarios, drivers frequently change lanes to maintain shorter gaps for traffic fluidity. Conversely, low-density conditions allow larger gaps and stable speeds, yielding calmer driving patterns.
Since traffic density cannot be directly learned from labeled data or quantified through prior knowledge, this study employs the unsupervised K-Means++ clustering algorithm. The specific steps are as follows:
(1) Initial State Selection
Randomly select three vehicle states and extract their velocities as the initial cluster centers.
(2) Minimum Distance Calculation
Calculate the shortest distance between the vehicle speed at each moment and the current cluster centers. The minimum is taken over the three cluster centers, and the moments considered span the interval between the times when the vehicle enters and leaves the road segment.
(3) Sample Probability Calculation
Determine, from its shortest distance, the probability of each sample point being selected as a new cluster center.
Select three new samples with the maximum probabilities sequentially as updated cluster centers. Repeat the assignment and update steps until convergence is achieved, completing the algorithm.
Thus, traffic density is classified into three categories: congested, slow-moving, and free-flowing.
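The clustering steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: it uses the textbook K-Means++ seeding (each new center drawn with probability proportional to the squared shortest distance) rather than the maximum-probability selection described in the text, and the function names, the `seed` argument, and the mapping of the slowest cluster to "congested" are assumptions made for the example.

```python
import random

def nearest(v, centers):
    """Index of the center closest to speed v."""
    return min(range(len(centers)), key=lambda j: abs(v - centers[j]))

def kmeanspp_init(speeds, k, rng):
    """K-Means++ seeding: first center uniform, later ones with probability ~ D(v)^2."""
    centers = [rng.choice(speeds)]
    while len(centers) < k:
        # Squared shortest distance from every speed to the current centers.
        d2 = [min((v - c) ** 2 for c in centers) for v in speeds]
        total = sum(d2)
        if total == 0:                      # every speed already equals a center
            centers.append(rng.choice(speeds))
            continue
        threshold = rng.random() * total
        acc = 0.0
        for v, w in zip(speeds, d2):
            acc += w
            if acc > threshold:
                centers.append(v)
                break
        else:                               # guard against floating-point edge cases
            centers.append(speeds[-1])
    return centers

def kmeans_1d(speeds, k=3, iters=100, seed=0):
    """Lloyd iterations on scalar speeds, started from K-Means++ seeds."""
    rng = random.Random(seed)
    centers = kmeanspp_init(speeds, k, rng)
    for _ in range(iters):
        labels = [nearest(v, centers) for v in speeds]          # assignment step
        new_centers = []
        for j in range(k):                                      # update step
            members = [v for v, l in zip(speeds, labels) if l == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(v, centers) for v in speeds]

def density_labels(speeds, seed=0):
    """Name clusters by center speed: slowest = congested, fastest = free-flowing (assumed)."""
    centers, labels = kmeans_1d(speeds, 3, seed=seed)
    order = sorted(range(3), key=lambda j: centers[j])
    names = {order[0]: "congested", order[1]: "slow-moving", order[2]: "free-flowing"}
    return [names[l] for l in labels]
```

Calling `density_labels` on a list of per-moment speeds returns one density category per sample. Like any K-Means run, the result can depend on the seed, so several restarts are advisable in practice.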
Driving style feature extraction.
Feature extraction aims to accurately identify and analyze individual driving styles. We select acceleration change rate and average time headway as key features. The acceleration change rate reveals drivers’ acceleration/deceleration habits, while the average time headway reflects their ability to maintain lane stability and safe distances.
(1) Calculation of Acceleration Change Rate
Here the acceleration change rate is computed for vehicle $c_i$, and $N$ denotes the total observation time of vehicle $i$ in traffic flow $d$.
(2) Calculation of Average Time Headway
Here TH indicates the average time headway between vehicle $c_i$ and the preceding vehicle, obtained from the average following distance and the average speed.
The unsupervised clustering analysis of traffic density, combined with the extraction of the acceleration change rate and TH, establishes the data foundation for subsequent driving style recognition and trajectory prediction.
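As a rough illustration of the two features, the sketch below computes them from sampled data. The exact formulas are not reproduced here, so two assumptions are made and should be read as such: the acceleration change rate is taken as the mean absolute change in acceleration per second, and the average time headway as the average following distance divided by the average speed. The function names and the sampling interval `dt` are hypothetical.

```python
def acceleration_change_rate(accels, dt):
    """Mean absolute change in acceleration per second (assumed definition)."""
    if len(accels) < 2:
        raise ValueError("need at least two acceleration samples")
    diffs = [abs(b - a) / dt for a, b in zip(accels, accels[1:])]
    return sum(diffs) / len(diffs)

def average_time_headway(gaps, speeds):
    """Average following distance divided by average speed, in seconds (assumed definition)."""
    mean_gap = sum(gaps) / len(gaps)
    mean_speed = sum(speeds) / len(speeds)
    if mean_speed <= 0:
        return float("inf")     # a stationary vehicle has no finite time headway
    return mean_gap / mean_speed
```

For example, an average gap of 25 m at an average speed of 10 m/s gives a time headway of 2.5 s.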
Driving style recognition module
To accurately analyze driving style variations under different traffic densities, we propose a traffic density-aware recognition method. This approach takes traffic density and driving style features as inputs, classifying all vehicles’ driving styles into three categories (conservative, moderate, radical) via clustering algorithms.
Traffic density weighting.
We categorize traffic density into three types, each assigned a weight to reflect its importance in comprehensive score calculation. During free-flow conditions, vehicles operate more freely, better reflecting driver styles, thus assigned higher weights. Under moderate density, driving becomes constrained. In congested scenarios, vehicles face severe movement restrictions requiring cautious operation, warranting the lowest weights. The specific weighting scheme is:
Driving style scoring.
Validation via the K-Means elbow method shows that as the cluster number k increases, error decreases but model complexity rises. At k = 3, a balance between accuracy and complexity is achieved with relatively low error. Consequently, driving styles are categorized into three types. Conservative drivers prefer smooth driving patterns, maintaining larger TH values and lower acceleration change rates. Moderate drivers dynamically adjust their behavior across scenarios, exhibiting intermediate TH and acceleration change rate levels. Radical (aggressive) drivers frequently execute rapid acceleration and abrupt braking maneuvers, characterized by smaller TH and higher acceleration change rates. The specific scoring criteria for each driving style category are detailed below:
Comprehensive driving style score calculation.
By integrating traffic density weights and driving style scores, the comprehensive driving style score is calculated as:
The comprehensive score effectively captures drivers’ actual characteristics across varying traffic conditions, thereby improving subsequent vehicle trajectory prediction accuracy.
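One way to picture the combination of density weights and style scores is a weighted average over the density categories in which a driver was observed. The sketch below is only that, a picture: the numeric weights are placeholders that respect the stated ordering (free-flowing highest, congested lowest), the normalisation by the sum of weights is an assumption, and `comprehensive_score` is a hypothetical name.

```python
# Placeholder weights: only their ordering is taken from the text.
DENSITY_WEIGHTS = {"free-flowing": 0.5, "slow-moving": 0.3, "congested": 0.2}

def comprehensive_score(style_scores):
    """Weighted average of per-density style scores.

    style_scores maps a density category to the style score observed under it.
    """
    if not style_scores:
        raise ValueError("no style scores given")
    num = sum(DENSITY_WEIGHTS[d] * s for d, s in style_scores.items())
    den = sum(DENSITY_WEIGHTS[d] for d in style_scores)
    return num / den
```

With these placeholder weights, a driver scored 3 in free-flowing traffic and 1 in congestion ends up nearer 3 than 1, because the free-flowing observation counts for more.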
Personalized trajectory prediction module
The personalized trajectory prediction module generates short-term trajectory predictions using driving style labels and environmental features as inputs. The vehicle’s position (x, y), velocity (v), and acceleration (a) are fed into the module as input features.
Traditional recurrent neural networks (RNNs) struggle with long-term dependencies due to information fading. To address this, we employ a parallel bidirectional RNN architecture that enhances long-term dependency capture. The model architecture is illustrated in Fig 2.
Fig 2: Personalized trajectory prediction model architecture.
Horizontally, the architecture implements a GRU-BiGRU-BiLSTM hybrid network, where bidirectional layers re-examine historical trajectories to improve prediction accuracy. Vertically, it conducts few-shot end-to-end learning for vehicles sharing the same driving style. Trajectory features from three distinct driving styles are fed into separate horizontal networks, enabling style-specific trajectory predictions.
Horizontal layer 1: GRU architecture.
The first horizontal layer employs GRU units, where each driving moment of any vehicle corresponds to a GRU cell that takes the historical state at that moment as input.
Update Gate Calculation:
$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t] + b_z\right)$$
where $W_z$ denotes the update gate’s weight matrix, $b_z$ the bias vector, $[\,]$ represents matrix concatenation, $h_{t-1}$ is the hidden state vector from the previous timestep, $x_t$ is the input at the current timestep, and $\sigma$ is the Swish activation function.
Reset Gate Calculation:
$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t] + b_r\right)$$
where $W_r$ is the reset gate’s weight matrix and $b_r$ the corresponding bias vector.
Candidate State Generation:
$$\tilde{h}_t = \tanh\left(W_h \cdot [r_t \odot h_{t-1}, x_t] + b_h\right)$$
Here, $W_h$ represents the hidden state weight parameters, $b_h$ the bias vector, and $\tanh$ the hyperbolic tangent function. The basic GRU gate structure is illustrated in Fig 3.
Fig 3: Basic GRU gate structure.
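The gate computations for this layer can be traced with a single GRU step on plain Python lists. This is a sketch under stated assumptions: the gates use the logistic sigmoid, the conventional GRU choice, although the text names Swish (another function can be passed as `gate_act`), and the final interpolation between the previous and candidate states follows one common convention that the text does not spell out. The `params` dictionary layout is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def gru_step(x_t, h_prev, params, gate_act=sigmoid):
    """One GRU step. params holds W_z, b_z, W_r, b_r, W_h, b_h."""
    concat = h_prev + x_t                                   # [h_{t-1}, x_t]
    z = [gate_act(a) for a in affine(params["W_z"], concat, params["b_z"])]   # update gate
    r = [gate_act(a) for a in affine(params["W_r"], concat, params["b_r"])]   # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t    # [r_t * h_{t-1}, x_t]
    h_cand = [math.tanh(a) for a in affine(params["W_h"], gated, params["b_h"])]
    # Assumed convention: z_t weights the candidate, (1 - z_t) the previous state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]
```

Each weight matrix has one row per hidden unit and one column per entry of the concatenated vector, so a hidden size of 2 with a 3-dimensional input needs 2 x 5 matrices.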
Horizontal layer 2: BiGRU architecture.
The second horizontal layer employs a bidirectional GRU (BiGRU), comprising an input layer, a forward hidden layer, a backward hidden layer, and an output layer. Data flows from the input layer to both directional hidden layers, where two opposing GRU networks jointly determine the output. The BiGRU network architecture is illustrated in Fig 4.
Fig 4: BiGRU network architecture.
In the mathematical formulation, $x_t$ is the input vector at timestep t, $h_m$ and $h_n$ denote the forward and backward hidden states respectively, each with its own weight matrix, and $b_t$ represents the bias term.
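A commonly used bidirectional formulation that is consistent with these symbols is given below as a sketch. The weight-matrix names $W_m$ and $W_n$, the time indices on the hidden states, and the linear form of the output are assumptions rather than the authors' notation.

```latex
\begin{aligned}
h_m &= \mathrm{GRU}_{\text{forward}}\left(x_t,\; h_m^{(t-1)}\right) \\
h_n &= \mathrm{GRU}_{\text{backward}}\left(x_t,\; h_n^{(t+1)}\right) \\
h_t &= W_m\, h_m + W_n\, h_n + b_t
\end{aligned}
```

The forward state depends on the previous timestep and the backward state on the next one, so the output $h_t$ draws on both past and future context.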
Horizontal layer 3: BiLSTM architecture.
The third horizontal layer utilizes BiLSTM, processing the same input sequence through forward and backward LSTM hidden layers, with both outputs contributing to the final layer. Compared to unidirectional LSTM, BiLSTM’s bidirectional structure enhances data utilization efficiency, overcomes traditional LSTM’s limitations in temporal data processing, and exhibits stronger robustness and generalization capabilities. The BiLSTM network architecture is illustrated in Fig 5.
Fig 5: BiLSTM network architecture.
Each timestep corresponds to an LSTM cell, where the vehicle’s historical state serves as the input signal $x_t$. In the BiLSTM operations, $h_t$ and its backward counterpart represent the forward and backward LSTM outputs at timestep t, $W_h$ and its backward counterpart are their respective weight matrices, and $b_y$ denotes the bias term.
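The mechanics of a bidirectional layer (one pass over the sequence in each direction, then a weighted combination per timestep) can be shown with a deliberately small example. A scalar tanh recurrence stands in for the LSTM cell here, so this is not a BiLSTM; the weights `w_fwd`, `w_bwd`, and the other defaults are illustrative values only.

```python
import math

def scan(xs, w_x, w_h):
    """Run a scalar tanh recurrence over xs and return the hidden state at each step."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        out.append(h)
    return out

def bidirectional_outputs(xs, w_x=0.5, w_h=0.3, w_fwd=1.0, w_bwd=1.0, b_y=0.0):
    """Combine forward and backward hidden states at every timestep."""
    fwd = scan(xs, w_x, w_h)
    bwd = scan(xs[::-1], w_x, w_h)[::-1]    # reverse again so index t lines up with time t
    return [w_fwd * f + w_bwd * b + b_y for f, b in zip(fwd, bwd)]
```

Setting `w_bwd=0.0` reduces the output to the forward pass alone, which makes the contribution of the backward direction easy to isolate.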
Vertical network layer architecture.
To comprehensively consider driving style impacts on trajectories, we feed style-specific trajectories into three independent horizontal networks for few-shot end-to-end deep learning. The vertical hierarchy comprises three parallel identical network architectures, integrating GRU, BiGRU, and BiLSTM advantages to achieve precise short-term trajectory predictions. By training three distinct prediction models with different driving style data, this design fully utilizes existing data, enhances few-shot learning capability, and significantly improves prediction accuracy for varied driving styles. The vertical network layer architecture is illustrated in Fig 6.
Fig 6: Vertical network layer architecture diagram.
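The vertical arrangement amounts to training one predictor per driving style and routing each vehicle to the predictor that matches its style label. The sketch below shows only that routing. `MeanStepPredictor` is a toy stand-in for a horizontal GRU-BiGRU-BiLSTM network, and all names are hypothetical.

```python
class MeanStepPredictor:
    """Toy stand-in for one horizontal network: extrapolates the mean step of its training tracks."""

    def __init__(self):
        self.step = 0.0

    def fit(self, tracks):
        steps = [b - a for t in tracks for a, b in zip(t, t[1:])]
        self.step = sum(steps) / len(steps) if steps else 0.0
        return self

    def predict(self, history, horizon):
        last = history[-1]
        return [last + self.step * (i + 1) for i in range(horizon)]

def train_per_style(tracks_by_style):
    """Fit one independent predictor per driving style."""
    return {style: MeanStepPredictor().fit(tracks) for style, tracks in tracks_by_style.items()}

def predict_personalized(models, style, history, horizon):
    """Route a vehicle to the predictor trained on its own driving style."""
    return models[style].predict(history, horizon)
```

Because each predictor sees only the trajectories of its own style, the same history can yield different forecasts for a conservative and a radical driver.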
Experimental design and analysis
Dataset selection
This study utilizes the “Ubiquitous Traffic Eyes” dataset [22] and the highD dataset, and the details of these two datasets are described as follows:
(1) The "Ubiquitous Traffic Eyes" dataset encompasses diverse traffic scenarios and driving styles across various weather conditions, time periods, and geographical locations. The dataset contains trajectory records from over 100,000 vehicles, including critical parameters such as position, velocity, acceleration, steering angle, and timestamps, providing a robust foundation for trajectory prediction and driving style recognition. Fig 7 shows the scene environment of the Ubiquitous traffic eyes dataset.
Fig 7: Ubiquitous Traffic Eyes dataset.
(2) The highD dataset is a new dataset of naturalistic vehicle trajectories recorded on German highways. Using a drone, typical limitations of established traffic data collection methods such as occlusions are overcome by the aerial perspective. Traffic was recorded at six different locations and includes more than 110 500 vehicles. Each vehicle’s trajectory, including vehicle type, size and manoeuvres, is automatically extracted. Using state-of-the-art computer vision algorithms, the positioning error is typically less than ten centimeters. Although the dataset was created for the safety validation of highly automated vehicles, it is also suitable for many other tasks such as the analysis of traffic patterns or the parameterization of driver models.
Parameter setting
The backbone networks of the models employed in this paper are GRU, BiGRU, and BiLSTM, with GRU and BiGRU comprising 150 nodes each and BiLSTM consisting of 200 nodes. The optimizer is Adagrad. The batch size is set to 8, the number of epochs to 150, and the initial learning rate to 0.001. The neural network trajectory prediction model is developed using Python 3.6 and TensorFlow 1.4.0, on a PC with an 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30 GHz CPU and an NVIDIA GeForce RTX 3060 GPU, running Windows 10.
Data preprocessing
In order to eliminate GPS measurement errors and occasional anomalies, a three-step denoising process was adopted in the data preprocessing stage:
(1) Outlier detection and rejection
The mean and standard deviation of velocity v and acceleration a in each trajectory are calculated, and any time point whose velocity or acceleration deviates from the corresponding mean beyond the rejection threshold is excluded to remove obvious measurement jumps.
(2) Savitzky–Golay smoothing filtering
Apply a Savitzky-Golay filter with a window length of 5 and a quadratic polynomial to the x and y coordinates of the trajectory sequence after removing outliers, further suppressing high-frequency jitter and ensuring the smoothness of the trajectory.
(3) Linear interpolation complementation
For short-term missing data (≤ 3 frames) caused by outlier removal or filtering, one-dimensional linear interpolation is used to fill in the gaps, ensuring the continuity of the input sequence; if the missing data exceeds 3 frames, the corresponding sliding window is discarded.
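The three denoising steps can be sketched with the standard library alone. Several details are assumptions: the rejection threshold is set to three standard deviations (`k=3.0`), gaps that touch either end of a series are treated as unfillable, and the two samples at each end are left unsmoothed. The coefficients (-3, 12, 17, 12, -3)/35 are the standard Savitzky-Golay weights for a window of 5 and a quadratic polynomial. Function names are hypothetical.

```python
import statistics

SG5_COEFFS = (-3, 12, 17, 12, -3)   # window 5, quadratic fit; divide by 35

def reject_outliers(values, k=3.0):
    """Replace samples more than k standard deviations from the mean with None (k is assumed)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [None if sigma > 0 and abs(v - mu) > k * sigma else v for v in values]

def fill_short_gaps(values, max_gap=3):
    """Linearly interpolate runs of None up to max_gap long.

    Returns None (window discarded) if a run is longer or touches either end.
    """
    out = list(values)
    i, n = 0, len(out)
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        j = i
        while j < n and out[j] is None:
            j += 1
        if i == 0 or j == n or j - i > max_gap:
            return None
        left, right = out[i - 1], out[j]
        for m in range(i, j):
            out[m] = left + (right - left) * (m - i + 1) / (j - i + 1)
        i = j
    return out

def savgol5(values):
    """Savitzky-Golay smoothing, window 5, order 2; end samples are left unchanged."""
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / 35.0
    return out
```

A quadratic-order filter reproduces any quadratic sequence exactly, which gives a quick sanity check: smoothing the squares 0, 1, 4, 9, ... leaves them unchanged.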
This study utilizes the publicly available Ubiquitous Traffic Eyes dataset and highD dataset. Ubiquitous Traffic Eyes collected a total of 1,810,742 continuous vehicle trajectories, while highD is stored as 60 files, one of which has 300,000-400,000 continuous vehicle trajectories. The data is divided into training, validation, and test sets in an 8:1:1 ratio.
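An 8:1:1 partition can be produced as follows. Whether the original split was shuffled, and at what granularity (trajectory, vehicle, or recording file), is not stated, so the trajectory-level shuffle and the `seed` argument here are assumptions.

```python
import random

def split_811(items, seed=0):
    """Shuffle and split items into train/validation/test at an 8:1:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10          # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Any remainder from the integer division ends up in the test set, so no trajectory is dropped.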
Evaluation metrics and baseline models
Evaluation metrics.
This experiment employs Root Mean Square Error (RMSE), Negative Log-Likelihood (NLL), Mean Absolute Error (MAE), and Final Displacement Error (FDE) to assess model performance. RMSE quantifies the deviation between predicted and ground-truth trajectories, NLL evaluates the prediction accuracy of vehicle action types, MAE is the average absolute error between the predicted and observed values, and FDE measures the deviation between the predicted endpoint and the true endpoint at the last prediction step. Each model group undergoes 100 experimental trials, with metric means recorded.
In the metric definitions, the predicted lateral and longitudinal positions are compared with their ground-truth counterparts, $loss_{nll}$ represents the negative log-likelihood loss function, $L_{true}$ and $L_{pre}$ denote the ground-truth and model-predicted trajectories respectively, the predicted end position is given by its longitude and latitude, and $x_{end}$ and $y_{end}$ denote the longitude and latitude of the true end position.
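Three of the four metrics are simple enough to state in code. The sketch below follows common conventions, which may differ in detail from the definitions used in the experiments: RMSE is taken over the Euclidean position error, MAE is averaged over every coordinate, and FDE is the Euclidean distance at the last step. NLL is omitted because it depends on the form of the predicted distribution, which is not given here. Trajectories are lists of `(x, y)` tuples.

```python
import math

def rmse(pred, true):
    """Root mean square of the Euclidean position error over all timesteps."""
    sq = [(px - tx) ** 2 + (py - ty) ** 2 for (px, py), (tx, ty) in zip(pred, true)]
    return math.sqrt(sum(sq) / len(sq))

def mae(pred, true):
    """Mean absolute error, averaged over every coordinate of every timestep."""
    errs = [abs(p - t) for pp, tp in zip(pred, true) for p, t in zip(pp, tp)]
    return sum(errs) / len(errs)

def fde(pred, true):
    """Euclidean distance between the final predicted and final true positions."""
    (px, py), (tx, ty) = pred[-1], true[-1]
    return math.hypot(px - tx, py - ty)
```

For a two-step prediction that is exact at the first step and off by (3, 4) at the second, FDE is 5 while RMSE is smaller, because RMSE averages over both steps.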
Baseline models.
This paper adopts LSTM [23], Social-LSTM [24], CS-LSTM [25], SV-LSTM [26] (which adds vehicle speed information to the Social-LSTM structure), TCTM [27] (without considering driving styles), STA-LSTM [28] and LSTM GAN [29] as baseline models for comparing prediction accuracy and computational efficiency.
Driving style clustering model comparison
This experiment, conducted on the Ubiquitous Traffic Eyes dataset, classifies driving styles by extracting features such as average time headway (TH) and acceleration change rate, comparing the performance of the standard GMM and the improved K-GMM models.
Contour plot analysis reveals similar overall shapes between Fig 8 (GMM) and Fig 10 (K-GMM). The high-density region in Fig 8 concentrates in the central-lower area with smooth color transitions and uniform data point distribution. In contrast, Fig 10 exhibits slightly dispersed color distribution at boundaries, indicating tighter clustering in K-GMM.
Fig 8: Contour plot of standard GMM clustering.
Fig 9: Peak distribution of standard GMM clustering.
3D histogram comparisons (Fig 9 vs. Fig 11) show analogous amplitude distributions. Fig 9 displays scattered high-amplitude peaks, particularly in large regions, while Fig 11 demonstrates more concentrated peaks. The smoother color gradients in Fig 9 contrast with steeper transitions in Fig 11, suggesting K-GMM’s enhanced discriminative capability in specific regions.
Fig 10: Contour plot of K-GMM clustering.
Fig 11: Peak distribution of K-GMM clustering.
Quantitatively, K-GMM achieves tighter cluster compactness in high-density and high-amplitude regions, effectively grouping similar driving styles. Comparative analyses of Figs 8, 9, 10, and 11 validate K-GMM’s superior clustering precision and amplitude differentiation over standard GMM, confirming its effectiveness on this dataset.
Fundamental performance analysis
This section evaluates the core predictive capabilities of DS-TCTM against baseline models through four key metrics: RMSE, NLL, MAE and FDE.
Ubiquitous traffic eyes dataset.
Table 1 shows the comparison of RMSE in ubiquitous traffic eyes. All models exhibit increasing RMSE with longer prediction horizons. DS-TCTM achieves significantly lower RMSE across all timepoints, demonstrating superior trajectory prediction accuracy.
Table 1: Comparison of RMSE in ubiquitous traffic eyes.
Table 2 shows the comparison of NLL in ubiquitous traffic eyes. Like RMSE trends, NLL values increase with prediction duration. DS-TCTM maintains a lower NLL at most horizons, indicating better probabilistic distribution fitting.
Table 2: Comparison of NLL in ubiquitous traffic eyes.
Table 3 shows the comparison of MAE in ubiquitous traffic eyes. Like RMSE and NLL trends, MAE values increase with prediction duration. Compared to other baseline models, DS-TCTM maintains a lower MAE.
Table 3: Comparison of MAE in ubiquitous traffic eyes.
Table 4 shows the comparison of FDE in ubiquitous traffic eyes. FDE is used to evaluate the effectiveness of control algorithms, and DS-TCTM’s smaller FDE indicates that it predicts vehicle trajectories more accurately.
Table 4: Comparison of FDE in ubiquitous traffic eyes.
HighD dataset.
Table 5 shows the comparison of RMSE in HighD. As in the Ubiquitous Traffic Eyes dataset, all models exhibit increasing RMSE with longer prediction horizons in the highD dataset. DS-TCTM achieves significantly lower RMSE across all timepoints, demonstrating superior trajectory prediction accuracy.
Table 5: Comparison of RMSE in HighD.
Table 6 shows the comparison of NLL in HighD. As in the Ubiquitous Traffic Eyes dataset, NLL values increase with prediction duration and DS-TCTM maintains a lower NLL at most horizons, indicating better probabilistic distribution fitting.
Table 6: Comparison of NLL in HighD.
Table 7 shows the comparison of MAE in HighD. Compared with other baseline models, DS-TCTM maintains a lower MAE at each time on the highD dataset.
Table 7: Comparison of MAE in HighD.
Table 8 shows the comparison of FDE in HighD. As in the Ubiquitous Traffic Eyes dataset, DS-TCTM has smaller FDE in the highD dataset.
Table 8: Comparison of FDE in HighD.
Computational efficiency
This experiment compares the computational efficiency between baseline models and DS-TCTM by measuring processing time per vehicle trajectory.
Table 9 demonstrates the processing time required by each model. V-LSTM achieves the shortest latency, while CS-LSTM exhibits the poorest efficiency. From the previous RMSE and NLL comparisons, the DS-TCTM model demonstrates superior prediction accuracy. Although not the most computationally efficient, DS-TCTM achieves an optimal balance between prediction accuracy and processing speed. In contrast, while V-LSTM has the highest computational efficiency, its prediction accuracy (as indicated by RMSE and NLL values) is inferior to DS-TCTM.
Table 9: Comparison of calculation efficiency.
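Per-trajectory processing time can be measured with a small timing harness like the one below. It is a generic sketch, not the measurement procedure behind the reported figures: `predict` is any callable that takes one trajectory, and taking the best of several passes is an assumed way of reducing timing noise.

```python
import time

def mean_time_per_trajectory(predict, trajectories, repeats=3):
    """Average wall-clock seconds per trajectory, using the fastest of several passes."""
    if not trajectories:
        raise ValueError("no trajectories to time")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for traj in trajectories:
            predict(traj)
        best = min(best, time.perf_counter() - start)
    return best / len(trajectories)
```

Timing every model on the same trajectories and the same hardware keeps the comparison fair.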
Conclusions
(1) Model Innovation
Addressing vehicle trajectory prediction challenges, this study proposes the DS-TCTM model that integrates traffic density and driving style impacts for personalized prediction. The architecture combines multi-scale temporal modeling with driving style characterization through an enhanced K-GMM clustering approach (K-Means++ based), effectively resolving sample segmentation issues while capturing long-term dependencies.
(2) Performance Superiority
Experimental results confirm DS-TCTM’s superior performance, with RMSE and NLL values maintained below 4.46 and 3.89 respectively. Compared to conventional LSTM models, it achieves 35.8% error reduction, demonstrating particular strength in long-term predictions.
(3) Future Directions
The current unsupervised clustering approach with manual labeling provides initial style differentiation, yet algorithmic refinement of the K-GMM-based scoring system presents a key avenue for enhancing model precision in future work, particularly for deployment in connected vehicle applications.
References
[1] Global status report on road safety 2018. Geneva: World Health Organization; 2018. Licence: CC BY-NC-SA 3.0 IGO.
[2] Qiao SJ, Jin K, Han N. A trajectory prediction algorithm based on Gaussian mixture model. J Softw. 2015;26(05):1048–63.
[3] Wiest J, Höffken M, Kreßel U. Probabilistic trajectory prediction with Gaussian mixture models. In: 2012 IEEE Intelligent Vehicles Symposium. IEEE; 2012. p. 141–6.
[4] Gao J, Mao Y, Li Z. Trajectory prediction based on Gaussian mixture-time series model. Comput Appl. 2019;39(8):2261–70.
[5] Qiao S, Jin K, Han N. Motion planning under uncertainty for on-road autonomous driving. In: 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE; 2014. p. 2507–12.
[6] Xie G, Gao H, Qian L, Huang B, Li K, Wang J. Vehicle trajectory prediction by integrating physics- and maneuver-based approaches using interactive multiple models. IEEE Trans Ind Electron. 2018;65(7):5999–6008. doi: 10.1109/tie.2017.2782236
[7] Li J, Dai B, Li X, Li C, Di Y. A real-time and predictive trajectory-generation motion planner for autonomous ground vehicles. In: 2017 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC). IEEE; 2017. p. 108–13. doi: 10.1109/ihmsc.2017.140
[8] Zong C, Wang C, He L. Driver intention recognition based on a two-layer hidden Markov model. Automot Eng. 2011;2011(8):6. doi: CNKI:SUN:QCGC.0.2011-08-013
