# Temporal Tampering Detection in Automotive Dashcam Videos via Multi-Feature Forensic Analysis and a 1D Convolutional Neural Network

**Authors:** Ali Rehman Shinwari, Uswah Binti Khairuddin, Mohamad Fadzli Bin Haniff

PMC · DOI: 10.3390/s26020517 · Sensors (Basel, Switzerland) · 2026-01-13

## TL;DR

This paper introduces a lightweight, CPU-only system that detects temporal tampering (frame insertion, deletion, and duplication) in dashcam videos by reducing each video to five inter-frame signals and classifying them with a shallow 1D-CNN.

## Contribution

A lightweight, CPU-efficient 1D-CNN framework for near real-time detection of temporal tampering in dashcam videos using multi-feature temporal analysis.

## Key findings

- On a custom dataset derived from D2-City, the framework achieves 95.0% accuracy for frame deletion, 100.0% for insertion, and 95.0% for duplication.
- Near real-time performance (≈12.7–12.9 FPS) with minimal memory usage (≈0.085 MB) enables deployment on embedded systems.
- Cross-dataset tests on VIRAT and BDDA show that insertion detection stays robust while deletion and duplication detection degrade, suggesting domain adaptation is needed.

## Abstract

Lightweight multi-feature temporal forensic framework that fuses frame-difference magnitude, SSIM drift, optical-flow mean, forward–backward flow error, and temporal prediction error, modeled with a shallow 1D-CNN for dashcam video tampering detection.

Near real-time CPU inference (≈12.7–12.9 FPS) with minimal memory overhead (≈0.085 MB average), enabling practical deployment on embedded and forensic systems without GPUs.

Strong intra-dataset performance on D2-City: 95.0% accuracy for frame deletion, 100.0% for insertion, and 95.0% for duplication; multiclass detection achieves 96.3% accuracy and class-wise AUCs up to 1.0.

Cross-dataset analysis reveals domain shift: insertion detection remains robust (up to ≈97% accuracy on VIRAT), while deletion and duplication detection degrade on VIRAT and BDDA, motivating domain adaptation strategies.

Ablation shows depth and temporal receptive field matter: two Conv1D blocks and kernels of 5–7 outperform shallower/smaller setups without meaningful efficiency penalties.
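To make the "temporal receptive field" point concrete, the sketch below applies the standard receptive-field formula for stacked 1D convolutions. It assumes no dilation and, by default, stride 1 with no pooling between blocks; this summary does not state the paper's strides or pooling, so the numbers are illustrative only. The function name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field, in time steps, of a stack of 1D convolutions.

    Assumes no dilation. Strides default to 1 for every layer, which is
    an assumption for illustration, not a detail taken from the paper.
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the window by (k - 1) input steps
        jump *= s             # striding spreads later layers over more input steps
    return rf


if __name__ == "__main__":
    for k in (3, 5, 7):
        print(f"kernel={k}: one block={receptive_field([k])}, "
              f"two blocks={receptive_field([k, k])}")
```

Under these assumptions, two blocks with kernel size 5 see 9 consecutive inter-frame steps and two blocks with kernel size 7 see 13, compared with 5 steps for two blocks with kernel size 3.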

Automotive dashboard cameras are widely used to record driving events and often serve as critical evidence in accident investigations and insurance claims. However, the availability of free and low-cost editing tools has increased the risk of video tampering, underscoring the need for reliable methods to verify video authenticity. Temporal tampering typically involves manipulating frame order through insertion, deletion, or duplication. This paper proposes a computationally efficient framework that transforms high-dimensional video into compact one-dimensional temporal signals and learns tampering patterns using a shallow one-dimensional convolutional neural network (1D-CNN). Five complementary features are extracted between consecutive frames: frame-difference magnitude, structural similarity drift (SSIM drift), optical-flow mean, forward–backward optical-flow consistency error, and compression-aware temporal prediction error. Per-video robust normalization is applied to emphasize intra-video anomalies. Experiments on a custom dataset derived from D2-City demonstrate strong detection performance in single-attack settings: 95.0% accuracy for frame deletion, 100.0% for frame insertion, and 95.0% for frame duplication. In a four-class setting (non-tampered, insertion, deletion, duplication), the model achieves 96.3% accuracy, with AUCs of 0.994, 1.000, 0.997, and 0.988, respectively. Efficiency analysis confirms near real-time CPU inference (≈12.7–12.9 FPS) with minimal memory overhead. Cross-dataset tests on BDDA and VIRAT reveal domain-shift sensitivity, particularly for deletion and duplication, highlighting the need for domain adaptation and augmentation. Overall, the proposed multi-feature 1D-CNN provides a practical, interpretable, and resource-aware solution for temporal tampering detection in dashcam videos, supporting trustworthy video forensics in IoT-enabled transportation systems.
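As a rough illustration of how a video becomes a one-dimensional signal with per-video robust normalization, the sketch below computes one of the five features (frame-difference magnitude) on toy grayscale frames and rescales it with the median and median absolute deviation (MAD). The median/MAD choice, the toy frames, and the names `frame_diff_magnitude`, `robust_normalize`, and `make_frame` are assumptions for illustration; this summary does not specify the paper's exact normalization.

```python
from statistics import median


def frame_diff_magnitude(frames):
    """Mean absolute pixel difference between consecutive frames.

    `frames` is a list of equally sized 2D lists of grayscale values.
    Returns one value per consecutive pair, so len(frames) - 1 values.
    """
    signal = []
    for prev, curr in zip(frames, frames[1:]):
        total, count = 0.0, 0
        for row_p, row_c in zip(prev, curr):
            for p, c in zip(row_p, row_c):
                total += abs(c - p)
                count += 1
        signal.append(total / count)
    return signal


def robust_normalize(signal, eps=1e-6):
    """Center by the per-video median and scale by the MAD (assumed choice)."""
    med = median(signal)
    mad = median(abs(x - med) for x in signal)
    return [(x - med) / (mad + eps) for x in signal]


def make_frame(value, size=2):
    """Toy frame: a size x size grid filled with a single brightness value."""
    return [[value] * size for _ in range(size)]


if __name__ == "__main__":
    # Brightness drifts from frame to frame; the frame with value 42 is duplicated.
    brightness = [0, 10, 21, 30, 42, 42, 50, 61]
    frames = [make_frame(v) for v in brightness]
    diffs = frame_diff_magnitude(frames)
    z = robust_normalize(diffs)
    print("raw diffs: ", diffs)
    print("normalized:", [round(v, 2) for v in z])
```

In this toy case the duplicated frame produces a zero difference that stands out as a strong negative outlier after normalization, which is the kind of intra-video anomaly the normalization is meant to emphasize. A real pipeline would stack all five features as channels before passing them to the 1D-CNN.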

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12846185/full.md

## Figures

21 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12846185/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12846185/full.md

---
Source: https://tomesphere.com/paper/PMC12846185