# A Keyframe Extraction Method for Assembly Line Operation Videos Based on Optical Flow Estimation and ORB Features

**Authors:** Xiaoyu Gao, Hua Xiang, Tongxi Wang, Wei Zhan, Mengxue Xie, Lingxuan Zhang, Muyu Lin

PMC · DOI: 10.3390/s25092677 · Sensors (Basel, Switzerland) · 2025-04-23

## TL;DR

This paper introduces a keyframe extraction method for assembly line videos that combines optical flow estimation with ORB features, reaching 85.2% recall at an average of 274 frames per second.

## Contribution

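A keyframe extraction method for assembly line videos that adapts its strategy to motion intensity: DIS optical flow categorizes frames by motion, and ORB bag-of-visual-words features with k-means++ clustering select the keyframes within each group.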

## Key findings

- The method achieves an 85.2% recall rate for keyframe extraction.
- It processes an average of 274 frames per second.
- Over 90% recall is achieved for actions involving minimal movement.
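
For readers unfamiliar with the metric, recall here is the fraction of ground-truth keyframes that the method recovers. The sketch below is a minimal illustration only: the function name `keyframe_recall`, the frame-index matching rule, and the `tolerance` parameter are assumptions, since this summary does not state the paper's exact matching criterion.

```python
def keyframe_recall(predicted, ground_truth, tolerance=0):
    """Fraction of ground-truth keyframes matched by at least one prediction.

    Assumption: a ground-truth frame index counts as recovered when some
    predicted index lies within `tolerance` frames of it.
    """
    if not ground_truth:
        return 0.0
    hits = sum(
        any(abs(p - g) <= tolerance for p in predicted)
        for g in ground_truth
    )
    return hits / len(ground_truth)


# Two of the three annotated keyframes have a prediction within 2 frames.
recall = keyframe_recall([10, 52, 90], [12, 50, 200], tolerance=2)
```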

## Abstract

In modern manufacturing, cameras are widely used to record the full workflow of assembly line workers, enabling video-based operational analysis and management. However, these recordings are often excessively long, leading to high storage demands and inefficient processing. Existing keyframe extraction methods typically apply uniform strategies across all frames, which are ineffective in detecting subtle movements. To address this, we propose a keyframe extraction method tailored for assembly line videos, combining optical flow estimation with ORB-based visual features. Our approach adapts extraction strategies to actions with different motion amplitudes. Each video frame is first encoded into a feature vector using the ORB algorithm and a bag-of-visual-words model. Optical flow is then calculated using the DIS algorithm, allowing frames to be categorized by motion intensity. Adjacent frames within the same category are grouped, and the appropriate number of clusters, k, is determined based on the group’s characteristics. Keyframes are finally selected via k-means++ clustering within each group. The experimental results show that our method achieves a recall rate of 85.2%, with over 90% recall for actions involving minimal movement. Moreover, the method processes an average of 274 frames per second. These results highlight the method’s effectiveness in identifying subtle actions, reducing redundant content, and delivering high accuracy with efficient performance.
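
The grouping and clustering stages described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation. It assumes the per-frame feature vectors (ORB descriptors quantized into bag-of-visual-words histograms) and per-frame mean optical flow magnitudes (for example from a DIS estimator) have already been computed. The motion `thresholds`, the `frames_per_key` rule for choosing k, and all function names are hypothetical placeholders; the summary does not give the paper's actual values or rule.

```python
import random


def group_by_motion(flow_mags, thresholds=(0.5, 2.0)):
    """Label frames by motion level and merge adjacent frames with equal labels.

    Returns (level, start, end) tuples with `end` exclusive.
    The thresholds are illustrative assumptions.
    """
    groups = []
    for i, mag in enumerate(flow_mags):
        level = sum(mag >= t for t in thresholds)  # 0 = low, 2 = high motion
        if groups and groups[-1][0] == level:
            groups[-1] = (level, groups[-1][1], i + 1)
        else:
            groups.append((level, i, i + 1))
    return groups


def _sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp(points, k, iters=20, seed=0):
    """Plain k-means with k-means++ seeding; returns (centers, assignments)."""
    rng = random.Random(seed)
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        # Sample the next seed with probability proportional to the squared
        # distance from the nearest already-chosen center.
        d2 = [min(_sqdist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a chosen center
            break
        centers.append(list(rng.choices(points, weights=d2, k=1)[0]))

    def nearest(p):
        return min(range(len(centers)), key=lambda j: _sqdist(p, centers[j]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(p) for p in points]


def extract_keyframes(features, flow_mags, thresholds=(0.5, 2.0),
                      frames_per_key=(30, 15, 5)):
    """Pick, per motion group, the frame closest to each cluster center.

    `frames_per_key[level]` is a hypothetical budget: roughly one keyframe
    per that many frames in a group of the given motion level.
    """
    keyframes = []
    for level, start, end in group_by_motion(flow_mags, thresholds):
        pts = features[start:end]
        k = max(1, min(len(pts), round(len(pts) / frames_per_key[level])))
        centers, _ = kmeans_pp(pts, k)
        for c in centers:
            best = min(range(len(pts)), key=lambda i: _sqdist(pts[i], c))
            keyframes.append(start + best)
    return sorted(set(keyframes))
```

Selecting the real frame nearest each cluster center, rather than the center itself, keeps every keyframe an actual frame of the video. In practice the features and flow magnitudes would come from an image library; OpenCV, for instance, provides `cv2.ORB_create` and `cv2.DISOpticalFlow_create`.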

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12074371/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12074371/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC12074371/full.md

---
Source: https://tomesphere.com/paper/PMC12074371