BFMD: A Full-Match Badminton Dense Dataset for Dense Shot Captioning

Ning Ding; Keisuke Fujii; Toru Tamaki

arXiv:2603.25533·cs.CV·March 27, 2026

Open Access

TL;DR

This paper introduces BFMD, a badminton dataset with dense full-match annotations (19 broadcast matches, 1,687 rallies, 16,751 hit events). It supports match-level tactical analysis, and shot captioning on it improves with multimodal modeling and semantic feedback.

Contribution

The paper presents the first full-match badminton dataset with dense, hierarchical multimodal annotations, together with a captioning framework that uses semantic feedback to improve the semantic consistency of generated captions.

Findings

01. Multimodal modeling improves caption quality.

02. Semantic feedback enhances semantic consistency.

03. BFMD enables tactical pattern analysis.

Abstract

Understanding tactical dynamics in badminton requires analyzing entire matches rather than isolated clips. However, existing badminton datasets mainly focus on short clips or task-specific annotations and rarely provide full-match data with dense multimodal annotations. This limitation makes it difficult to generate accurate shot captions and perform match-level analysis. To address this limitation, we introduce the first Badminton Full Match Dense (BFMD) dataset, with 19 broadcast matches (including both singles and doubles) covering over 20 hours of play, comprising 1,687 rallies and 16,751 hit events, each annotated with a shot caption. The dataset provides hierarchical annotations including match segments, rally events, and dense rally-level multimodal annotations such as shot types, shuttle trajectories, player pose keypoints, and shot captions. We develop a VideoMAE-based…
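
The annotation hierarchy described in the abstract (match segments, rally events, and per-hit annotations) can be pictured as nested records. The sketch below is only an illustration: every class and field name (`Hit`, `Rally`, `MatchSegment`, `Match`, `shuttle_xy`, `pose_keypoints`, and so on) is hypothetical and is not taken from the released dataset, whose actual file format is not stated here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # assumed 2D image coordinates (x, y)


@dataclass
class Hit:
    """One hit event; all field names are hypothetical."""
    frame: int                      # frame index of the hit
    shot_type: str                  # e.g. "smash", "clear" (labels assumed)
    caption: str                    # shot caption for this hit
    shuttle_xy: List[Point] = field(default_factory=list)            # shuttle trajectory
    pose_keypoints: List[List[Point]] = field(default_factory=list)  # one keypoint list per player


@dataclass
class Rally:
    """A rally event containing its hit events."""
    start_frame: int
    end_frame: int
    hits: List[Hit] = field(default_factory=list)


@dataclass
class MatchSegment:
    """A match-level segment (segment labels are assumed)."""
    label: str
    rallies: List[Rally] = field(default_factory=list)


@dataclass
class Match:
    match_id: str
    discipline: str                 # "singles" or "doubles"
    segments: List[MatchSegment] = field(default_factory=list)


def count_hits(match: Match) -> int:
    """Total hit events across all segments and rallies of a match."""
    return sum(len(r.hits) for s in match.segments for r in s.rallies)


def mean_hits_per_rally(total_hits: int, total_rallies: int) -> float:
    """Average number of hit events per rally."""
    return total_hits / total_rallies


# Toy example with invented values, only to show the nesting.
toy = Match(
    match_id="toy-001",
    discipline="singles",
    segments=[
        MatchSegment(
            label="game-1",
            rallies=[
                Rally(
                    start_frame=0,
                    end_frame=120,
                    hits=[
                        Hit(frame=10, shot_type="serve", caption="Short serve to the front court."),
                        Hit(frame=45, shot_type="clear", caption="High clear to the back court."),
                    ],
                )
            ],
        )
    ],
)

toy_hit_count = count_hits(toy)
# Dataset-level figures from the abstract: 16,751 hit events over 1,687 rallies.
avg_hits = mean_hits_per_rally(16751, 1687)
```

Applied to the totals reported in the abstract, `mean_hits_per_rally(16751, 1687)` gives roughly 9.93 hit events per rally, which gives a sense of how dense the per-hit captions are.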

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Human Pose and Action Recognition