Synthesizing the Expert: A Validated Multimodal Dataset for Trustworthy AI-Assisted Swimming Coaching
Ahmad Al-Kabbany, Esraa Kassem

TL;DR
This paper introduces a multimodal, validated dataset for trustworthy AI-assisted swimming coaching, leveraging a multi-agent LLM framework to synthesize high-quality data for improved AI reliability in sports science.
Contribution
It presents a novel generative framework that creates a structured, synthetic dataset for swimming analysis, addressing data scarcity and ethical challenges in AI sports applications.
Findings
Synthesized 1,864 validated question-context-answer triplets.
Established a foundational benchmark for trustworthy AI in aquatics.
Demonstrated the effectiveness of a multi-agent LLM architecture.
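As a rough illustration of what a question-context-answer triplet might look like in code, the following minimal sketch defines a record type and a structural check. The field names, the `is_well_formed` helper, and the filtering step are all assumptions made for illustration. The summary does not describe the dataset schema or the validation procedure the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One question-context-answer record (field names are assumptions)."""
    question: str
    context: str
    answer: str


def is_well_formed(t: Triplet) -> bool:
    """Hypothetical structural check: every field holds non-blank text."""
    return all(field.strip() for field in (t.question, t.context, t.answer))


def keep_well_formed(triplets: list[Triplet]) -> list[Triplet]:
    """Drop records that fail the structural check, preserving order."""
    return [t for t in triplets if is_well_formed(t)]
```

A check like this only catches empty fields. Whether an answer is actually supported by its context is a separate question that this sketch does not attempt to answer.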
Abstract
This research addresses the problem of synthesizing a structured Retrieval-Augmented Generation (RAG) system for advanced AI applications in swimming. As the integration of Artificial Intelligence in sports science matures, its applications in swimming have become increasingly diverse, ranging from real-time technical coaching and talent scouting to comprehensive performance profiling and the dynamic personalization of training periodization. Within this landscape, RAG-based systems represent a pivotal advancement in Large Language Model (LLM)-enhanced swimming analysis: they ground generative outputs in authoritative domain knowledge, which supports the contextual and technical credibility of AI-generated advice. Despite this potential, building robust RAG systems using only real-world aquatic data presents…
