Transcribing Educational Videos Using Whisper: A preliminary study on using AI for transcribing educational videos
Ashwin Rao

TL;DR
This study evaluates Whisper's effectiveness in transcribing educational videos, highlighting its potential to improve e-learning accessibility while identifying key research challenges in applying ASR technology.
Contribution
It provides a preliminary assessment of Whisper's performance on educational videos and discusses open research questions in using ASR for this purpose.
Findings
Whisper can generate transcripts for educational videos.
There are notable open challenges in applying ASR to educational content.
The study offers insights into future research directions for AI-based transcription.
Abstract
Videos are increasingly being used for e-learning, and transcripts are vital to enhance the learning experience. The costs and delays of generating transcripts can be alleviated by automatic speech recognition (ASR) systems. In this article, we quantify the quality of the transcripts generated by Whisper for 25 educational videos and identify some open avenues of research when leveraging ASR for transcribing educational videos.
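As an illustration of what quantifying a transcript can involve, the sketch below computes word error rate (WER), a common ASR quality metric. The abstract does not name the metric the study uses, so WER is an assumption here, and the function name `word_error_rate` and the simple lowercase-and-split normalization are hypothetical choices made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: WER is used only as an example metric; the source does not
    state how the transcripts were scored. Normalization here is just
    lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

A human-made reference transcript would be passed as `reference` and the ASR output as `hypothesis`; lower values indicate a closer match. Real evaluations usually also normalize punctuation and numerals before comparing, which this sketch omits.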
Taxonomy
Topics: Subtitles and Audiovisual Media · Online Learning and Analytics
