LEARNet: A Learning Entropy-Aware Representation Network for Educational Video Understanding

Chitrakala S; Nivedha V V; Niranjana S R

PMC · DOI:10.3390/e28010003·December 19, 2025

LEARNet: A Learning Entropy-Aware Representation Network for Educational Video Understanding

Chitrakala S, Nivedha V V, Niranjana S R

PDF

Open Access

TL;DR

LEARNet is a new framework that efficiently identifies key instructional content in educational videos by focusing on high-information frames.

Contribution

LEARNet introduces an entropy-aware architecture combining Temporal Information Bottleneck and Spatial–Semantic Decoder for educational video understanding.

Findings

01

LEARNet reduces visual redundancy by 70.2% while maintaining high annotation accuracy (F1 = 0.89, mAP@50 = 0.88).

02

The framework enables the creation of EVUD-2M, a large-scale benchmark with multi-level semantic labels for educational videos.

Abstract

Educational videos contain long periods of visual redundancy, where only a few frames convey meaningful instructional information. Conventional video models, which are designed for dynamic scenes, often fail to capture these subtle pedagogical transitions. We introduce LEARNet, an entropy-aware framework that models educational video understanding as the extraction of high-information instructional content from low-entropy visual streams. LEARNet combines a Temporal Information Bottleneck (TIB) for selecting pedagogically significant keyframes with a Spatial–Semantic Decoder (SSD) that produces fine-grained annotations refined through a proposed Relational Consistency Verification Network (RCVN). This architecture enables the construction of EVUD-2M, a large-scale benchmark with multi-level semantic labels for diverse instructional formats. LEARNet achieves substantial redundancy…

Figures20

Click any figure to enlarge with its caption.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsMultimodal Machine Learning Applications · Video Analysis and Summarization · Human Pose and Action Recognition