From Imitation to Intuition: Intrinsic Reasoning for Open-Instance Video Classification

Ke Zhang; Xiangchen Zhao; Yunjie Tian; Jiayu Zheng; Vishal M. Patel; Di Fu

arXiv:2603.10300 · cs.CV · March 30, 2026

TL;DR

This paper introduces DeepIntuit, a framework that moves open-instance video classification from imitation to intrinsic reasoning. It uses a vision-language model whose reasoning is initialized with supervised alignment, refined with reinforcement learning (GRPO), and then calibrated for better generalization.

Contribution

DeepIntuit combines three components to improve open-instance video classification: supervised reasoning initialization, reinforcement learning refinement, and calibration.

Findings

01. DeepIntuit outperforms traditional models on open-instance tasks.

02. Intrinsic reasoning improves generalization over imitation-based methods.

03. The approach effectively transfers knowledge without distribution mismatch.

Abstract

Conventional video classification models, acting as effective imitators, excel in scenarios with homogeneous data distributions. However, real-world applications often present an open-instance challenge, where intra-class variations are vast and complex, beyond existing benchmarks. While traditional video encoder models struggle to fit these diverse distributions, vision-language models (VLMs) offer superior generalization but have not fully leveraged their reasoning capabilities (intuition) for such tasks. In this paper, we bridge this gap with an intrinsic reasoning framework that evolves open-instance video classification from imitation to intuition. Our approach, namely DeepIntuit, begins with a cold-start supervised alignment to initialize reasoning capability, followed by refinement using Group Relative Policy Optimization (GRPO) to enhance reasoning coherence through…
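To make the GRPO step mentioned in the abstract more concrete, here is a minimal sketch of the group-relative advantage that gives GRPO its name: several responses are sampled for the same input, each is scored, and every reward is normalized against the mean and standard deviation of its own group. This is not the authors' code. The function name, the `eps` constant, the use of the population standard deviation, and the binary correct/incorrect reward are all assumptions made for illustration; the abstract does not specify DeepIntuit's reward design.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    rewards: scores for several responses sampled for the SAME input.
    eps: small constant (an assumption) that avoids division by zero
         when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled responses to one video:
# 1.0 if the response ends in the correct class label, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Responses that beat the group average get a positive advantage,
# the others get a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because the baseline comes from the group itself, this formulation needs no separately learned value model. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.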

Code & Models

Repositories

https://bwgzk-keke.github.io/DeepIntuit (GitHub)

Models

🤗 BWGZK/DeepIntuit (model)