Sora and V-JEPA Have Not Learned The Complete Real World Model -- A Philosophical Analysis of Video AIs Through the Theory of Productive Imagination

Jianqiu Zhang

arXiv:2407.10311·cs.AI·July 16, 2024

Open Access

TL;DR

This paper uses Kantian concepts to analyze current video AI systems such as Sora and V-JEPA, identifies where they fall short of genuine world understanding, and proposes a new framework for developing an AI with productive imagination.

Contribution

It introduces a Kantian-inspired theory of productive imagination to evaluate AI world models and proposes a novel training framework for an AI system capable of true understanding.

Findings

01

Sora lacks an a priori law of change and Kantian categories.

02

V-JEPA learns context-dependent change but misses Kantian categories.

03

Neither system achieves comprehensive world understanding.

Abstract

Sora from OpenAI has shown exceptional performance, yet it faces scrutiny over whether its technological prowess equates to an authentic comprehension of reality. Critics contend that it lacks a foundational grasp of the world, a deficiency V-JEPA from Meta aims to amend with its joint embedding approach. This debate is vital for steering the future direction of Artificial General Intelligence (AGI). We enrich this debate by developing a theory of productive imagination that generates a coherent world model based on Kantian philosophy. We identify three indispensable components of the coherent world model capable of genuine world understanding: representations of isolated objects, an a priori law of change across space and time, and Kantian categories. Our analysis reveals that Sora is limited because of its oversight of the a priori law of change and Kantian categories, flaws that are…

Taxonomy

Topics: Computability, Logic, AI Algorithms · Ethics and Social Impacts of AI