Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering

Mark Beliaev; Victor Yang; Madhura Raju; Jiachen Sun; Xinghai Hu

arXiv:2502.09573 · cs.CV · April 29, 2025

Open Access

TL;DR

This paper presents a novel prompt engineering approach to enhance GPT's zero-shot video classification performance, demonstrating significant improvements through policy simplification and a decomposition-aggregation technique, without additional model fine-tuning.

Contribution

It introduces a new decomposition-aggregation prompt engineering method and demonstrates how prompt optimization can improve GPT's zero-shot video classification performance.

Findings

01. Simplifying policies reduces false negatives.

02. Decomposition-aggregation prompts outperform traditional methods.

03. Prompt engineering significantly enhances GPT's performance.

Abstract

In this study, we tackle industry challenges in video content classification by exploring and optimizing GPT-based models for zero-shot classification across seven critical categories of video quality. We contribute a novel approach to improving GPT's performance through prompt optimization and policy refinement, demonstrating that simplifying complex policies significantly reduces false negatives. Additionally, we introduce a new decomposition-aggregation-based prompt engineering technique, which outperforms traditional single-prompt methods. These experiments, conducted on real industry problems, show that thoughtful prompt design can substantially enhance GPT's performance without additional finetuning, offering an effective and scalable solution for improving video classification.
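
The abstract does not spell out how the decomposition-aggregation technique is implemented, so the following is only a minimal sketch of the general idea under stated assumptions: a complex policy is split into several narrow yes/no sub-prompts, each is answered separately, and the answers are combined into one label. The sub-prompts, the `ask_model` callable, the logical-OR aggregation rule, and the `fake_model` stand-in are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict

# Hypothetical sub-questions; the paper's actual policies and prompts are not given here.
SUB_PROMPTS: Dict[str, str] = {
    "blurry": "Is the video visibly blurry? Answer yes or no.",
    "watermark": "Does the video contain a large watermark? Answer yes or no.",
}


def decompose_and_aggregate(
    video_description: str,
    ask_model: Callable[[str], str],
    sub_prompts: Dict[str, str] = SUB_PROMPTS,
) -> Dict[str, object]:
    """Ask one narrow question per sub-policy, then aggregate the answers.

    Aggregation is a simple logical OR (an assumption): the video is flagged
    if any sub-question is answered "yes".
    """
    answers: Dict[str, bool] = {}
    for name, question in sub_prompts.items():
        reply = ask_model(f"{question}\n\nVideo: {video_description}")
        answers[name] = reply.strip().lower().startswith("yes")
    return {"violation": any(answers.values()), "answers": answers}


def fake_model(prompt: str) -> str:
    """Stand-in for a real GPT call so the sketch runs offline."""
    question = prompt.split("\n")[0]
    if "blurry" in question and "out of focus" in prompt:
        return "Yes"
    return "No"


if __name__ == "__main__":
    print(decompose_and_aggregate("The clip is out of focus throughout.", fake_model))
```

In practice `ask_model` would wrap whatever GPT endpoint and video-frame input format is in use. A single-prompt baseline would instead send the full policy in one request, which is the approach the abstract reports as weaker.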

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Image Processing Techniques · Machine Learning and Data Classification