When ChatGPT for Computer Vision Will Come? From 2D to 3D

Chenghao Li; Chaoning Zhang

arXiv:2305.06133·cs.CV·May 11, 2023·1 cites

TL;DR

This paper reviews the progress of deep learning in text, image, and 3D vision, discusses the evolution of AIGC, and provides an outlook on developing a ChatGPT-like model for 3D computer vision.

Contribution

It offers a comprehensive overview of deep learning advancements across modalities and explores future directions for AIGC in 3D vision, highlighting the need for a unified model.

Findings

01

Deep learning has significantly advanced NLP, image, and 3D fields.

02

The evolution of AIGC is discussed from the data perspective.

03

Future development of 3D AIGC requires new model architectures.

Abstract

ChatGPT and its improved variant GPT-4 have revolutionized the NLP field with a single model solving almost all text-related tasks. However, such a model for computer vision does not exist, especially for 3D vision. This article first provides a brief overview of the progress of deep learning in the text, image, and 3D fields from the model perspective. Moreover, this work further discusses how AIGC evolves from the data perspective. On top of that, this work presents an outlook on the development of AIGC in 3D from the data perspective.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications