ChatBCG: Can AI Read Your Slide Deck?
Nikita Singh, Rob Balian, Lukas Martinelli

TL;DR
This paper evaluates how accurately GPT-4o and Gemini Flash-1.5 interpret data in business slide decks, and finds that neither can yet read complex or unlabeled charts reliably enough to process a deck end-to-end.
Contribution
It provides a systematic assessment of state-of-the-art multimodal models' capabilities in reading and interpreting business slide deck charts, highlighting their current shortcomings.
Findings
The models read only 7-8 out of 15 labeled charts perfectly.
Performance drops significantly with unlabeled or complex charts.
Current models are not yet reliable for end-to-end slide deck comprehension.
Abstract
Multimodal models like GPT-4o and Gemini Flash are exceptional at inference and summarization tasks, where they approach human-level performance. However, we find that these models underperform humans on very specific 'reading and estimation' tasks, particularly for visual charts in business decks. This paper evaluates the accuracy of GPT-4o and Gemini Flash-1.5 in answering straightforward questions about data on labeled charts (where data is clearly annotated on the graphs) and unlabeled charts (where data is not clearly annotated and has to be inferred from the x- and y-axes). We conclude that these models aren't currently capable of reading a deck accurately end-to-end if it contains any complex or unlabeled charts. Even if a user created a deck of only labeled charts, the model would only be able to read 7-8 out of 15 labeled charts perfectly…
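To make the labeled/unlabeled distinction concrete, here is a minimal Python sketch of how answers to such chart questions could be scored. It is not the authors' evaluation code: the `ChartQuestion` record, the 10% relative tolerance for estimated values, and the rule that a chart counts as "read perfectly" only when every question about it is answered correctly are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ChartQuestion:
    """Hypothetical record for one question asked about one chart."""
    chart_id: str
    labeled: bool     # True if the value is printed on the chart itself
    expected: float   # ground-truth value
    answer: float     # value returned by the model


def is_correct(q: ChartQuestion, rel_tol: float = 0.10) -> bool:
    """Labeled values must match; unlabeled values may be off by rel_tol (assumed)."""
    if q.labeled:
        return math.isclose(q.answer, q.expected)
    return math.isclose(q.answer, q.expected, rel_tol=rel_tol)


def charts_read_perfectly(questions, rel_tol: float = 0.10):
    """Return {"labeled": (perfect, total), "unlabeled": (perfect, total)}.

    A chart counts as perfect only if every question about it is correct.
    """
    per_chart = {}
    for q in questions:
        key = ("labeled" if q.labeled else "unlabeled", q.chart_id)
        per_chart[key] = per_chart.get(key, True) and is_correct(q, rel_tol)

    summary = {"labeled": (0, 0), "unlabeled": (0, 0)}
    for (kind, _chart_id), perfect in per_chart.items():
        done, total = summary[kind]
        summary[kind] = (done + int(perfect), total + 1)
    return summary
```

Calling `charts_read_perfectly` on a list of `ChartQuestion` records yields a count of the form "perfect charts out of total charts" for each chart type, which is the shape of the "7-8 out of 15" figure quoted in the abstract.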
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
Methods: Attention Is All You Need · Byte Pair Encoding · Cosine Annealing · Layer Normalization · Linear Layer · Weight Decay · Softmax · Multi-Head Attention · Dense Connections
