AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering
Mahiro Ukai, Shuhei Kurita, Atsushi Hashimoto, Yoshitaka Ushiku, Nakamasa Inoue

TL;DR
AdaCoder is an adaptive prompt compression framework for visual question answering that reduces prompt length by 71.1% without sacrificing performance, using a two-phase approach with a frozen LLM.
Contribution
The paper introduces an adaptive prompt compression method for visual programmatic models (VPMs) that requires no additional training and is compatible with various large language models.
Findings
Reduces prompt length by 71.1%
Maintains or improves VQA performance
Works with multiple black-box LLMs
Abstract
Visual question answering aims to provide responses to natural language questions given visual input. Recently, visual programmatic models (VPMs), which generate executable programs to answer questions through large language models (LLMs), have attracted research interest. However, they often require long input prompts to provide the LLM with sufficient API usage details to generate relevant code. To address this limitation, we propose AdaCoder, an adaptive prompt compression framework for VPMs. AdaCoder operates in two phases: a compression phase and an inference phase. In the compression phase, given a preprompt that describes all API definitions in the Python language with example snippets of code, a set of compressed preprompts is generated, each depending on a specific question type. In the inference phase, given an input question, AdaCoder predicts the question type and chooses…
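The two phases described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the function names, the compression instruction text, the question-type labels, and the `llm` and `classify` callables are all hypothetical stand-ins. The abstract is truncated after "chooses", so the sketch assumes the predicted question type is used to look up the matching compressed preprompt.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a frozen LLM: takes a prompt string, returns text.
LLM = Callable[[str], str]


def compress_preprompts(
    full_preprompt: str, question_types: List[str], llm: LLM
) -> Dict[str, str]:
    """Compression phase (sketch): build one compressed preprompt per question type."""
    compressed = {}
    for qtype in question_types:
        # Assumed instruction wording; the paper's actual prompt is not given here.
        instruction = (
            f"Keep only the API definitions and examples needed for "
            f"'{qtype}' questions:\n{full_preprompt}"
        )
        compressed[qtype] = llm(instruction)
    return compressed


def generate_program(
    question: str,
    compressed: Dict[str, str],
    classify: Callable[[str], str],
    llm: LLM,
) -> str:
    """Inference phase (sketch): predict the question type, then prompt the LLM."""
    qtype = classify(question)      # predicted question type
    preprompt = compressed[qtype]   # assumed selection step (abstract is truncated)
    return llm(f"{preprompt}\n# Question: {question}\n")
```

Because the compression phase runs once per question type rather than once per question, its cost is paid up front, and each inference call only sends the shorter, type-specific preprompt.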
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
Methods: Attention Is All You Need · Sparse Evolutionary Training · Cosine Annealing · Adam · Linear Layer · Byte Pair Encoding · Layer Normalization · Softmax · Dense Connections
