MetaReflection: Learning Instructions for Language Agents using Past Reflections
Priyanshu Gupta, Shashank Kirtania, Ananya Singha, Sumit Gulwani, Arjun Radhakrishna, Sherry Shi, Gustavo Soares

TL;DR
MetaReflection is an offline reinforcement learning method that improves language agents by leveraging past experiences to enhance performance across diverse complex tasks with fewer LLM calls.
Contribution
It introduces MetaReflection, a novel offline reinforcement learning approach that augments semantic memory for language agents, enabling performance improvements without online reflection.
Findings
Boosts agent performance by up to 16.82% over a GPT-4 baseline
Performs comparably to state-of-the-art prompt optimization techniques
Requires fewer LLM calls for effective performance
Abstract
The popularity of Large Language Models (LLMs) has unleashed a new age of Language Agents for solving a diverse range of tasks. While contemporary frontier LLMs are capable enough to power reasonably good Language Agents, the closed-API model makes it hard to improve them in cases where they perform sub-optimally. To address this, recent works have explored ways to improve their performance using techniques like self-reflection and prompt optimization. Unfortunately, techniques like self-reflection can be used only in an online setup, while contemporary prompt optimization techniques are designed and tested to work on simple tasks. To this end, we introduce MetaReflection, a novel offline reinforcement learning technique that enhances the performance of Language Agents by augmenting a semantic memory based on experiential learnings from past trials. We demonstrate the efficacy of MetaReflection by…
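The abstract does not spell out the procedure, so the following is only a rough Python sketch of the general idea it describes: reflections from past failed trials are folded, offline, into a small semantic memory of instructions, which is then prepended to the agent prompt at inference time. Every name here (build_semantic_memory, dedupe_merge, augment_prompt, the trial tuple layout) is hypothetical and not taken from the paper. The merge step is a plain de-duplication stand-in; a real system would presumably use an LLM call to generalize the reflections.

```python
from typing import Callable, List, Tuple

# Hypothetical trial record: (task description, succeeded?, reflection text).
Trial = Tuple[str, bool, str]
MergeFn = Callable[[List[str], str], List[str]]


def dedupe_merge(memory: List[str], reflection: str) -> List[str]:
    """Stand-in merge step (assumption): add a reflection unless an
    identical one, ignoring case, is already stored."""
    normalized = reflection.strip()
    seen = {rule.lower() for rule in memory}
    if normalized and normalized.lower() not in seen:
        return memory + [normalized]
    return memory


def build_semantic_memory(past_trials: List[Trial], merge: MergeFn) -> List[str]:
    """Offline pass: fold reflections from failed trials into a rule list."""
    memory: List[str] = []
    for _task, succeeded, reflection in past_trials:
        if succeeded:
            continue  # only failed trials contribute learnings in this sketch
        memory = merge(memory, reflection)
    return memory


def augment_prompt(base_prompt: str, memory: List[str]) -> str:
    """Inference time: prepend the learned rules; no online reflection runs."""
    if not memory:
        return base_prompt
    rules = "\n".join(f"- {rule}" for rule in memory)
    return f"{base_prompt}\n\nInstructions learned from past trials:\n{rules}"


if __name__ == "__main__":
    trials = [
        ("convert 3 km to miles", False, "Check units."),
        ("sum a list", True, "unused"),
        ("convert 5 kg to lb", False, "check units."),
        ("answer a history question", False, "Cite sources."),
    ]
    memory = build_semantic_memory(trials, dedupe_merge)
    print(augment_prompt("You are a helpful agent.", memory))
```

Because the memory is built once from logged trials, inference in this sketch needs only a single prompt per task, which is in the spirit of the "fewer LLM calls" finding listed above.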
Taxonomy
Topics: Speech and dialogue systems · Multi-Agent Systems and Negotiation · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections
