Reflexion - Reflective Agent

AI Agent Self-Reflection Framework

Basic Information

  • Type: AI Agent Self-Reflection Framework
  • Paper: "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
  • Authors: Noah Shinn et al.
  • Publication: NeurIPS 2023
  • Current Status: Continuously evolving, with new variants like MAR and ERL emerging in 2025

Paradigm Description

Reflexion is an innovative framework that reinforces language agents through verbal feedback (rather than weight updates). After task failure, the agent generates verbal reflections, which are stored in an episodic memory buffer to guide better decision-making in subsequent attempts. This "verbal reinforcement learning" approach enables the agent to learn from mistakes and self-improve without gradient updates.

Core Components

  • Actor: Generates text and actions based on state observations, executes tasks, and produces trajectories
  • Evaluator: Scores the outputs and trajectories generated by the Actor
  • Self-Reflection: Generates verbal reinforcement cues to provide feedback for future attempts
  • Episodic Memory: Stores reflection texts and trajectories for use in subsequent attempts
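The interaction of these four components can be sketched as a simple retry loop. This is a minimal illustration, not the paper's reference implementation; the `llm` and `evaluate` callables and the prompt wording are hypothetical stand-ins:

```python
def reflexion_loop(task, llm, evaluate, max_attempts=3):
    """Minimal Reflexion loop: act, evaluate, reflect, retry with memory."""
    memory = []  # episodic memory: verbal reflections from failed attempts
    for attempt in range(max_attempts):
        # Actor: generate a trajectory, conditioned on past reflections
        trajectory = llm(f"Task: {task}\nPast reflections: {memory}\nSolve step by step.")
        # Evaluator: score the trajectory (binary success here for simplicity)
        success, feedback = evaluate(trajectory)
        if success:
            return trajectory
        # Self-Reflection: turn the failure into a verbal lesson
        reflection = llm(f"Task: {task}\nFailed attempt: {trajectory}\n"
                         f"Feedback: {feedback}\nWhat went wrong and what to try next?")
        memory.append(reflection)  # stored for the next attempt
    return None  # all attempts exhausted
```

Note that the only state carried between attempts is the list of verbal reflections; the underlying model's weights are never updated.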

Technical Evolution (2025-2026)

  • Multi-Agent Reflexion (MAR): Replaces single-agent self-criticism with structured debates among multiple role-based critics; the multiple critics generate richer reflections that guide agent improvement more effectively
  • Process-Supervised Reflexion: Trains a unified model to explicitly follow the "generate → critique → improve" reasoning trajectory, using a dataset of 200,000 structured self-correction examples
  • Experiential Reflective Learning (ERL): Builds a reusable pool of heuristic strategies by reflecting on past experiences, capturing effective strategies and failure patterns for efficient self-improvement
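The multi-critic idea behind MAR can be sketched as follows. This is a hypothetical interface: the role names, prompts, and single-pass synthesis step are illustrative assumptions, and MAR's actual debate protocol is more elaborate:

```python
def multi_agent_reflect(task, trajectory, llm,
                        roles=("planner", "tester", "domain expert")):
    """Sketch of MAR-style reflection: several role-based critics each
    critique a failed trajectory, and a synthesizer merges their critiques
    into one verbal reflection for episodic memory."""
    critiques = [
        llm(f"As a {role}, critique this failed attempt at '{task}':\n{trajectory}")
        for role in roles
    ]
    # Synthesize the debate into a single concise lesson
    return llm("Merge these critiques into one concise lesson:\n" + "\n".join(critiques))
```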

Performance

  • Significant improvement in decision-making tasks (AlfWorld)
  • Excellent performance in reasoning Q&A (HotPotQA)
  • Competitive performance in Python programming (HumanEval)

Relationship with Other Paradigms

  • vs ReAct: Reflexion adds cross-attempt self-reflection and memory on top of ReAct
  • vs CoT: CoT focuses on single-instance reasoning, while Reflexion focuses on learning across multiple attempts
  • vs Traditional RL: Reflexion uses verbal feedback instead of numerical rewards and weight updates

Relationship with the OpenClaw Ecosystem

The Reflexion paradigm is crucial for OpenClaw's individual agents. Personal agents can learn from failed tasks through the reflection mechanism, performing better in subsequent similar tasks. OpenClaw can implement an episodic memory system for agents, storing reflections and lessons from past tasks, enabling agents to continuously improve personalized service quality over time.
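Such an episodic memory system might look like the following sketch. The class and its API are hypothetical, not part of OpenClaw; it simply illustrates storing reflections per task type and recalling them before a new attempt:

```python
class EpisodicMemory:
    """Hypothetical per-agent episodic memory: stores verbal reflections
    keyed by task type and recalls relevant lessons for future attempts."""

    def __init__(self, max_per_task=5):
        self.max_per_task = max_per_task
        self.store = {}  # task_type -> list of reflection strings

    def add(self, task_type, reflection):
        lessons = self.store.setdefault(task_type, [])
        lessons.append(reflection)
        # Keep only the most recent lessons to bound prompt size
        del lessons[:-self.max_per_task]

    def recall(self, task_type):
        return list(self.store.get(task_type, []))
```

Bounding the number of stored lessons per task type keeps the reflections that are injected into prompts short, which matters when memory accumulates over a long-lived personal agent.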

External References

Learn more from these authoritative sources: