Reflexion - Reflective Agent

AI Agent Self-Reflection Framework

Basic Information

  • Type: AI Agent Self-Reflection Framework
  • Paper: "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
  • Authors: Noah Shinn et al.
  • Publication: NeurIPS 2023
  • Current Status: Continuously evolving, with new variants like MAR and ERL emerging in 2025

Paradigm Description

Reflexion is an innovative framework that reinforces language agents through verbal feedback (rather than weight updates). After task failure, the agent generates verbal reflections, which are stored in an episodic memory buffer to guide better decision-making in subsequent attempts. This "verbal reinforcement learning" approach enables the agent to learn from mistakes and self-improve without gradient updates.

Core Components

  • Actor: Generates text and actions based on state observations, executes tasks, and produces trajectories
  • Evaluator: Scores the outputs and trajectories generated by the Actor
  • Self-Reflection: Generates verbal reinforcement cues to provide feedback for future attempts
  • Episodic Memory: Stores reflection texts and trajectories for use in subsequent attempts
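The interaction of these four components can be sketched as a simple retry loop. This is a minimal illustration, not the paper's reference implementation; the `llm` and `evaluate` callables and the prompt wording are hypothetical stand-ins:

```python
def reflexion_loop(task, llm, evaluate, max_attempts=3):
    """Minimal Reflexion loop: act, evaluate, reflect, retry with memory."""
    memory = []  # episodic memory: verbal reflections from failed attempts
    for attempt in range(max_attempts):
        # Actor: generate a trajectory, conditioned on past reflections
        trajectory = llm(f"Task: {task}\nPast reflections: {memory}\nSolve step by step.")
        # Evaluator: score the trajectory (binary success here for simplicity)
        success, feedback = evaluate(trajectory)
        if success:
            return trajectory
        # Self-Reflection: turn the failure into a verbal lesson
        reflection = llm(f"Task: {task}\nFailed attempt: {trajectory}\n"
                         f"Feedback: {feedback}\nWhat went wrong and what to try next?")
        memory.append(reflection)  # stored for the next attempt
    return None  # all attempts exhausted
```

Note that the only state carried between attempts is the list of verbal reflections; the underlying model's weights are never updated.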

Technical Evolution (2025-2026)

  • Multi-Agent Reflexion (MAR): Replaces single-agent self-criticism with structured debates among multiple role-based critics; the multiple critics generate richer reflections that guide agent improvement more effectively
  • Process-Supervised Reflexion: Trains a unified model to explicitly follow the "generate → critique → improve" reasoning trajectory, using a dataset of 200,000 structured self-correction examples
  • Experiential Reflective Learning (ERL): Builds a reusable pool of heuristic strategies by reflecting on past experiences, capturing effective strategies and failure patterns for efficient self-improvement
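The multi-critic idea behind MAR can be sketched as follows. This is a hypothetical interface: the role names, prompts, and single-pass synthesis step are illustrative assumptions, and MAR's actual debate protocol is more elaborate:

```python
def multi_agent_reflect(task, trajectory, llm,
                        roles=("planner", "tester", "domain expert")):
    """Sketch of MAR-style reflection: several role-based critics each
    critique a failed trajectory, and a synthesizer merges their critiques
    into one verbal reflection for episodic memory."""
    critiques = [
        llm(f"As a {role}, critique this failed attempt at '{task}':\n{trajectory}")
        for role in roles
    ]
    # Synthesize the debate into a single concise lesson
    return llm("Merge these critiques into one concise lesson:\n" + "\n".join(critiques))
```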

Performance

  • Significant improvement in decision-making tasks (AlfWorld)
  • Excellent performance in reasoning Q&A (HotPotQA)
  • Competitive performance in Python programming (HumanEval)

Relationship with Other Paradigms

  • vs ReAct: Reflexion adds cross-attempt self-reflection and memory on top of ReAct
  • vs CoT: CoT focuses on single-instance reasoning, while Reflexion focuses on learning across multiple attempts
  • vs Traditional RL: Reflexion uses verbal feedback instead of numerical rewards and weight updates

Relationship with the OpenClaw Ecosystem

The Reflexion paradigm is crucial for OpenClaw's individual agents. Personal agents can learn from failed tasks through the reflection mechanism, performing better in subsequent similar tasks. OpenClaw can implement an episodic memory system for agents, storing reflections and lessons from past tasks, enabling agents to continuously improve personalized service quality over time.
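Such an episodic memory system might look like the following sketch. The class and its API are hypothetical, not part of OpenClaw; it simply illustrates storing reflections per task type and recalling them before a new attempt:

```python
class EpisodicMemory:
    """Hypothetical per-agent episodic memory: stores verbal reflections
    keyed by task type and recalls relevant lessons for future attempts."""

    def __init__(self, max_per_task=5):
        self.max_per_task = max_per_task
        self.store = {}  # task_type -> list of reflection strings

    def add(self, task_type, reflection):
        lessons = self.store.setdefault(task_type, [])
        lessons.append(reflection)
        # Keep only the most recent lessons to bound prompt size
        del lessons[:-self.max_per_task]

    def recall(self, task_type):
        return list(self.store.get(task_type, []))
```

Bounding the number of stored lessons per task type keeps the reflections that are injected into prompts short, which matters when memory accumulates over a long-lived personal agent.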

External References

Learn more from these authoritative sources: