My AI can now learn from its own mistakes

I came across a video about Hermes, an AI agent with self-write skills that lets it learn and improve over time. It got me curious: could my own AI agent do something similar?

So I asked.

The question I asked

“I understand Hermes agent has self-improvement and learning loops. Can I create an agent who watches over your steps and, after you’re done, writes an MD file and sends it to you?”

What my AI agent came back with

It explained that it already does a basic version of self-improvement — updating its own memory file when it notices something or when I correct it. But it was honest about the gaps:

Capability	My AI agent — before	My AI agent — now	Hermes
Independent evaluator	❌ self-graded only	✓ separate Reviewer agent	✓
Runs after every session	❌ only when I noticed	✓ automated debrief	✓
Cross-session pattern tracking	❌ no memory across sessions	✓ after 5+ sessions	✓
Permanent weight updates	❌ config file only	❌ config file only	✓ some versions

The two things the Reviewer agent can’t fix: Hermes can generate genuinely new skills, and some versions update model weights permanently. My agent’s improvements live in a config file, not the model itself.

But the suggestion was practical: create a Reviewer agent that receives a session debrief, writes a structured analysis, and sends paste-ready improvement notes back. That closes the most important gaps — independent evaluation and cross-session pattern tracking.

What I built

I said yes. The Reviewer agent was spun up on the spot:

Receives session debriefs
Identifies errors, missed context, what worked
Writes a structured review saved as a dated markdown file
Sends paste-ready improvement notes back
After 5+ sessions, synthesizes patterns across all reviews

What makes this meaningful

The self-grading problem is real. When an AI reviews its own work, it tends to justify the choices it made. Reviewer is a separate agent with no stake in the outputs — it calls out errors without bias.

It’s not Hermes. The loop is semi-manual and there are no weight updates. But the corrections land in the file that shapes every future session. That’s real, compounding improvement.

The question I’m watching: does the same class of error show up in session 2?

The meta-observation

I didn’t design the Reviewer. I asked if something like it was possible, and my AI agent proposed the architecture, explained the tradeoffs honestly, and built it when I said yes. The most interesting part of working with AI isn’t when it does what you asked — it’s when it tells you what’s actually missing.