One memory for AI agents: Claude Code, Codex and Hermes in the same context // DRAGOCZ

AI tools can write code, refactor projects, analyze logs and handle long technical conversations. The problem is not capability anymore. The problem is memory.

Claude Code does not automatically know what Codex solved. Codex does not know what Hermes learned. And when you switch between tools all day, you keep explaining the same things again: what the project is, how you work, what is in progress, what must not break and how the result should be verified.

That was the annoying part. So I built a layer that works as shared memory for my AI agents.

What I wanted to solve

This was not about building another notes app. I wanted memory that agents actually use while working.

If one tool learns a repository rule, the next one should not rediscover it from scratch. If Hermes knows that I prefer short reports and no markdown tables in Discord, Codex should not ignore that. If a project runs over several days, memory should carry the useful context across sessions, not just across one chat.

The goal was simple:

one memory for multiple agents,
separated contexts by project and use case,
no mixing of customer data,
fast retrieval of relevant facts,
visibility into what the system actually knows.

Shared memory instead of isolated chats

The principle is simple: agents do not only produce answers, they also produce usable context.

Claude Code can discover a rule in a repository. Codex can finish a technical refactor. Hermes can capture a preference or operational note in Discord. All of that goes into one memory layer, where it becomes small, searchable facts.

The next task does not load the entire history. That would be slow and messy. Instead, the system pulls only the memories that matter: project conventions, preferences, last known state, important constraints, verification steps.
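
As a rough illustration of that selection step, here is a minimal sketch. It assumes each memory is a small record tagged with a project and a kind; the `Fact` shape, the `select_context` helper and the sample data are hypothetical, not the actual service or Mem0's API.

```python
from dataclasses import dataclass

# Hypothetical record shape; field names are assumptions for illustration.
@dataclass(frozen=True)
class Fact:
    project: str
    kind: str        # "convention", "preference", "state", "constraint", "verification"
    text: str
    updated_at: int  # larger means newer

WANTED_KINDS = ("convention", "preference", "state", "constraint", "verification")

def select_context(facts, project, per_kind=2):
    """Return the newest `per_kind` facts of each wanted kind for one project."""
    selected = []
    for kind in WANTED_KINDS:
        matching = [f for f in facts if f.project == project and f.kind == kind]
        matching.sort(key=lambda f: f.updated_at, reverse=True)
        selected.extend(matching[:per_kind])
    return selected

# Made-up sample data for two unrelated projects.
facts = [
    Fact("shop", "convention", "Use pytest, not unittest.", 3),
    Fact("shop", "state", "Checkout refactor is half done.", 5),
    Fact("shop", "state", "Checkout refactor started.", 1),
    Fact("blog", "preference", "Short reports, no markdown tables.", 4),
]

context = select_context(facts, project="shop", per_kind=1)
for fact in context:
    print(f"[{fact.kind}] {fact.text}")
```

Only the newest state for the current project makes it into the prompt; the older state and the other project's preference stay out.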

That is the difference between an agent that starts from zero every day and an agent that remembers how work is done here.

Stack underneath

At the level I can share publicly, it runs on a custom memory service built around Mem0, a Python API and vector search. On top of that sits a thin integration layer used by agents and the dashboard.

In practice:

AI tools    → Claude Code, Codex, Hermes
API layer   → Python / FastAPI
Memory      → Mem0
Retrieval   → vector database
History     → relational storage + change audit
UI          → dashboard for memory, categories and activity
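
To make the retrieval row concrete, here is a toy sketch of similarity search. It is an assumption-heavy stand-in: `embed` is a bag-of-words counter rather than a real embedding model, the list is a plain Python list rather than a vector database, and none of the names come from Mem0.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(memories, query, top_k=2):
    """Return up to `top_k` memories most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(embed(text), query_vec), text) for text in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:top_k] if score > 0]

# Made-up memories for illustration.
memories = [
    "run the test suite before every commit",
    "reports in discord should be short",
    "never use markdown tables in discord reports",
]

results = search(memories, "how should discord reports look", top_k=2)
for text in results:
    print(text)
```

A real setup would swap `embed` for a model-backed embedding and the list scan for an index, but the ranking idea is the same: score by similarity, keep the top few, drop everything unrelated.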

The important part is not that there is a database. The important part is that the memory has structure. It is not a pile of random notes glued into a prompt.

Memories are separated by categories and use cases. Some belong to development projects, some to preferences, some to operations, some to specific customer channels. That keeps the system clean and, more importantly, isolated: what belongs to one project or customer flow does not leak into another.

Why isolation matters

Shared memory is dangerous if implemented badly, and the concern that context could leak between projects is valid.

I do not want an agent working on one project to pull context from another customer area. I do not want customer conversations to become general development memory. And I do not want internal operational details to end up in public writing.

That is why the memory is separated by domains. Development preferences live apart from customer notes. Project state is not the same type of information as communication style. The agent gets only the context that makes sense for the current task.
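
One way to picture that separation is a store that hands each task a view limited to the domains it is allowed to read. This is a minimal sketch under that assumption; `ScopedMemory`, `MemoryView`, the domain names and the "acme" customer are all hypothetical.

```python
class ScopedMemory:
    """In-memory store where every fact belongs to exactly one domain."""

    def __init__(self):
        self._facts = {}  # domain -> list of fact strings

    def add(self, domain, text):
        self._facts.setdefault(domain, []).append(text)

    def view(self, allowed_domains):
        """Return a read-only view limited to the domains a task may see."""
        return MemoryView(self._facts, frozenset(allowed_domains))


class MemoryView:
    """Read access restricted to an explicit allow-list of domains."""

    def __init__(self, facts, allowed):
        self._facts = facts
        self._allowed = allowed

    def read(self, domain):
        if domain not in self._allowed:
            raise PermissionError(f"domain {domain!r} is outside this task's scope")
        return list(self._facts.get(domain, []))


store = ScopedMemory()
store.add("dev/preferences", "Prefer small commits.")
store.add("customer/acme", "Contact prefers email.")  # hypothetical customer

# A development task only gets the development domain.
dev_view = store.view(["dev/preferences"])
print(dev_view.read("dev/preferences"))

try:
    dev_view.read("customer/acme")
except PermissionError as err:
    print("blocked:", err)
```

The design choice here is deny by default: a task has to be granted a domain explicitly, so a forgotten filter fails loudly instead of leaking customer notes into a coding session.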

That, to me, is the difference between “AI remembers something” and memory that is actually usable in production work.

Dashboard as a brain map

The screenshot above is a visualization of the memory layer. Not because you need to stare at a pretty graph every time an agent works. But with a system that remembers things, it matters to see what is happening inside.

The map shows nodes, categories, agents, activity and relationships between memories. You can quickly see which areas are active, what is used often and where clusters are forming. It is a bit like observability for context.
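
The numbers behind such a map can be fairly plain. As a sketch, assuming the service logs one event per memory access as an (agent, category, memory id) tuple, which is an invented format for illustration, the counts could be derived like this:

```python
from collections import Counter

# Hypothetical activity log: (agent, category, memory_id) per access.
events = [
    ("Codex", "development", "m1"),
    ("Claude Code", "development", "m1"),
    ("Hermes", "preferences", "m2"),
    ("Codex", "development", "m3"),
]

def summarize(events):
    """Count activity per category, per memory, and per agent-category edge."""
    by_category = Counter(category for _, category, _ in events)
    by_memory = Counter(memory_id for _, _, memory_id in events)
    edges = Counter((agent, category) for agent, category, _ in events)
    return by_category, by_memory, edges

by_category, by_memory, edges = summarize(events)

print("most active category:", by_category.most_common(1))
print("most used memory:", by_memory.most_common(1))
for (agent, category), weight in edges.items():
    print(f"{agent} -> {category} (weight {weight})")
```

Category counts can drive node size, memory counts show what is used often, and the edge weights say which agent is active where.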

Without that, memory is a black box. With it, you can debug it.

What changed in practice

The biggest difference is not technical. It is how the work feels.

When I switch between tools now, I do not have to explain everything again. The agent knows my preferences, active projects, expected output style and the rules it should respect. Not perfectly, and human review still matters, but enough that work no longer starts from zero every time.

Typical examples:

Codex works on code while respecting project conventions.
Claude Code can continue with the same context without a manual briefing.
Hermes knows how to write reports in Discord and what not to mix together.
Long-running projects keep useful state across days.

This is exactly the type of infrastructure that does not look like a feature at first. But once you have it, you do not want to go back.

A small step toward a personal operating system

I see this as another part of my own working OS.

Not one chatbot. Not one editor. More like a set of agents with shared memory, shared rules and the ability to continue each other’s work.

AI tools without memory are fast but forgetful. AI tools with shared memory start to feel like a team.

And that is where it gets interesting.