
What Is Anneal

Christian Battaglia · January 1, 2026 · 10 min read

The Problem That Wouldn't Let Go

On March 6th, 2025, a problem started keeping me up at night.

I'd been using AI tools every day for over a year at that point. Writing code with them, thinking through architecture with them, leaning on them the way you lean on a colleague who's always available and never tired. And somewhere in that daily reliance, a pattern emerged that I couldn't ignore: every conversation started from zero.

The AI didn't know that I'd already made a decision about the database. It didn't remember that I'd corrected it three times about how I name my variables. It didn't carry forward the context from yesterday's session, or last week's, or the project I'd been grinding on for months. Every interaction was a first date.

This wasn't a new observation. Everyone who uses AI seriously has bumped into it. But what kept nagging at me wasn't the problem itself. It was the shape of the solution that nobody seemed to be building.

The Problem Nobody Was Framing Correctly

The AI industry's answer to "my AI doesn't remember anything" has been, broadly, two things: bigger context windows and bolt-on memory features.

Bigger context windows are brute force. Stuff more tokens in, hope the model figures out what matters. It works until it doesn't, and it doesn't work when you need the AI to understand that the claim at token 47,000 contradicts the claim at token 12,000, or that the thing you said on Tuesday replaced the thing you said on Monday.

Bolt-on memory features (ChatGPT's Memory, Claude's Projects, Cursor's rules files) are organizational metaphors dressed up as intelligence. They let you attach things. They don't understand things. There's no awareness of what's current versus stale, no propagation when facts change, no scoping for who should see what. They're filing cabinets with an AI label.

The more I sat with this, the more I realized: this isn't a machine learning problem. Nobody needs a new model architecture or a novel neural network to solve this. The problem is state management. It's scope isolation. It's consistency guarantees and conflict resolution and change propagation. It's the kind of problem that database engineers, distributed systems architects, and protocol designers have been solving for decades.

The AI industry just wasn't looking at it that way.

The Input Token Bloat Problem

There's a cost to not having memory, and it's measurable.

Every time an AI conversation starts from zero, the user has to re-establish context. Paste in the document again. Re-explain the project structure. Remind the system of decisions already made. Across the industry, this manifests as input token bloat: the growing share of every API call that's spent on context the system should already know.

The scale is staggering. Here are the mid-2026 trajectory estimates for global AI token throughput:

Metric                    Daily Estimate (mid-2026)      Notes
Global AI output tokens   ~176 trillion                  Approaching human daily written output
Global AI input tokens    ~350-700+ trillion             2-4x output in mixed workloads
Total daily tokens        ~500 trillion-1 quadrillion    Several times human output scale

A significant fraction of those input tokens are waste. They're context that gets re-injected because nothing remembered it from last time. They're system prompts that grow longer every month because the only way to give AI "memory" is to cram more text into the window. They're entire documents pasted into conversations because the alternative is the AI hallucinating about what the document says.

This isn't just a cost problem (though the cost is real; input tokens aren't free). It's an architectural smell. When the dominant pattern for making AI smarter is "send more text every time," something fundamental is missing from the stack. The entire industry is paying a tax on the absence of structured memory.

Anneal's position is that the right memory architecture doesn't just improve the user experience. It reduces the input token footprint by an order of magnitude, because the system already knows what it knows. You stop paying to re-explain.
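The economics of that claim are easy to sketch. The numbers below are illustrative assumptions (call volume, tokens per call, and price are hypothetical, not Anneal measurements), but they show how re-injected context compounds into a monthly bill:

```python
# Back-of-envelope estimate of what re-injected context costs.
# All numbers are illustrative assumptions, not Anneal measurements.

def monthly_reinjection_cost(
    calls_per_day: int,
    reinjected_tokens_per_call: int,
    price_per_million_input_tokens: float,
) -> float:
    """Cost of input tokens spent re-establishing context the system
    could already have remembered, over a 30-day month."""
    tokens = calls_per_day * reinjected_tokens_per_call * 30
    return tokens / 1_000_000 * price_per_million_input_tokens

# A team making 2,000 calls/day, pasting ~8,000 tokens of repeated
# context per call, at a hypothetical $3 per million input tokens:
baseline = monthly_reinjection_cost(2_000, 8_000, 3.0)    # $1,440/month
with_memory = monthly_reinjection_cost(2_000, 800, 3.0)   # 10x smaller footprint

print(f"${baseline:,.0f} -> ${with_memory:,.0f}")  # → $1,440 -> $144
```

The specific figures don't matter; the shape does. Shrinking the re-injected share shrinks the bill linearly, which is why a 10x reduction in context footprint translates directly into a 10x reduction in that portion of spend.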

Not Machine Learning. Architecture.

I want to be clear about what this work is and what it isn't.

Anneal is not a breakthrough in machine learning. There are no training runs. No novel neural network architectures. No papers about attention mechanisms or retrieval-augmented generation. There's no GPU cluster behind it. The large language models that power the AI you already use are remarkable, and they keep getting better. This work doesn't compete with that. It builds on top of it.

What Anneal is, at its core, is solutions architecture.

It's the discipline of designing systems that manage state correctly: structured facts that evolve over time, corrections that propagate through dependency chains, scope boundaries that are cryptographically enforced rather than policy-enforced, authority hierarchies that determine which source of truth wins when two facts conflict.

These are well-understood problems. Databases solved consistency decades ago. Version control solved branching and merging. Cryptographic protocols solved verifiable computation. The creative act here wasn't inventing new primitives. It was recognizing that the space between humans and AI was missing the kind of rigorous state management that every other layer of the software stack takes for granted, and then building it.

If you've ever designed a database schema, thought carefully about cache invalidation, or debugged a race condition in a distributed system, you already understand the thinking behind this work. The novelty is in the application, not the theory.

What It Actually Does

Anneal sits between you and whatever AI you're using. It's a layer, not a replacement. Your model stays the same. Your tools stay the same. Anneal adds the memory that those tools should have had from the beginning.

Four layers of memory, each with a different purpose:

Identity is who you are in context. Your name, your role, your relationship to the work. It's stable unless you change it, and it's scoped (the AI you use at work knows your professional context; the one you use personally knows your personal context; they don't cross).

Persistent Facts are the things the AI learns about you over time. Preferences, decisions, corrections. When you tell the AI you prefer Tailwind over Bootstrap, that fact persists. When you correct it, the correction supersedes the old fact, and anything the AI previously concluded based on the old fact gets flagged for review. Facts don't just accumulate; they evolve.

Working Context is the current conversation, the active task, the thread you're in right now. It's ephemeral by design. Not everything needs to be remembered forever; some things just need to be held in focus for the moment.

Environment is the world around the conversation. Time, deadlines, external signals. The AI knows it's 2 AM and you're probably not looking for a detailed architecture review. It knows the sprint ends Friday. It knows the deployment window is in three hours.
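To make the layering concrete, here is a minimal sketch of how the four layers might compose into a single context block for an LLM call. The layer names follow the post; the data model, assembly order, and rendering format are hypothetical, not Anneal's actual interface:

```python
# Minimal sketch of assembling the four memory layers into one context
# block for an LLM call. Layer names follow the post; the data model
# and assembly order are hypothetical.
from dataclasses import dataclass

@dataclass
class MemoryLayers:
    identity: dict          # stable, scoped: who you are in this context
    persistent_facts: list  # evolving facts; corrections supersede
    working_context: list   # ephemeral: the current task or thread
    environment: dict       # time, deadlines, external signals

    def assemble(self) -> str:
        """Render the layers in a fixed order for prompt injection."""
        parts = [
            "## Identity",
            *(f"{k}: {v}" for k, v in self.identity.items()),
            "## Persistent facts",
            *self.persistent_facts,
            "## Environment",
            *(f"{k}: {v}" for k, v in self.environment.items()),
            "## Working context",
            *self.working_context,
        ]
        return "\n".join(parts)

mem = MemoryLayers(
    identity={"name": "Sam", "role": "backend engineer"},
    persistent_facts=["prefers Tailwind over Bootstrap"],
    working_context=["reviewing the auth middleware"],
    environment={"local_time": "02:00", "sprint_ends": "Friday"},
)
print(mem.assemble())
```

The point of the separation is lifecycle, not formatting: identity rarely changes, persistent facts evolve, working context is discarded, and environment is refreshed on every call.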

When a fact changes, Anneal doesn't just update one record. It traces the dependency graph: what conclusions were built on that fact, what other facts referenced it, what downstream reasoning might now be wrong. Stale information doesn't quietly resurface weeks later because nobody told the system it was outdated. This is what "living memory" means. Not a list of bullet points. A graph of knowledge that maintains its own consistency.
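The supersession-and-flagging behavior can be sketched in a few lines. This is a toy model under stated assumptions (the class, method names, and fact IDs are hypothetical, not Anneal's API), but it shows the core move: a corrected fact transitively flags every conclusion that rested on it:

```python
# Sketch of supersession plus dependency propagation: when a fact is
# corrected, anything concluded from the old fact is flagged for review.
# Class and method names here are hypothetical, not Anneal's interface.

class FactGraph:
    def __init__(self):
        self.facts = {}         # fact id -> text
        self.stale = set()      # ids flagged for review
        self.derived_from = {}  # conclusion id -> supporting fact ids

    def assert_fact(self, fid, text, based_on=()):
        self.facts[fid] = text
        if based_on:
            self.derived_from[fid] = set(based_on)

    def supersede(self, old_id, new_id, text):
        """Replace old_id with new_id and flag everything downstream."""
        self.facts[new_id] = text
        self.facts.pop(old_id, None)
        self._flag_dependents(old_id)

    def _flag_dependents(self, fid):
        for cid, supports in self.derived_from.items():
            if fid in supports and cid not in self.stale:
                self.stale.add(cid)
                self._flag_dependents(cid)  # conclusions of conclusions, too

g = FactGraph()
g.assert_fact("f1", "project uses Postgres")
g.assert_fact("c1", "use psycopg in the data layer", based_on=["f1"])
g.assert_fact("c2", "connection pool sized for Postgres", based_on=["c1"])
g.supersede("f1", "f2", "project moved to SQLite")
print(sorted(g.stale))  # → ['c1', 'c2']
```

Note that `c2` never referenced `f1` directly; it gets flagged because the chain is traversed transitively. That transitivity is the difference between updating a record and maintaining a graph.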

Who This Is For

The honest answer is: everyone who uses AI. If you've ever re-explained something to a chatbot for the third time in a week, this work is relevant to you.

But the architecture was designed with harder problems in mind.

Defense engineers working in SCIFs and air-gapped environments where data physically cannot leave the room. For them, the alternative to local, self-contained AI memory is no AI memory at all. There's no cloud to phone home to. There's no API call to a memory service. The intelligence layer has to live on the machine, fully functional, completely offline.

Healthcare developers building under HIPAA, where patient data touching an AI's context window isn't a policy question; it's a legal one. The memory system needs scope-level encryption, verifiable deletion, and audit trails that would satisfy a regulator, not just a product manager.

Financial engineers under SOC2 and PCI, where the code that touches sensitive schemas can't traverse an uncontrolled network path. Architecture has to be the guarantee, not policy.

And, again, everyone else. People who just want their AI to remember that they prefer dark mode, that they already decided on the project structure, that they corrected it last Thursday and shouldn't have to correct it again.

Why "Anneal"

In metallurgy, annealing is the process of heating metal and then cooling it slowly. The heat breaks apart the existing crystal structure (the disorder, the stress, the brittleness). The slow cooling lets a new, stronger structure form. The metal becomes more durable, more flexible, more resilient. Not by adding anything to it, but by giving it the time and conditions to reorganize itself.

That metaphor is the whole philosophy of this project.

AI right now is brittle. It hallucinates because it doesn't remember what's true. It contradicts itself because it doesn't track what changed. It frustrates users because every conversation starts cold. The disorder is structural.

The fix isn't more parameters or bigger context windows. It's the patient, methodical work of giving AI the right structure for memory: consistent, evolving, scoped, verifiable. Heat and patience. That's what annealing is.

The name is also a nod to simulated annealing, an optimization technique that uses randomness and gradual cooling to find globally optimal solutions rather than getting stuck in local minima. There's something fitting about that: the willingness to explore broadly before settling into a solution, rather than optimizing the first thing that sort of works.

A Body of Work

This isn't a weekend project that turned into a startup pitch. It's a body of work that started with a question in March of 2025 and has been built, piece by piece, in the months since.

State management primitives. Scope isolation protocols. Supersession tracking (what replaced what, and what depends on what). Authority hierarchies (when two sources disagree, which one wins). Cryptographic commitments for verifiable context assembly. Tamper-evident audit trails. A benchmark suite for evaluating memory systems against each other.
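Of those pieces, the authority hierarchy is the easiest to show in miniature. The sketch below resolves a conflict by ranking sources and using recency only as a tiebreaker; the rank order and source names are illustrative assumptions, not Anneal's actual hierarchy:

```python
# Sketch of an authority hierarchy: when two sources assert conflicting
# values for the same fact, the higher-authority source wins, with
# recency as the tiebreaker. Ranks and source names are assumptions.

AUTHORITY_RANK = {        # higher number wins
    "user_correction": 3,
    "user_statement": 2,
    "ai_inference": 1,
}

def resolve(claims):
    """claims: list of (source, timestamp, value) for one fact key."""
    return max(claims, key=lambda c: (AUTHORITY_RANK[c[0]], c[1]))[2]

claims = [
    ("ai_inference",   1, "prefers camelCase"),
    ("user_statement", 2, "prefers snake_case"),
    ("ai_inference",   3, "prefers camelCase"),  # newer, but lower authority
]
print(resolve(claims))  # → prefers snake_case
```

The design choice worth noticing: recency alone is the wrong rule. A newer AI inference should not overwrite an older explicit user statement, which is why authority ranks before timestamps in the sort key.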

Each piece informs the next. The scope isolation work revealed edge cases in supersession tracking. The authority hierarchy design changed how deletion works. The benchmark suite exposed failure modes that reshaped the core memory architecture. It's iterative, interlocking work. The kind that can't be rushed because the pieces have to fit.

I think of it as an oeuvre in the original sense of the word (I'd say "opus," but Anthropic has the trademark on that one): a body of work that coheres around a central question. The question here is simple: what should the relationship between humans and AI memory look like? Everything in this project is an attempt to answer that honestly.

This Is Not a Startup Pitch. It's an Experiment.

Let me be direct about something: this is not a fundraising deck dressed up as a blog post. There's no waitlist. There's no "request early access" button. The demo is live, right now, and it's free.

I'm paying for your tokens. And it will stay that way, because the architecture described in this post is exactly what solves the cost problem. When the system remembers what it knows, you stop paying to re-explain it every time. The input token bloat disappears. The economics work precisely because the memory works.

What I'm asking for instead of money is something more interesting: participation.

The demo supports three modes. Solo is your personal AI memory. Team is shared intelligence with a group. Global is a social experiment: every person who enters Global mode contributes to a single, living, collective memory. Multilingual support is already enabled, so it works in English, Spanish, French, Mandarin, whatever you speak. The AI remembers across all of it.

I want to see what happens when strangers from around the world build shared knowledge together. What does a global memory look like after an hour? After a day? After a week of people correcting each other, teaching each other, contributing what they know?

I genuinely don't know the answer. That's the point. This is the kind of question you can't answer with a whiteboard or a pitch deck. You answer it by building the thing and letting people use it.

So send this to someone. Send it to someone who doesn't speak your language. Send it to someone who thinks about AI differently than you do. The more diverse the input, the more interesting the experiment.

The Race I'm Starting on Purpose

A reasonable question: if this architecture is valuable, why publish it? Why not keep it quiet until it's locked down?

Because keeping it quiet would miss the point. Anneal has to be LLM-agnostic. It has to work with every model, every provider, every tool. The moment it gets acquired by one of the big players, it becomes another lock-in mechanism. Another way to eat your data, keep you on their platform, make switching costs unbearable. I built this for me, and then for you. Not for them.

I have no intention of selling to OpenAI, Anthropic, Google, or anyone else in that tier. Your AI. Your way. That's not a tagline; it's a design constraint.

(Of course, given OpenAI's cozy relationship with GitHub, they've theoretically had access to this code since the first commit. So, to be explicit: this work is licensed, it is not donated, and I have lawyers who know the difference. If any of this ends up inside a product without permission, we'll have that conversation in a courtroom, not a boardroom.)

Could someone recreate this work? It's not impossible. The individual pieces (state management, scope isolation, supersession tracking, cryptographic guarantees) are well-understood disciplines. But assembling them into a coherent intelligence layer that actually works across models, across tools, across languages, at production quality, with the right privacy guarantees? That's a year of interlocking architecture decisions where each piece reshapes the next. It's not a weekend project.

By putting this into the world, I've started a race. I know that. And I plan on keeping the lead.

There's a passage in Gödel, Escher, Bach where Hofstadter revisits Zeno's paradox through the story of Achilles and the Tortoise. The paradox says Achilles can never overtake the Tortoise, because by the time he reaches where it was, it's already moved ahead. The resolution, of course, is that the infinite series converges: Achilles catches up and passes. But Hofstadter's deeper point is about the nature of self-referential systems and strange loops: how things that seem paradoxically uncatchable turn out to be reachable when you shift the frame of reference.

I think about that a lot. The big companies have more resources, more people, more compute. But they're also solving a different problem. They're optimizing models. They're scaling inference. They're building the Achilles. The intelligence layer above the model (the part that makes AI actually remember, actually learn, actually respect boundaries) is the Tortoise. It's a different race entirely. And while they're busy converging on the same set of model improvements, this work keeps moving.

Open Source, Copyleft, and Keeping the Lead

The plan is to open source Anneal under AGPL-3.0.

Not MIT. Not Apache. AGPL.

The distinction matters. MIT and Apache are permissive licenses; they let anyone take your work, close it up, and sell it back to you without contributing a thing. That's fine for libraries and utilities. It's not fine for an intelligence layer that's supposed to belong to its users. AGPL says: you can use this, you can modify this, you can build on this, but if you run it as a service, you share your changes. The copyleft provision ensures that improvements flow back to the community, not into a proprietary fork at some company that wants the architecture without the ethos.

Let's copyleft the theft out of our software.

As it stands today, the engine comprises over 50 base primitives across 14 layers of state, and counting. Scope isolation, supersession chains, authority hierarchies, cryptographic commitments, fact dependency graphs, multi-store views, portable memory serialization. Each primitive is tested, each layer is documented, and each piece was designed to compose with the others. This isn't a monolith; it's a toolkit, and AGPL means it stays a shared one.
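Portable memory serialization, for instance, is less exotic than it sounds. A minimal sketch, assuming a self-describing JSON envelope (the `format`/`version`/`scope` schema here is hypothetical, not Anneal's wire format):

```python
# Sketch of portable memory serialization: a fact store rendered to a
# self-describing JSON document that another runtime can reload.
# The envelope schema (format, version, scope) is hypothetical.
import json

def export_memory(scope: str, facts: dict) -> str:
    return json.dumps(
        {"format": "memory-export", "version": 1, "scope": scope, "facts": facts},
        sort_keys=True,  # canonical key order keeps exports byte-stable
    )

def import_memory(blob: str) -> dict:
    doc = json.loads(blob)
    assert doc["format"] == "memory-export" and doc["version"] == 1
    return doc["facts"]

facts = {"f2": "project moved to SQLite"}
assert import_memory(export_memory("work", facts)) == facts
```

The version field and canonical ordering are what make the format portable: any runtime that speaks the envelope can round-trip the memory without the original system present.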

I also plan on raising capital. Not to sell the company, but to keep the collective "our" lead here. (And for the record: I'm in excellent health and great spirits. Just in case any of this makes the wrong people nervous.) To hire people who care about this problem. To fund the infrastructure that keeps the demo free. To move faster on the SDK, the CLI, the self-hosted deployment story. Capital in service of the work, not the other way around.

But more than funding, what this needs is a movement.

The AI industry has a trust problem. Users hand over their data, their preferences, their corrections, their intellectual fingerprints, and they get back a terms-of-service agreement and a "trust us." This project is built on the opposite premise: trust through architecture, trust through transparency, trust through copyleft. You don't have to take my word for how your data is handled. You can read the source. You can verify the cryptographic commitments. You can run it yourself.
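What verifying a commitment looks like in practice is simple. The sketch below hashes a canonical serialization of the assembled context and checks the digest later; a production scheme would add salting or Merkle structure, and this is only the shape of the idea, not Anneal's actual protocol:

```python
# Sketch of a verifiable context commitment: hash the exact context that
# was assembled, publish the digest, and let anyone recompute it later.
# A real scheme would add salts/Merkle structure; this shows the shape.
import hashlib
import json

def commit(context: dict) -> str:
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

context = {"identity": "Sam", "facts": ["prefers Tailwind"]}
digest = commit(context)

# Later, anyone holding the same context can verify nothing was swapped:
assert commit(context) == digest
assert commit({**context, "facts": ["prefers Bootstrap"]}) != digest
```

That is the sense in which trust moves from policy to architecture: the claim "this is the context the AI saw" becomes checkable arithmetic rather than a promise.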

I want to build this community around that kind of trust. Not the trust you extend to a brand because it has nice marketing, but the trust you extend to a system because you've seen the code and it does what it says.

If any of this resonates with how you think about software, about AI, about the careful work of making complex systems behave correctly, I'd genuinely like to hear from you. This work is better when more people are building it together.


Try the demo to see living memory in action. Read about the platform architecture, our approach to privacy, or the vision for where this goes.