Apps that update themselves: what Codex Sites shows about the future of software

June 9, 2026 · 6 min read · Claudio Branno

Codex Sites is not a website builder. It is a model for software that agents can operate: safe actions, persistent memory, and skills that let future agents update live applications without human involvement.

Every software product today has a maintenance burden. Content goes stale. Data changes. Features need updating. Someone has to open the CMS, the codebase, or the admin panel and make it happen. The human is always in the loop because the product cannot operate itself.

Codex Sites is a different model. The apps update themselves.

This is not just a headline claim. It is the concrete capability that Greg Isenberg demonstrated on June 4: building an internal product in six prompts, adding persistent storage, exposing safe actions, creating a reusable skill, and then proving the loop by invoking a command from a new chat thread and watching the live application update in real time.

What Codex Sites actually is

Codex Sites is OpenAI's internal app builder, built on top of the Codex environment. It is not a visual website builder like Lovable or Bolt, and not a hosted deployment platform like Replit. It is a capability inside Codex: if you live in the Codex ecosystem, you can build and host internal apps directly within it, and those apps can be modified autonomously through the same agent you use for everything else.

The key distinction: Codex Sites apps are designed to be operated by agents, not just built by them. The goal is a product that agents can keep running, updating, and modifying on your behalf.

The five concepts that make it work

Memory: persistent storage. Without it, the app resets on every visit and you are left with a static interface with no state. You have to prompt for it explicitly; Cloudflare D1 is the default option.
Safe actions: the approved mutations an agent can make to the application. Instead of arbitrary database access, you define a bounded set of named operations: add idea, move card, update score. The agent can only invoke those specific operations.
Skills: reusable instruction sets that tell the agent how to operate the application in future sessions. Without a skill, a new chat thread has no context for how the app works. With a skill, it can operate the application correctly from the first message.
Save gates: version checkpoints before deployment. Codex does not auto-save. A named checkpoint is the equivalent of a git commit — a durable restore target if something breaks.
Proving the loop: the validation test — opening a new chat thread, invoking the skill, and watching the agent update the live application. If everything is set up correctly, the agent modifies the product autonomously without developer involvement.
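To make the memory and safe-action ideas concrete, here is a minimal sketch in Python. It is not the Codex Sites API. The names `SAFE_ACTIONS` and `invoke`, and the `cards` table, are hypothetical, and the standard-library `sqlite3` module stands in for Cloudflare D1. The point it illustrates is the boundary: the agent never gets a raw database handle, only a short list of named operations.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the app's persistent store.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cards ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
    "column_name TEXT NOT NULL, score INTEGER NOT NULL)"
)

# Hypothetical board columns; a real app would define its own.
ALLOWED_COLUMNS = {"ideas", "doing", "done"}


def add_idea(title: str) -> int:
    """Create a card in the 'ideas' column and return its id."""
    cur = db.execute(
        "INSERT INTO cards (title, column_name, score) VALUES (?, 'ideas', 0)",
        (title,),
    )
    db.commit()
    return cur.lastrowid


def move_card(card_id: int, column: str) -> None:
    """Move an existing card to one of the allowed columns."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column}")
    cur = db.execute(
        "UPDATE cards SET column_name = ? WHERE id = ?", (column, card_id)
    )
    if cur.rowcount == 0:
        raise LookupError(f"no card with id {card_id}")
    db.commit()


def update_score(card_id: int, score: int) -> None:
    """Set the score of an existing card."""
    cur = db.execute("UPDATE cards SET score = ? WHERE id = ?", (score, card_id))
    if cur.rowcount == 0:
        raise LookupError(f"no card with id {card_id}")
    db.commit()


# The bounded set of named operations an agent is allowed to call.
SAFE_ACTIONS = {
    "add_idea": add_idea,
    "move_card": move_card,
    "update_score": update_score,
}


def invoke(action: str, **params):
    """Dispatch an agent request, rejecting anything outside the approved set."""
    handler = SAFE_ACTIONS.get(action)
    if handler is None:
        raise PermissionError(f"'{action}' is not an approved safe action")
    return handler(**params)
```

In this sketch an agent request such as `invoke("move_card", card_id=1, column="doing")` goes through, while `invoke("drop_table", name="cards")` raises `PermissionError` before any SQL runs.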

The real limitations

Codex Sites has meaningful limitations right now. There are no custom domains: apps deploy to auto-generated URLs. The setup is more technical than Lovable or Replit, requiring an understanding of authentication, databases, and how to prompt for memory and safe actions explicitly. And apps are currently internal, not designed for public deployment.

These gaps are likely to close over time. Custom domains, easier storage setup, and smoother public deployment are predictable product improvements.

What the architecture implies

The more important observation is not about the current limitations. It is about what the model implies.

The paradigm of human-maintained software assumes that someone is always responsible for keeping the product current. The paradigm that Codex Sites is beginning to demonstrate is different: a product structured so that an agent can operate it. The agent can add content, update entries, move items, respond to new information, and modify the application based on rules you defined — without a human opening an editor.

For operators building AI-first products, the implication is architectural. If you build for human operation, you create a maintenance burden. If you build for agent operation — with explicit safe actions, skills, and memory — you create a product that compounds without requiring your direct involvement.

Where to focus now

Two practical moves for operators and builders watching this space.

First, understand the difference between a product built by an agent and a product operated by an agent. Building is the familiar part. Operating is the new part: defining safe actions, writing skills, building in memory, and proving that a future agent can work with the product without your involvement.
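One way to picture "proving that a future agent can work with the product" is as a small smoke test. This is a hedged sketch, not how Codex Sites implements skills. The `SKILL` dictionary, `run_session`, and `prove_the_loop` are hypothetical names, and a file-backed SQLite database stands in for the app's persistent memory. A simulated fresh session knows only the skill and where storage lives, performs one approved action, and a separate read confirms the change persisted.

```python
import os
import sqlite3
import tempfile

# Hypothetical skill: instructions plus the action names a future session may use.
SKILL = {
    "name": "content-board",
    "instructions": "Use add_idea to record a new idea. Never write to the database directly.",
    "allowed_actions": ["add_idea"],
}


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the persistent store that acts as the app's memory."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ideas (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def run_session(path: str, skill: dict, action: str, **params) -> None:
    """Simulate a brand-new chat thread: no prior context except the skill."""
    if action not in skill["allowed_actions"]:
        raise PermissionError(f"skill '{skill['name']}' does not allow '{action}'")
    conn = open_store(path)
    try:
        if action == "add_idea":
            conn.execute("INSERT INTO ideas (title) VALUES (?)", (params["title"],))
            conn.commit()
    finally:
        conn.close()


def list_ideas(path: str) -> list:
    """Read the live state back through a separate connection."""
    conn = open_store(path)
    try:
        return [row[0] for row in conn.execute("SELECT title FROM ideas ORDER BY id")]
    finally:
        conn.close()


def prove_the_loop() -> list:
    """Run one fresh session against temporary storage and return what persisted."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.db")
        run_session(path, SKILL, "add_idea", title="Ship changelog page")
        return list_ideas(path)
```

If the returned list contains the new idea, the loop holds in this toy setting: the write happened in one connection, with no shared in-process state, and was visible from another.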

Second, identify one internal tool where autonomous operation would create the most leverage. Something that currently requires a human to update on a schedule: a status dashboard, a content board, a client tracker. That is the right place to start applying this model.

The tools are early. The patterns are becoming clear. The operators who start building for agent operation now will be running a different class of product by the end of 2026.