The AI agent stack that is working in 2026: a practical guide

May 29, 2026 · 7 min read · Claudio Branno

Hermes, Obsidian vaults, Composio, Telegram, and model routing — the stack that is working for real agent businesses in 2026, not in theory.

The AI agent market moved fast in the first half of 2026. Hermes Agent went from launch to the top of the OpenRouter leaderboard in 90 days. Background tasks, self-improving loops, and multi-agent Kanban boards went from experimental to practical. The stack that works has crystallized.

This is that stack. Not theory. These are the tools that people running real customers on real agent businesses are using today.

Agent harness: Hermes Agent

The agent harness is the brain. It executes tasks, manages memory, interfaces with your messaging platform, and orchestrates everything else.

Why Hermes: two reasons, in order of importance.

First, reliability. OpenClaw breaks on almost every update. You spend more time fixing broken installs than using the agent. Hermes ships fewer updates, but they do not break things. When you are managing customer environments, uptime matters more than new features.

Second, the self-improving loop. After every complex task, Hermes reflects on what worked, extracts the reusable pattern, and writes a skill file. These skill files accumulate. Your agent gets better at your specific workflows over time in a way no other harness does natively.

Hermes is also model-agnostic. It works with Anthropic, OpenAI, Grok, Ollama, GLM, and local models. You are not locked into one provider. This matters because agentic workflows are expensive. Being able to route tasks to the cheapest appropriate model changes the economics significantly.

If you are already on OpenClaw: Hermes includes a migration path. It can detect an existing OpenClaw installation and import your settings, memories, skills, and API keys. The friction to switch is lower than most people assume.

Memory layer: Obsidian vault

The harness is only as useful as the context you give it. An agent with no context is a powerful tool that starts from zero every time. An agent with a well-structured memory is an employee that already knows your business.

The memory layer is an Obsidian vault — a local folder of markdown files, structured around the entities that matter for your work.

Recommended structure:

/people/ — one note per person. Name, role, context, history, how you work together.
/projects/ — one note per active project. Current status, decisions made, next steps, key stakeholders.
/clients/ — who they are, what they need, recurring context, preferences.
/workflows/ — how recurring tasks get done. Step by step. Your preferences.
/meetings/ — notes, ideally auto-synced from Granola or your note-taking tool of choice.
/preferences/ — communication style, formatting preferences, what good looks like for each type of output.
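
As a rough sketch, the folder layout above can be scaffolded with a few lines of Python. The folder names come from the list; the function name `scaffold_vault` and the README stub written into each folder are assumptions made for illustration, not something Obsidian or Hermes requires.

```python
from pathlib import Path

# Folder names follow the recommended structure; the one-line descriptions
# are paraphrased, and the README stub is an illustrative assumption.
VAULT_FOLDERS = {
    "people": "One note per person: name, role, context, history.",
    "projects": "One note per active project: status, decisions, next steps.",
    "clients": "Who they are, what they need, recurring context.",
    "workflows": "How recurring tasks get done, step by step.",
    "meetings": "Meeting notes, ideally synced from your note-taking tool.",
    "preferences": "Communication style and formatting preferences.",
}


def scaffold_vault(root: Path) -> list[Path]:
    """Create the folder layout under root and return the folders."""
    created = []
    for name, description in VAULT_FOLDERS.items():
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        readme = folder / "README.md"
        if not readme.exists():  # never overwrite a note that already exists
            readme.write_text(f"# {name}\n\n{description}\n", encoding="utf-8")
        created.append(folder)
    return created
```

Calling `scaffold_vault(Path.home() / "vault")` twice is safe: existing folders and notes are left untouched.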

Why Obsidian over Notion: markdown files are natively readable by AI. No API dependency. No authentication complexity. Local and private. The agent can traverse linked notes and understand entity relationships. And unlike any SaaS tool, the files are yours permanently.
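
To make "traverse linked notes" concrete, here is a minimal sketch of how any script could follow Obsidian-style `[[wikilinks]]` from one note to its neighbours. How Hermes does this internally is not documented here; the function names and the assumption that note names are unique across folders are illustrative.

```python
import re
from collections import deque
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures Target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text: str) -> list[str]:
    """Return the note names that a piece of markdown links to."""
    return [target.strip() for target in WIKILINK.findall(text)]


def related_notes(vault: Path, start: str, max_depth: int = 1) -> list[str]:
    """Breadth-first walk over wikilinks, starting from the note named start."""
    # Index notes by file stem; assumes note names are unique across folders.
    index = {p.stem: p for p in vault.rglob("*.md")}
    seen, order = {start}, []
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        path = index.get(name)
        if path is None:
            continue  # the link points at a note that does not exist yet
        order.append(name)
        if depth == max_depth:
            continue  # do not follow links beyond the requested depth
        for target in extract_links(path.read_text(encoding="utf-8")):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order
```

Starting from a client note with `max_depth=1` returns that note plus every person and project it links to, which is roughly the context an agent would want before drafting anything for that client.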

Connect Obsidian to Hermes by telling it the vault path. It reads from the vault before every task. It references the relevant entities without being prompted.

Connectors: Composio

An agent that can reason but cannot act on the world is useless. Composio is how the agent reaches everything else.

Composio is a single MCP server that connects to 1000+ apps, including Gmail, Slack, Notion, GitHub, Google Calendar, CRM tools, and most business software. It handles authentication and manages tool calls, so you do not spend weeks setting up individual API connections.

The value is not just the connections. It is the authentication handling. Security is the largest time sink when deploying agents for clients. Composio solves it with one integration. Every connected app is managed through its platform. You do not handle credentials directly.

What you typically connect: email (draft, reply, label), calendar (read, create events), Slack (read, send to specified channels), project management (read and update tasks), and whatever is specific to the customer's business.

Principle of least access: only give the agent the permissions it needs. Draft email but not send. Read calendar but not delete. The more conservative the permissions, the safer the setup. Expand as trust builds.
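
One way to reason about least access is as a deny-by-default allowlist. The sketch below is not Composio's API; `ToolPolicy`, `call_tool`, and the action names are hypothetical and only model the idea that every permission must be granted explicitly.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    """Allowlist of (app, action) pairs; anything not listed is denied."""
    allowed: frozenset[tuple[str, str]] = field(default_factory=frozenset)

    def permits(self, app: str, action: str) -> bool:
        return (app, action) in self.allowed

    def grant(self, app: str, action: str) -> "ToolPolicy":
        # Returns a new policy, so widening access is an explicit, reviewable step.
        return ToolPolicy(self.allowed | {(app, action)})


# Hypothetical action names that mirror the examples in the text.
STARTER_POLICY = ToolPolicy(frozenset({
    ("email", "draft"),
    ("calendar", "read"),
    ("slack", "read"),
}))


def call_tool(policy: ToolPolicy, app: str, action: str) -> str:
    """Refuse any call the policy does not explicitly allow."""
    if not policy.permits(app, action):
        raise PermissionError(f"{app}.{action} is not granted to this agent")
    # A real setup would dispatch to the connector here.
    return f"{app}.{action} allowed"
```

Under `STARTER_POLICY` the agent can draft an email but a send attempt raises `PermissionError`. "Expand as trust builds" then becomes a one-line change such as `STARTER_POLICY.grant("email", "send")`.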

Communication interface: Telegram

You need a way to talk to your agent from anywhere. Telegram is the best interface for this, by a significant margin.

Why Telegram specifically:

Inline approval buttons: the agent asks before it acts. You approve with one tap. This keeps humans in the loop without breaking the workflow.
Tool call transparency: you can see exactly what the agent is doing, step by step. Not a black box.
Mobile-first: you can dispatch tasks from your phone on a walk. The agent works while you are away.
Active AI development: Telegram keeps adding AI-native features. The platform is building toward agents, not away from them.
Reliable API: fewer gateway crashes than competing messaging platforms. When your customer's agent goes down, you need to know fast.
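
For readers curious what an inline approval button looks like on the wire, the sketch below builds the JSON body for the Telegram Bot API `sendMessage` method with an `inline_keyboard`. The payload shape and the 64-byte `callback_data` limit follow the public Bot API; the `approve:` / `reject:` encoding and both function names are assumptions, and nothing here describes how Hermes wires its own buttons.

```python
def approval_request(chat_id: int, action_id: str, summary: str) -> dict:
    """Build a sendMessage payload with Approve / Reject inline buttons."""
    approve, reject = f"approve:{action_id}", f"reject:{action_id}"
    for data in (approve, reject):
        # The Bot API limits callback_data to 1-64 bytes.
        if not 1 <= len(data.encode("utf-8")) <= 64:
            raise ValueError("callback_data must be 1-64 bytes")
    return {
        "chat_id": chat_id,
        "text": f"Agent wants to: {summary}",
        "reply_markup": {
            "inline_keyboard": [[
                {"text": "Approve", "callback_data": approve},
                {"text": "Reject", "callback_data": reject},
            ]]
        },
    }


def parse_decision(callback_data: str) -> tuple[bool, str]:
    """Turn the tapped button's callback_data back into (approved, action_id)."""
    verdict, _, action_id = callback_data.partition(":")
    if verdict not in ("approve", "reject") or not action_id:
        raise ValueError(f"unexpected callback_data: {callback_data!r}")
    return verdict == "approve", action_id
```

The agent would hold the pending action until a callback arrives, then run it only if `parse_decision` returns `True` for that action's id.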

Set up a watchdog: configure the agent to auto-restore if its Telegram gateway crashes. Have it email you when a cron job fails. You should not be manually checking agent health — the agent should tell you.
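
A watchdog of this kind can be quite small. The following is a minimal sketch, assuming you can supply three callables of your own: a health check, a restart command, and an alert channel such as email. The name `watchdog_tick` and the linear backoff are illustrative choices, not Hermes settings.

```python
import time
from typing import Callable


def watchdog_tick(
    is_healthy: Callable[[], bool],
    restart: Callable[[], None],
    alert: Callable[[str], None],
    max_restarts: int = 3,
    backoff_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run one health check; restart with growing delays and alert on the outcome.

    Returns True if the gateway is healthy when the function returns.
    """
    if is_healthy():
        return True
    for attempt in range(1, max_restarts + 1):
        restart()
        sleep(backoff_seconds * attempt)  # linear backoff before re-checking
        if is_healthy():
            alert(f"gateway recovered after {attempt} restart(s)")
            return True
    alert(f"gateway still down after {max_restarts} restarts")
    return False
```

Run it from a scheduler (cron, a systemd timer, or a loop) every minute or so. The `sleep` parameter is injectable so the logic can be tested without waiting.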

Model selection

The most common mistake: using your most expensive model for everything. Agentic workflows are token-intensive. If you run every task on Opus, your costs scale fast.

The routing approach:

GPT 5.5: best for most tasks. Efficient tool calls, solid output across writing, research, and general work.
Opus 4.7: for complex long-horizon coding, deep reasoning, and tasks where judgment quality matters more than cost.
Local models (Qwen 36, GLM 5.1): for lightweight, recurring, low-stakes tasks. Zero marginal cost on decent hardware.
Grok 4.3 with OAuth: for real-time X/Twitter search and research when you have an X subscription.

The rule: match the model to the task complexity, not your preference.
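
The routing rule can be written down as a small function. This is a sketch under stated assumptions: the `Task` flags and tier names are hypothetical, the model labels are copied from the list above rather than from any provider's identifier scheme, and a real router would likely use richer signals than four booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL = "local"        # lightweight, recurring, low-stakes work
    DEFAULT = "default"    # most tasks
    PREMIUM = "premium"    # long-horizon coding, deep reasoning
    REALTIME = "realtime"  # live X/Twitter search


@dataclass(frozen=True)
class Task:
    description: str
    needs_live_x_search: bool = False
    long_horizon: bool = False
    low_stakes: bool = False
    recurring: bool = False


def route(task: Task) -> Tier:
    """Pick a tier from task traits, checking special needs before cost."""
    if task.needs_live_x_search:
        return Tier.REALTIME
    if task.long_horizon:
        return Tier.PREMIUM
    if task.low_stakes and task.recurring:
        return Tier.LOCAL
    return Tier.DEFAULT


# Labels as named in the article; the identifiers a provider expects may differ.
MODEL_FOR_TIER = {
    Tier.LOCAL: "GLM 5.1",
    Tier.DEFAULT: "GPT 5.5",
    Tier.PREMIUM: "Opus 4.7",
    Tier.REALTIME: "Grok 4.3",
}
```

For example, `route(Task("label inbox", low_stakes=True, recurring=True))` lands on the local tier, while an untagged task falls through to the default model.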

Task management: background tasks and Kanban

Two Hermes features changed how multi-tasking works.

/background command: assign multiple tasks to run in the background while you have a normal conversation with your agent. You can run 5 background tasks and still interact with the agent in real time.
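
The general pattern behind background tasks is ordinary concurrency: start the work, keep the foreground free, collect results later. The sketch below shows that pattern with Python's `asyncio`. It is an analogy, not Hermes internals; `background_job` and `session` are hypothetical names and the sleep stands in for real agent work.

```python
import asyncio


async def background_job(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)  # stand-in for real agent work
    return f"{name} done"


async def session() -> tuple[str, list[str]]:
    # Start five jobs without awaiting them, so the conversation stays responsive.
    jobs = [asyncio.create_task(background_job(f"task-{i}", 0.05)) for i in range(5)]

    # The foreground reply is produced while the jobs are still running.
    pending = sum(not job.done() for job in jobs)
    reply = f"chat reply sent with {pending} jobs pending"

    results = await asyncio.gather(*jobs)  # collect results once they finish
    return reply, list(results)
```

Running `asyncio.run(session())` returns the chat reply first in the tuple, followed by the five results in the order the jobs were started.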

Auto-Kanban (Tenacity release): drop a goal into the Kanban triage. Hermes breaks it into subtasks automatically and assigns them to sub-agents. You wake up to completed subtasks.

The /goal command keeps the agent anchored to a long-term objective across turns. Agents often drift toward the most recent instruction. /goal prevents that.

Putting it together: the configuration is the work

The infrastructure cost for this full stack is low. Under $100/month if you route models correctly and use local models for lightweight tasks.

The configuration cost is 2–4 weeks of setup, context loading, and testing. The Obsidian vault needs to be built. The connectors need to be tested against real scenarios. The cron jobs need to be designed around actual workflows. The skill files need to accumulate through real use.

The payoff is an agent that:

Runs 24/7 without you monitoring it
Knows your business from the vault
Can reach your email, calendar, Slack, and tools through Composio
Gets better at your specific work every week through skill file accumulation
Alerts you when something breaks before your customer notices

This is not the AI agent promise from two years ago. This is working infrastructure that people are running at scale today.

The setup takes up to a month. The return compounds indefinitely.