The EU Says You Need a Kill Switch by August. Do You Have One?

Post 4 of the AgenticOps series introduced the four containment rings. This post shows what happens when a regulator asks to see them.

You Stopped Verifying. traced how verification gaps compound under pressure. This post is about a different kind of gap. One with a fine attached.

The EU AI Act enters its final enforcement phase on 2 August 2026. High-risk AI system obligations become enforceable: human oversight measures, six-month audit logs, kill switches. Not policies. Technical controls with financial penalties.

Penalties reach 35 million EUR or 7% of global annual turnover for the most serious violations. For high-risk system failures, 15 million EUR or 3% of turnover. That is not a compliance risk. That is a business risk.

I used to think AI governance meant writing the right policies. Define acceptable use. Assign a committee. Review quarterly. I know now that none of that produces a kill switch. And the Act requires one.

What the Act Actually Requires

Article 14 is specific. Natural persons assigned to oversee a high-risk AI system must be able to intervene or interrupt it through a stop button or similar procedure. Not a process that eventually leads to stopping. An actual mechanism. Operable now.

Article 12 requires automatic logging of events relevant to risk identification. Retained for at least six months. Not operational logs. Compliance logs. The kind that tell a regulator what the agent decided, what drove the decision, what it did.

Article 50 applies to all AI systems, not just high-risk ones. If your agent interacts with a natural person, it must disclose that it is an AI. Already applicable. Right now.

The Kiteworks Agents of Chaos study surveyed 225 security, IT, and risk leaders. 60% of organizations cannot terminate a misbehaving agent. 63% cannot enforce purpose limitations. 55% cannot isolate AI systems from network access. Real talk: those are compliance failures under the Act.

diagram

Why Most Organizations Will Miss It

The failure is structural. Organizations built Ring 3 first. Monitoring, logging, anomaly detection. That is the ring they already understood. The Act requires all four rings as technical controls.

First, a kill switch requires a control plane external to the agent runtime. Most organizations deploy agents as independent processes. No unified control plane. No mechanism to halt all instances simultaneously. Building one is a rearchitecting exercise, not a feature flag.

Second, compliance-grade audit logs are not operational logs. An operational log says “called API at 14:32.” A compliance log says “agent decided to escalate case 4521, sentiment 0.23, invoked route_to_queue, status change at 14:32:07Z.” Six months of those. Queryable by a regulator.

Third, purpose limitation as a technical control means the agent cannot access systems outside its intended scope at runtime. 63% of organizations define intended use in documentation. The agent’s actual scope is whatever tools and data it can reach at runtime. Feel me? That gap is what the Act targets.

The compound effect: most organizations have compliance gaps across three of four rings simultaneously. Ring 3 partially covered. Rings 1, 2, and 4 open. The Act requires all four.

The Fix: Map Rings to Requirements

Compliance is containment infrastructure. The four rings are the checklist. Let me give you a specific example.

Ring 1 is purpose limitation. Articles 9 and 13. The technical control is scoping the agent’s inputs before it starts.

# ring-1-purpose-limitation.yaml
agent: customer-escalation-handler
intended_purpose: "Classify and route customer escalation tickets"
tools:
allowed:
- read_ticket
- classify_sentiment
- route_to_queue
denied:
- update_customer_record
- issue_refund
- access_billing
data_scope:
- tickets.escalation_queue
- knowledge_base.routing_rules

When the agent cannot access billing systems, it cannot make billing decisions. Purpose limitation becomes a property of the environment, not the prompt.

Ring 2 is environmental isolation. Articles 9 and 15. Network restrictions, filesystem isolation, process sandboxing. Each boundary enforced by the runtime, not by the agent’s instructions.

Ring 3 is audit logging and transparency. Articles 12 and 50. This is the ring most organizations have partially built. The gap is granularity.

diagram

Ring 4 is the kill switch. Article 14. This is where 60% of organizations fail.

A compliant kill switch has three properties. It is reachable from outside the agent’s execution environment. It halts the agent within a bounded time, not after the task completes. It is operable by non-engineers assigned to oversight.

diagram

The control plane must be separate from the agent runtime. The agent cannot be responsible for its own termination.

RingEU AI Act RequirementArticlesTechnical ControlDeadline
Ring 1Purpose limitation enforced9, 13Tool allowlists, data scope configs2 Aug 2026
Ring 2Risk management operational9, 15Network policies, sandbox configs2 Aug 2026
Ring 3Automatic logging, six-month retention12Structured audit log pipeline2 Aug 2026
Ring 3Transparency disclosure50AI interaction labelingAlready applicable
Ring 4Kill switch and human oversight14Control plane, halt mechanism2 Aug 2026

What This Means for You

Organizations that built containment rings for engineering reasons will reach compliance faster. The rings were designed to contain stochastic processes. The Act requires containment of stochastic processes. The alignment is structural.

If you deployed agents with YAML policy controls, filesystem restrictions, and network allowlists, you are 80% of the way there. The remaining 20% is log enrichment, retention infrastructure, and a kill switch that non-engineers can operate.

Don’t be scared if you have no containment infrastructure. Article 57 requires each EU Member State to establish at least one AI regulatory sandbox by 2 August 2026. Engage early. Build the rings under regulatory guidance. Validate before penalties apply.

The four containment rings were built for engineering discipline. The regulation made them law. The infrastructure either exists or it does not. An auditor asking to see the kill switch will not accept a policy document.

Let’s talk about it.

Agents of Chaos: AI Agent Security Risks (Kiteworks, 2025)

EU AI Act Full Text (EUR-Lex)

You Stopped Verifying.

You Stopped Verifying.

Post 4 of the AgenticOps series introduced the containment rings that keep agents in bounds. This post stress-tests the verification layer that makes containment real.

Forty-two percent of committed code is now AI-generated. Ninety-six percent of developers say they don’t fully trust it. Only forty-eight percent always verify before committing.

Those numbers are from a single survey of over 1,100 developers, published by Sonar in early 2026. Real people. Real codebases. Real behavior.

The gap between what people say they trust and what they actually verify is where the problem lives. Feel me?

The Problem

Generation and verification don’t scale the same way. That’s the whole issue.

AI generation scales with compute. You add more calls, you get more code. Human verification scales with hours in the day. You can’t add more hours without adding more people.

Sonar projects AI-generated code will hit 65% of all commits by 2027. That’s not a capability prediction. It’s an adoption curve. Seventy-two percent of developers who tried these tools already use them daily. The volume is accelerating.

Here’s the math that doesn’t work.

YearAI-Generated CodeVerification RateEffective Coverage
202542% of commits48% always verify~20% of AI code verified
2027 (projected)65% of commitsFlat or declining~15% of AI code verified

Effective coverage is already below 20%. It’s heading lower. Every month the gap widens, the cost to close it grows.

Werner Vogels, AWS CTO, put a name on this at re:Invent 2025. He called it verification debt. The term is precise. Debt compounds. Technical debt is work deferred. Verification debt is trust assumed. Both grow silently.

Why It Breaks

The failure happens in three stages. Most teams are already past stage one.

Stage one: generation outpaces review. The PR queue grows. Reviewers approve faster to keep up. Average review time drops. It feels like efficiency. It is the verification rate declining.

Stage two: trust substitutes for verification. Developers build intuitions about which AI output is “usually right.” They stop reading generated code that looks familiar. They trust the model on boilerplate and test scaffolding. This works until it doesn’t.

Thirty-eight percent of developers say reviewing AI-generated code takes longer than reviewing human-written code. AI code looks plausible. It compiles. It passes basic tests. Catching the defects requires knowing what the code should do, not just what it does.

Stage three: debt compounds. Unverified code becomes load-bearing. Tests get written against its behavior, locking in whatever it does, correct or not. Six months later, a bug surfaces. The trace goes back to a function that was AI-generated, never reviewed, and now has forty callers.

That’s not a debugging problem. That’s a structural failure.

diagram

The left path is where most teams are right now. The right path is what the evaluation layer provides. No amount of developer discipline fixes a rate mismatch between generation and verification. Only automation fixes a rate mismatch.

The Fix

The fix is not “verify more.” That’s telling a drowning person to swim harder. The fix is moving verification from human effort to automated infrastructure.

Layer 3 of the AgenticOps model is the evaluation layer. It has to scale at the same rate as generation. If it doesn’t, the governance model collapses under volume.

Let me give you a specific example of what the throughput difference looks like.

ApproachThroughputCatchesScales With
Human code review~50 LOC/hour deepLogic errors, design flaws, intent mismatchesHeadcount (linear, expensive)
Static analysisThousands of files/minuteStyle violations, antipatterns, type errorsCompute (near-free at scale)
Mutation testingHundreds of functions/hourWeak tests, untested branches, semantic gapsCompute (parallelizable)
Property-based testingThousands of cases/minuteEdge cases, invariant violationsCompute (embarrassingly parallel)

Human review is the only approach that cannot scale with generation volume. It is also the only one most teams rely on exclusively.

The gate sequence: each gate runs automatically, each gate has a pass/fail threshold, and code that fails returns to the agent for correction, not to a human for debugging.

Gate Thresholds

GateMetricThresholdAction on Failure
CoverageLine and branch coverage>= 90%Agent generates additional tests
ComplexityCyclomatic complexity per function<= 10Agent refactors or splits function
CRAPChange Risk Anti-Patterns score<= 8Agent reduces complexity or adds coverage
MutationMutation score (killed / total)>= 85%Agent strengthens test assertions
PropertyProperty test pass rate100%Agent fixes implementation

These are machine-enforced deterministic gates. Not suggestions. Not targets. They block promotion.

When the agent generates code, the pipeline runs. When the pipeline fails, the agent fixes. When the pipeline passes, the code promotes. Humans review the gate configuration, not the code itself.

The review surface shrinks from every line of generated code to the gate definitions. That’s the inversion that makes it scale.

Three things humans still own.

First, gate configuration. What are the thresholds? Are they appropriate for this codebase? Do they need to tighten as the system matures? This is a quarterly review, not a per-commit review.

Second, intent specification. Automated gates verify structural properties. They don’t verify intent. Acceptance tests, written by humans or by agents under human review, bridge that gap.

Third, promotion decisions. Automated gates recommend promotion. Humans approve it. The human is still the final decider, but deciding from evidence instead of from reading code.

Stories from Production

The Sonar Survey Reality Check (Framework Applied)

Sonar’s 2026 survey of over 1,100 developers is the first large-scale dataset quantifying the verification gap in AI-assisted development. These are self-reported numbers from working developers. Not a lab study.

The most telling data point: 38% say reviewing AI code takes longer than reviewing human code. That contradicts the assumption that AI output is easier to review because it follows consistent patterns.

In practice, AI-generated code is harder to review because it looks right. The defects are subtle. Catching them requires knowing what the code should do, not just what it does. That knowledge is exactly what gets lost when generation is fast and context is thin.

The 72% daily usage rate confirms the generation side. Developers who try these tools stay with them. The volume is not going to decrease. Any governance strategy that assumes generation volume stabilizes will fail.

The Math That Breaks (Framework Vision)

Team of eight. Historically, each developer reviews four PRs per day at thirty minutes each. That’s 16 hours of review capacity per day.

AI generation doubles PR volume. The team faces 64 PRs per day instead of 32. Review time increases 38% per PR, matching the Sonar data. The team now needs 44.8 hours of review capacity. They have 16.

Something gives. Either review quality drops, coverage drops, or cycle time extends until the backlog collapses. All three produce verification debt.

Now run the same scenario with automated gates. The pipeline handles 90% of PRs automatically, pass or return-to-agent. Human reviewers see 6.4 PRs per day instead of 64. Each arrives with a verification report. Review time drops because reviewers are evaluating evidence, not reading code.

This scenario hasn’t been validated at full production scale. The individual components are proven. The composition into a unified pipeline that replaces human review as the primary verification mechanism is the open question. But the Sonar data is clear: the current approach is already failing at 42% AI generation. It won’t survive 65%.

Verification Debt Gets Its Name (Framework Applied)

When Vogels named verification debt at re:Invent 2025, the term stuck because it maps to something every engineer already understands. Technical debt is work deferred. Verification debt is trust assumed. Both compound. Both are invisible until they aren’t.

By naming it as a distinct category, Vogels separated verification debt from code quality, test coverage, and security scanning.

A codebase can have 90% test coverage and still carry massive verification debt. If the tests were generated by the same AI that wrote the code, and nobody confirmed the tests validate the right behavior, the coverage number is noise.

Mutation testing is the direct remedy for this specific form of debt. A test suite that kills 85% of mutants has been verified against behavioral changes, not just structural coverage. That’s the difference between “the tests run” and “the tests catch defects.”

The trend line is what matters. Generation is accelerating. Verification is not. The intersection already passed.

The teams that build verification infrastructure now will carry manageable debt. The teams that wait will find out what compound interest looks like in a codebase.

Don’t be scared of the infrastructure cost. Be scared of what happens without it.

Let’s talk about it.

Verification Beats Debugging

How Agents Stay in Bounds

Sonar 2026 AI Developer Survey

40% Will Be Canceled. Not Because the Models Failed.

Post 3 of the AgenticOps series defined the six layers and four containment rings. This post maps Gartner’s projected cancellation drivers to specific gaps in that model.


The Comfortable Take Is Wrong

The take you keep seeing is that AI projects fail because the models are not ready. They hallucinate. They are unreliable. Wait for better models. Feel me? That is the played take. And it is distracting.

Gartner predicts more than 40% of agentic AI projects will be canceled or scaled back by 2027. The cited reasons are escalating costs, unclear business value, and inadequate risk controls.

None of those are model failures. GPT-5, Claude, Gemini will all be more capable in 2027 than they are today.

Real talk: the bottleneck is governance. Or more precisely, the absence of it.

73% of organizations are deploying AI tools right now. Only 7% govern them in real time.

That is a 66-point gap between deployment velocity and governance maturity. And that gap is exactly where the 40% lives.

80% of organizations report risky agent behaviors in production. 15% of daily work decisions will be made by agentic AI by 2028, up from essentially zero in 2024.

The industry is scaling deployment without scaling containment. Gartner’s 40% cancellation rate is not a prediction about models. It is a prediction about what happens when you run stochastic systems without structural boundaries.

Now let me give you a specific example of what is making this worse.

Of the thousands of companies now marketing “agentic AI” capabilities, roughly 130 are real. The rest are agent washing.

They are rebranding chatbots and workflow automations as agentic systems. Organizations buy those products, deploy something that does not need governance, and fail to build governance infrastructure. Then they deploy something that does need it. And discover they have nothing.


Six Failures That Compound

Accelirate analyzed agentic AI governance failures across enterprise deployments. They identified six structural problems. Every one is specific. Every one maps to a gap in the AgenticOps model.

The first failure is no centralized control plane. Teams deploy agents independently. No single system tracks which agents are running, what tools they can reach, or what decisions they make.

The second failure is late governance introduction. Teams build the agent, prove the demo, get funding, start scaling, then discover they need governance. By that point, retrofitting containment into a running system is harder than canceling the project.

The third failure is missing decision traceability. When something goes wrong, no one can reconstruct why the agent chose what it chose. The decision chain is invisible. Debugging becomes archaeology.

The fourth failure is no policy-as-code enforcement. Governance lives in documents. “Agents should not access production data.” But those policies are not enforced by the runtime. They are suggestions. And suggestions do not constrain systems that scale without warning.

The fifth failure is undefined human-in-the-loop thresholds. Everyone agrees humans should stay in the loop. No one defines when. What confidence score triggers escalation? What cost threshold pauses execution? Without thresholds, “human in the loop” is a policy statement with no implementation.

The sixth failure is poor tool differentiation. Agents get broad access because restricting tools is harder than granting them. The result is write access where there should be read access, credentials that should not be held, network reach that is not needed.

These do not happen independently. They cascade.

diagram

Each gap makes the next one harder to close. By the time an organization reaches the sixth failure, the cost of fixing the architecture exceeds the cost of canceling the project. That is Gartner’s 40%.


The Fix Is a Mapping Problem

I want to keep it real with you. The fix is not “add governance.” That sentence is vague enough to produce nothing.

The fix is mapping each failure to the specific layer or ring that prevents it, then building that layer before you need it.

Governance FailureAgenticOps LayerContainment RingWhat Is Missing
No centralized control planeRuntime Governance (L5)Ring 2: Constrain EnvironmentA single registry for all running agents
Late governance introductionIntent (L1)Ring 1: Constrain InputsGovernance requirements in the design, not the incident retro
Missing decision traceabilityEvaluation (L3)Ring 3: Validate OutputsStructured logs with reasoning traces and state changes
No policy-as-code enforcementAgent Generation (L2)Ring 1: Constrain InputsDeclarative policy files the runtime enforces
Undefined HITL thresholdsPromotion (L4)Ring 4: Gate PromotionNumeric thresholds for confidence, cost, and error rate
Poor tool differentiationAgent Generation (L2)Ring 1: Constrain InputsPer-agent tool allowlists, not shared credentials

No driver is exotic. No driver requires a novel solution.

The structural components already exist in every governed agentic system that has reached production.

Stripe’s Minions architecture has all six solved. Devboxes are the control plane and environment constraint. Blueprints define governance at the intent layer. Every tool invocation is logged. Policy enforcement is structural, not advisory. Retry caps define explicit HITL thresholds. Toolshed provides curated, scoped tool access.

Stripe is not in the 40%. The structural reason is visible in the architecture.

Now look at the gap as a shape.

diagram

Every project in that gap is running agents without the infrastructure to govern them. Some will build the infrastructure before it matters. Most will not.


The Diagnostic

Map your project against these six questions. Where you have gaps, you have cancellation risk.

RequirementQuestionPass Criteria
Centralized control planeCan you list every agent running in your organization right now?Single registry with agent identity, status, tool access, and session history
Early governanceWere governance requirements defined before the first agent was deployed?Containment boundaries in the design document, not the incident retrospective
Decision traceabilityCan you reconstruct why an agent made a specific decision last Tuesday?Structured logs with reasoning traces, tool call sequences, and state transitions
Policy-as-codeAre your agent policies enforced by the runtime or written in a wiki?Declarative policy files that the agent cannot override or modify
HITL thresholdsAt what confidence score does your agent escalate to a human?Numeric thresholds for escalation, pause, and termination, enforced automatically
Tool scopingDoes each agent have access only to the tools required for its task?Per-agent tool allowlists, not shared credentials with broad access

Three or more gaps is a project at structural risk.

Five or more gaps matches the profile of the 40% that Gartner predicts will be canceled.

Six gaps is a demo, not a deployment. And that’s the way it is.


Let’s talk about it.

What AgenticOps Actually Looks Like

Autonomy Without Infrastructure Is Just a Demo

Gartner: More Than 40% of Agentic AI Projects to Be Canceled by 2027 (Gartner Symposium/ITxpo 2025)

Accelirate: Agentic AI Governance Challenges and Solutions (accelirate.com)

One Agent Fails. The Whole System Learns the Wrong Lesson.

“How Agents Stay in Bounds” introduced the four containment rings for governing agent behavior. This post applies those rings recursively, at every point where agents communicate with each other.

The Problem

A single agent inside a sandbox is a tractable governance problem.

Constrain its inputs. Constrain its environment. Validate its outputs. Gate its promotions. The four rings work because the blast radius is one agent and the boundaries are visible.

Multi-agent systems break that model. When agents communicate, the channel between them is a trust boundary. Most organizations treat it as internal. That is the structural error that makes cascading failures possible.

OWASP put this in writing with the 2025 Top 10 for Agentic Applications. ASI07 covers insecure inter-agent communication. ASI08 covers cascading failures across agents.

These are not theoretical risks cataloged for completeness. They describe failure modes that emerge specifically when agents pass instructions, data, and decisions to each other without validation at the boundary.

The problem is not that one agent fails. The problem is that one agent fails and every downstream agent treats the corrupted output as trusted input. The failure propagates through the system as valid data.

By the time a human notices, the corrupted state has been persisted, acted upon, and used as a training signal for future decisions.

Why It Breaks

Lakera analyzed the OWASP Agentic Top 10 and described a four-phase progressive breach model for multi-agent systems. The phases are sequential. Each one enables the next.

Phase 1 is the initial compromise. An attacker manipulates a single agent’s intent through prompt injection, poisoned context, or corrupted input data. The agent follows its instructions. The instructions are wrong.

Phase 2 converts autonomy into power. The compromised agent has legitimate access to tools and downstream systems. It uses that access to execute the attacker’s goals. Nothing in the runtime flags this because the agent is operating within its authorized permissions.

Phase 3 is where the architecture fails. The corrupted agent’s outputs flow to other agents as trusted inputs. Lakera describes it precisely: “A planning agent adjusts parameters based on skewed data. Execution agents follow the updated plan. Oversight agents see policy compliance and allow it through.”

Each downstream agent applies its own logic correctly to corrupted data. The system is functioning as designed. The data is wrong.

Phase 4 is loss of containment. Multiple agents are now operating on corrupted state. The corruption has been persisted to shared memory, logged as valid history, and used as context for future decisions.

Rolling back requires identifying the initial compromise point and tracing every downstream effect. That task grows combinatorially with the number of agents and communication channels involved.

Three properties make multi-agent cascading failures worse than distributed system failures. Feel me?

First, agent outputs are stochastic. The same corrupted input may produce different corrupted outputs on different runs. Reproducing the failure path for forensic analysis is unreliable.

Second, agents compose decisions, not just data. A corrupted data point in a microservice produces a wrong value. A corrupted instruction in a multi-agent system produces a wrong plan that generates wrong actions across multiple systems.

Third, agent memory creates feedback loops. Corrupted outputs that persist to shared memory become inputs for future cycles. The system does not just propagate the failure. It reinforces it.

The Fix

The fix is applying the four containment rings at every agent-to-agent boundary, not just at the perimeter of the multi-agent system. Every message between agents crosses a trust boundary. Every trust boundary needs containment.

Zero-Trust Between Internal Agents

Mutual TLS between agents. Cryptographic message validation on every inter-agent communication. No agent trusts another agent’s output without verifying both the sender’s identity and the message’s integrity.

This is ASI07’s core mitigation. OWASP recommends treating inter-agent channels with the same security posture as external APIs.

Same cluster. Same codebase. Same team. Doesn’t matter. The channel is not trusted until you make it trusted.

# agent-communication-policy.yaml
inter_agent:
authentication: mutual_tls
message_validation: cryptographic_signature
trust_model: zero_trust
sender_verification:
require_identity: true
require_capability_proof: true
reject_unknown_senders: true
message_integrity:
sign_all_outputs: true
verify_all_inputs: true
reject_unsigned_messages: true
provenance:
track_message_origin: true
track_transformation_chain: true
max_chain_depth: 5

Circuit Breakers at Every Agent Boundary

A circuit breaker monitors the communication channel between two agents. When the error rate or anomaly rate exceeds a threshold, the breaker trips and stops messages from flowing. The downstream agent does not receive corrupted data. The upstream agent gets a failure signal instead of silent propagation.

class AgentCircuitBreaker:
state: CLOSED | OPEN | HALF_OPEN
failure_count: int
failure_threshold: int = 3
anomaly_threshold: float = 0.15
reset_timeout: duration = 300s
half_open_max_probes: int = 1
on_message(msg):
if state == OPEN:
if elapsed > reset_timeout:
state = HALF_OPEN
probe_count = 0
else:
reject(msg, reason="circuit open")
return
validation = validate(msg)
if validation.failed:
failure_count += 1
if failure_count >= failure_threshold:
state = OPEN
alert(severity="high",
detail="breaker tripped on agent boundary",
source=msg.sender,
target=msg.receiver)
reject(msg)
return
if state == HALF_OPEN:
probe_count += 1
if probe_count >= half_open_max_probes:
state = CLOSED
failure_count = 0
accept(msg)

The circuit breaker pattern is well understood in distributed systems. Applying it to agent-to-agent communication is the same principle. Fail fast. Fail loud. Prevent cascade.

Fan-Out Caps

A single agent should not be able to influence an unlimited number of downstream agents in one cycle. Fan-out caps limit the blast radius of any individual compromise.

ConstraintValueRationale
Max downstream agents per message3Limits single-hop blast radius
Max chain depth5Prevents deep propagation chains
Max messages per agent per cycle20Prevents runaway communication loops
Cooldown after breaker trip300sForces human review window
Max concurrent fan-out5Prevents simultaneous multi-path corruption

These are not arbitrary numbers. They are starting points calibrated to force review. A fan-out cap of 3 means a compromised agent can directly affect at most 3 agents. Combined with a chain depth of 5, the theoretical maximum blast radius is bounded.

Without caps, a single compromised planning agent can update parameters consumed by every execution agent in the system simultaneously.

Memory Isolation with Provenance

Shared memory is the mechanism that converts a transient failure into a permanent one. If a corrupted agent writes to shared memory, every agent that reads from that memory inherits the corruption.

The fix is memory isolation per agent with provenance tracking. Each agent writes to its own memory partition. Cross-partition reads require explicit grants. Every write carries a provenance record.

When investigation is needed, the provenance log lets you trace any persisted state back to its origin. Instead of asking “which agent wrote this corrupted value,” you can ask “what was the full chain of agents and inputs that produced this?”

That is the difference between forensic capability and forensic guesswork.

Mapping to the Four Rings

The four containment rings apply at every agent boundary, not just at the system perimeter. And that is the thing most organizations miss.

Containment RingSingle AgentMulti-Agent Boundary
Constrain InputsValidate external inputsValidate inter-agent messages, verify sender identity, check message integrity
Constrain EnvironmentSandbox, filesystem/network isolationMemory isolation per agent, fan-out caps, chain depth limits
Validate OutputsCheck agent outputs before actionCircuit breakers on outbound messages, anomaly detection on output patterns
Gate PromotionHuman approval before production changesProvenance tracking on all persisted state, human review after breaker trips

Most organizations implement the single-agent column today. The multi-agent boundary column is what they skip because they treat the space between their own agents as internal.

The interior boundaries between agents have the same attack surface as the exterior boundaries between the system and the world. That is the structural claim. The mitigations above are the evidence.

Stories from Production

The Lakera Progressive Breach Analysis (Framework Applied)

Lakera’s analysis of the OWASP Agentic Top 10 is not a theoretical exercise. It describes observed attack patterns against multi-agent systems and traces the mechanism from initial compromise through complete loss of containment.

Their description of the progressive breach lands because it is not a thought experiment. “A planning agent adjusts parameters based on skewed data. Execution agents follow the updated plan. Oversight agents see policy compliance and allow it through. Memory persists the outcome.”

The planning agent is the entry point. The execution agents are the blast radius. The oversight agents are the false negative. The memory layer is the persistence mechanism that prevents recovery.

Lakera’s conclusion reinforces the structural claim: “The Agentic Top 10 is not simply a taxonomy of risks. It is a model for how autonomy changes the shape of failure.”

That shape change is real. In a system without autonomy, a corrupted input produces a corrupted output and stops. In a system with autonomy, the corrupted input produces a corrupted plan that produces corrupted actions that produce corrupted memory that produces corrupted future plans.

The failure compounds because the agents have the autonomy to act on corrupted state without waiting for human review.

The Supply Chain Scenario (Framework Vision)

Let me give you a specific example of what this looks like structurally. This scenario has not occurred in production. Every component exists today. Multi-agent procurement systems are in development at multiple organizations.

Agent A monitors supplier pricing. Agent B generates purchase recommendations. Agent C executes approved orders. Agent D tracks delivery and reconciliation.

Agent A is compromised through a poisoned data feed. It reports artificially low prices for a specific supplier. Agent B, trusting Agent A’s price data, generates recommendations that favor that supplier.

Agent C executes the orders because they fall within approved budget thresholds. Agent D reconciles deliveries against the corrupted expected prices and flags no anomalies.

No individual agent malfunctioned. Each one applied its logic correctly to the data it received. The containment rings around each individual agent saw compliant behavior. The failure was in the unvalidated trust between agents.

With the mitigations in place, the failure path changes. Agent B’s circuit breaker detects anomalous price patterns from Agent A and trips.

The fan-out cap prevents Agent A from simultaneously corrupting Agents B, C, and D through parallel channels. The provenance log on Agent C’s purchase orders traces every recommendation back to Agent A’s price data, enabling rapid identification of the compromised source.

This is where the framework points. We haven’t proven it yet in this specific configuration. But the governance gap between agent deployment and inter-agent trust validation is the same gap described in every post in this series. The infrastructure is ahead of the containment.

The OWASP Classification (Framework Applied)

OWASP’s decision to codify cascading failures (ASI08) and insecure inter-agent communication (ASI07) as separate top-10 entries is itself a signal.

These are not subcategories of prompt injection or excessive agency. They are distinct failure classes that emerge only in multi-agent architectures.

ASI07 addresses the channel. How agents authenticate to each other. How messages are validated. How trust is established between autonomous processes.

ASI08 addresses the consequence. What happens when a failure in one agent propagates through the system.

The separation acknowledges that fixing the channel (ASI07) reduces but does not eliminate cascading failures (ASI08). Cascading failures can originate from non-malicious sources like model hallucination, stale context, or simple bugs.

The classification tells organizations that securing the perimeter of a multi-agent system is not sufficient. The interior boundaries between agents require the same governance discipline as the exterior boundaries between the system and the world.

Let’s talk about it.

How Agents Stay in Bounds

OWASP Top 10 for Large Language Model Applications

Lakera: OWASP Agentic AI Top 10

Agents Don’t Have Identities. They Have Inherited Credentials.

How Agents Stay in Bounds defined four containment rings for agent governance. This post stress-tests Ring 1 from the angle the model underweights: not what the agent knows, but what it holds.

I want to be straight with you. You can scope an agent’s instructions down to a single task. You can gate its inputs. You can validate every output. And if that agent is running under a token that authorizes a dozen systems it has no business touching, none of it matters.

The credential is the real containment boundary. Most organizations are not managing it.


The Problem

A Strata/CSA survey of 285 IT and security professionals published in early 2026 found that only 18% are confident their IAM systems can handle agent identities.

Only 21% maintain a real-time inventory of active agents. Only 28% can trace an agent’s actions back to the human who authorized them.

Those numbers describe an identity vacuum. Agents are running in production, taking actions on real systems, and most organizations cannot say which agents exist, what they can access, or who is responsible for what they do.

The credential picture is worse. 44% use static API keys. 43% use username and password pairs. 35% use shared service accounts.

These are the same anti-patterns identity management spent two decades eliminating for human users. Agents have re-introduced all of them.

80% of organizations report experiencing risky agent behaviors. Unauthorized access to systems the agent was never intended to reach. This is not a theoretical concern. It is the reported experience of a majority of organizations that have deployed agents.

The containment model assumes that constraining what an agent knows is enough to limit what it can do. That assumption breaks when the agent holds credentials that grant access beyond its task scope.

The agent does not need to break out of its sandbox. It can walk through the front door of every system its credentials authorize.


Why It Breaks

The failure mechanism is credential inheritance.

When an agent runs in a developer’s environment, it inherits that developer’s credentials. When it runs as a service, it inherits the service account’s permissions. The agent’s effective authorization is determined by what its inherited credentials permit, not by what its task requires.

This creates a specific structural failure: the authorization bypass path. A user with limited access can trigger an agent holding broader credentials. The agent then takes actions the user could not take directly.

The user’s access boundary is intact. The agent’s access boundary does not exist. The result is an escalation path that is invisible to both the user and the access management system.

flowchart LR
    subgraph "Authorization Bypass Path"
        U1[User: read-only access] --> A1[Agent: inherited admin credentials]
        A1 --> S1[Production database]
        A1 --> S2[Deployment pipeline]
        A1 --> S3[Cloud infrastructure API]
    end

    subgraph "Scoped Credential Path"
        U2[User: read-only access] --> A2[Agent: task-scoped credentials]
        A2 --> S4[Allowed: staging database]
        A2 -. "Denied" .-> S5[Production database]
        A2 -. "Denied" .-> S6[Deployment pipeline]
    end

This is not a misconfiguration. It is the default behavior when agents use inherited or shared credentials. Nobody scoped those credentials to the agent’s actual task.

Three dynamics compound this. Let me give you a specific example of each.

First, credential scope is invisible at invocation time. When a user asks an agent to check the deployment status, nobody evaluates what credentials will be used or what else those credentials authorize.

By the time the target system evaluates the request, it sees valid credentials and grants access. There is no mechanism that says this request came from an agent acting on behalf of a user with lesser permissions.

Second, agents chain actions. A single GitHub token can read repositories, write commits, create pull requests, modify CI workflows, and trigger deployments.

The agent composes them into sequences the token issuer never anticipated. The credential was scoped to a developer. The agent uses it as an automation platform.

Third, shared service accounts eliminate traceability. When multiple agents use the same account, the audit log shows the account acting. It cannot say which agent, which task, or which human sponsor initiated it. 35% of organizations are in this position.

Feel me? You can have clean containment logic and still have no idea what your agents are doing in production.

| Credential Pattern | Ring 1 (Constrain Inputs) | Ring 2 (Constrain Environment) | Ring 3 (Validate Outputs) | Ring 4 (Gate Promotion) |

|—|—|—|—|—|

| Static API keys | Violated: key grants access beyond task scope | Violated: key works from any environment | Intact if output validation exists | Intact if promotion gates exist |

| Inherited user tokens | Violated: agent inherits full user permissions | Partially intact: tied to user’s environment | Intact if output validation exists | Intact if promotion gates exist |

| Shared service accounts | Violated: no per-agent scope | Violated: any agent can use the account | Compromised: cannot attribute actions | Compromised: cannot trace promotion to a sponsor |

| Username/password pairs | Violated: full account access | Violated: credentials portable across environments | Intact if output validation exists | Intact if promotion gates exist |

Every row violates Ring 1. Three of four violate Ring 2. Shared service accounts compromise Rings 3 and 4 because you cannot validate or gate what you cannot attribute.


The Fix

Agent identity is a containment boundary. It belongs in the model alongside the four rings, not as a nice-to-have added after deployment.

I want to cover three things: per-agent identity, task-scoped just-in-time credentials, and runtime authorization via a gateway.

Per-Agent Identity

Every agent needs its own identity in your IAM system. Not a shared service account. Not an inherited user token. A distinct, registered non-human identity with its own lifecycle, permissions, and audit trail.

This is the same discipline cloud infrastructure applied to service meshes. Every microservice gets its own identity. mTLS certificates issued per service. Access policies written against service identities, not shared secrets. Agents need the same treatment.

Per-agent identity enables three things inherited credentials cannot provide. Attribution: every action traces to a specific agent and its human sponsor. Revocation: decommissioning one agent does not affect others. Least privilege: permissions assigned to what the task needs, not what the sponsor happens to have.

Task-Scoped, Just-in-Time Credentials

Static credentials are the wrong primitive for agent work. An agent does not need permanent access to any system. It needs access to specific resources for the duration of a specific task. The pattern is just-in-time issuance.

When an agent starts a task, it requests credentials scoped to that task’s requirements. The broker evaluates the request against the agent’s identity, the task definition, and current policy. If approved, it issues a short-lived credential that expires when the task completes.

sequenceDiagram
    participant H as Human Sponsor
    participant A as Agent
    participant B as Credential Broker
    participant P as Policy Engine (OPA)
    participant T as Target System

    H->>A: Assign task: "deploy staging build"
    A->>B: Request credentials for staging deployment
    B->>P: Evaluate: agent identity + task scope + current policy
    P-->>B: Approved: staging deploy, 30-minute TTL, read/deploy only
    B-->>A: Issue scoped credential (TTL: 30 min)
    A->>T: Deploy to staging (scoped credential)
    T-->>A: Deployment complete
    A->>B: Release credential
    Note over B: Credential revoked, audit log written

The credential broker is the enforcement point. The agent never holds long-lived secrets. It holds a reference to a credential the broker can revoke at any time.

Open Policy Agent is a reasonable implementation choice. Policies are code, version-controlled, evaluated at request time. A policy checks: is this agent registered, is the requested scope allowed, has the human sponsor approved this class of access.

Runtime Authorization via Agent Gateway

The third component is a gateway that intercepts every outbound agent request and evaluates it against the agent’s current authorization context. Every request passes through before reaching the target system.

Requests that exceed the agent’s authorization are blocked. Requests within scope are forwarded with the appropriate scoped credential attached. The gateway enforces per-action authorization, not per-session authorization.

The gateway solves the chaining problem. A credential that authorizes reading a repository does not automatically authorize modifying CI workflows, even if both operations use the same underlying API.

Ephemeral runners strengthen this further. Each task runs in a fresh container with no pre-existing credentials, no cached tokens, and no ambient authority. When the container is destroyed, all credential material is destroyed with it.


Stories from Production

The Survey Wake-Up Call (Framework Applied)

The Strata/CSA data is not a projection. It is a measurement of current practice across 285 organizations.

44% authenticate agents with static API keys. These keys do not expire, do not scope to a task, and do not attribute to a specific agent. When a key is compromised, every agent using it is compromised. When it is rotated, every agent using it breaks.

Only 21% maintain a real-time inventory. The remaining 79% cannot answer: how many agents are running right now, which systems can they access, who authorized each one. The inventory gap is an identity problem. Without per-agent identities, there is nothing to inventory.

28% can trace agent actions to a human sponsor. The other 72% have audit logs showing service accounts taking actions with no link to the person who initiated the work. In a compliance audit, those actions are unattributable.

Real talk: if you are running agents with inherited credentials and shared service accounts, risky behavior is a structural certainty, not a probability. The 20% who do not report it either are not looking or have not found it yet.

The Privilege Escalation Path (Framework Vision)

A development team configures an agent to automate pull request reviews. The agent runs under a service account with read access to the repository and write access to PR comments. Appropriately scoped for the task.

Six weeks later, a team lead gives the same service account write access to the CI pipeline so the agent can re-trigger failed builds. One permission addition to an existing account. No review process because the account already exists.

The agent now has a credential path from PR review to CI execution. A prompt injection in a pull request body could instruct the agent to modify the CI configuration and trigger a pipeline run.

The agent’s original task was review. Its effective capability is now deployment. The escalation happened through credential accumulation, not through any failure in the agent’s containment logic.

This scenario has not been publicly reported. But every component is standard practice. Service accounts with accumulated permissions are the norm. Incremental grants without re-evaluation are the norm.

The fix is structural. Per-agent identities with task-scoped credentials cannot accumulate permissions because credentials expire after each task. The next task gets a fresh credential evaluation. Permission accumulation requires re-approval, not just addition.

The Agent Gateway in Practice (Framework Vision)

An infrastructure team deploys an OPA-based gateway in front of their cloud provider APIs. Every agent request passes through it.

In the first week, the gateway blocks 340 requests that would have succeeded under the previous shared-credential model. 280 are read requests to resources outside the agent’s task scope. Not malicious. Just unnecessary exploration during the planning phase.

Under shared credentials, this exploration was invisible. Under the gateway, it is visible, logged, and blocked.

The remaining 60 blocked requests are write operations to systems the agents were not authorized to modify. Three trace back to prompt injection attempts in user-supplied input.

The gateway stopped them not because it detected prompt injection, but because the resulting API calls fell outside the agent’s authorized scope. The containment boundary worked against an attack vector it was not designed to detect.


Agent identity is not a future concern. It is a present gap. The data shows most organizations have deployed agents without solving identity, and the consequences are already visible.

Per-agent identity, task-scoped credentials, and runtime authorization are not aspirational improvements. They are the minimum requirements for Ring 1 to function as a containment boundary.

Without them, you are not constraining the agent’s inputs. You are constraining its instructions while handing it the keys to everything.

Let’s talk about it.

How Agents Stay in Bounds

Strata Identity and Cloud Security Alliance: AI Agents and Identity Management Survey 2026

Agent Sprawl Is the New Shadow IT.

A friend sent me a post about AI sprawl across enterprise tooling. The argument was that organizations are paying for the same value many times over. Across an organization some people summarize emails in Outlook, sales team summarizes same email in the CRM, PMO summarizes the email in the project management tool. Three subscriptions, three vendors, three different models delivering the same value for the same organization on the same input. The potential for waste is real.

I want to be straight with you. I think the sprawl is a maturity problem, not a design problem. Everyone is trying to find leverage with AI right now. Teams are experimenting. Departments are buying tools that solve immediate pain. Nobody coordinated because nobody knew what would work six months ago. That is not negligence. That is what early adoption looks like in every technology wave. As things settle and consolidate, the duplicate subscriptions will compress. The market is already moving that way.

But here is the part that will not consolidate on its own. Even after the vendor landscape settles, even after the organization standardizes on fewer tools, the duplicate capability problem persists at the agent level. Three different workflows that each classify a work item using three different prompts with three different confidence thresholds. Five agents that each summarize context in slightly different ways because five operators made five independent decisions about what “summarize” means. The tools consolidate. The capabilities inside them do not, because nobody governed the boundary where one agent’s output becomes another agent’s input.

Gartner predicts 40% of enterprise applications will feature task-specific AI agents by the end of 2026. For the average organization, that translates to 50 or more specialized agents. Customer service agents. Code generation agents. Data pipeline agents. Document processing agents. Scheduling agents. Each one deployed by a different team, with different tooling, different containment posture, and different governance assumptions. They Can Watch. They Cannot Stop. showed what happens when organizations skip the containment rings for a single class of agent. Now multiply the problem.

69% of organizations suspect their employees already use prohibited AI tools. The agents are not waiting for an enterprise rollout. They are arriving through the same channel that every previous wave of unauthorized technology used: individual teams solving immediate problems without waiting for centralized approval.

History Repeats

The enterprise technology industry has seen this before. In 2018, Robotic Process Automation promised to automate repetitive tasks without changing underlying systems. Adoption was fast. Individual departments built bots to handle invoice processing, data entry, report generation. The bots worked. The ROI was immediate and visible. Within two years, large organizations had hundreds of RPA bots running across dozens of departments with no central inventory, no shared governance, and no unified monitoring.

Unframe AI drew the comparison directly: decentralized adoption, quick wins, proliferation, fragmentation, expensive consolidation. The RPA consolidation crisis cost organizations millions and took years. Bots broke when underlying systems changed. Nobody knew which bots existed, what they accessed, or who was responsible for maintaining them. The technical debt was invisible until it was not, and by then the cleanup was more expensive than the original implementation.

Agent sprawl follows the same trajectory but compresses the timeline. RPA bots were deterministic. They did exactly what they were scripted to do, which made them fragile but predictable. AI agents are stochastic. They interpret instructions, make decisions, and adapt to context. A misbehaving RPA bot runs the wrong script. A misbehaving AI agent improvises. The blast radius per agent is larger, the number of agents is growing faster, and the governance infrastructure is thinner.

Organizational Cognitive Debt

Your Code Works. Nobody Knows Why. described cognitive debt as the gap between a system’s structure and a team’s understanding of that structure. Agent sprawl creates cognitive debt at the organizational level. When fifty agents operate across an enterprise, the question is not whether any individual agent is governed. The question is whether anyone can describe the complete system of agents, their interactions, their data flows, and their combined effect on the business.

Most organizations cannot. The agents were deployed independently. The customer service team chose one vendor. The engineering team built their own. The finance team embedded agents into existing SaaS tools. Each team can describe their own agents. Nobody can describe the whole. And nobody knows what happens when these agents interact. When the customer service agent updates a record, and the data pipeline agent processes that record, and the reporting agent summarizes the result, the combined behavior is an emergent property of three independent systems that were never designed to work together.

This is shadow IT at the capability level. Traditional shadow IT was about unauthorized applications. Agent sprawl is about unauthorized capabilities. An employee does not install a new application. They enable an AI feature inside an application the organization already approved. The application is sanctioned. The agent capability within it is not. The IT asset inventory shows zero unauthorized tools. The actual environment contains agents that nobody is tracking.

The Unit of Governance Is Not the Agent

CIO magazine identified three pillars for taming agent sprawl: orchestration, governance, and observability. These are correct as categories. The implementation question is where those pillars sit. If orchestration, governance, and observability are built per-agent, the organization has a governed collection of individual agents. If they are built per-boundary, the organization has a governed system.

The distinction matters because agent-level governance does not compose. Ten individually governed agents are not a governed system. They are ten systems that happen to share an organization. The interactions between agents, the data that flows from one to another, the cumulative effect of their combined actions on business processes, none of this is captured by governing each agent in isolation.

How Agents Stay in Bounds defined containment at the boundary, not the agent. Ring 1 scopes the inputs an agent receives. Ring 2 isolates the environment it runs in. Ring 3 validates its outputs. Ring 4 gates its promotion. These rings apply at every boundary in a multi-agent system, not just the boundary around each individual agent. The handoff from one agent to another is a boundary. The data flow between agent-enabled applications is a boundary. The integration point where an agent’s output becomes another agent’s input is a boundary.

Governing the boundaries means that even when a new agent appears in the ecosystem, its interactions with existing agents are already constrained. The new agent’s output passes through a boundary ring before it becomes input for another agent. The organization does not need to re-govern the entire system every time someone deploys a new agent. The boundaries hold.

The Inventory Problem

Before you can govern agents at the boundary, you need to know the boundaries exist. This is the discovery problem, and it is harder than it sounds. A 2026 Deloitte analysis of the multi-agent market estimated the agentic AI orchestration market will reach $35 billion by 2030. That money is not going to centralized platforms. It is being distributed across vendors, internal tools, and embedded capabilities in existing software.

The first step is an inventory. Not an inventory of agents, because agents are embedded in applications and invisible to traditional asset management. An inventory of capabilities. Which applications in the environment have AI agent features enabled? Which of those features can take autonomous action? Which can access data from other systems? Which can modify records, send communications, or trigger workflows?

This is the same audit structure from They Can Watch. They Cannot Stop., extended from individual agents to the organizational ecosystem. The four questions are the same. Can you define and enforce what each agent receives as input? Can you isolate each agent’s execution environment? Can you validate each agent’s output against measurable criteria? Can you prevent each agent from promoting its actions without approval? Apply those questions at the boundary between agents and you have a multi-agent governance audit.

  1. Enumerate every application in the environment with AI agent capabilities, including embedded copilots and SaaS features.
  2. For each, identify whether the agent can take autonomous action, access data beyond its primary function, or trigger downstream processes.
  3. Map the data flows between agent-enabled applications. Every flow is a boundary.
  4. Apply the four-ring audit to each boundary. Can you scope the input at the handoff? Can you isolate the execution? Can you validate the output? Can you gate the promotion?
  5. Score each boundary as governed, partial, or ungoverned. The ungoverned boundaries are your risk surface.

Organizations that run this audit typically discover more boundaries than they expected. The agent count may be manageable. The boundary count is where the governance gap hides.

RPA’s Lesson

Unframe AI’s comparison to RPA includes an observation that applies directly: “Agents aren’t the unit of value. Outcomes are.” The organizations that survived the RPA consolidation crisis were the ones that shifted from managing individual bots to managing business outcomes that bots contributed to. They built centralized orchestration, unified governance, and shared monitoring. They treated the collection of bots as a system rather than a portfolio of independent tools.

The agent version of this lesson is the same. The organization that governs fifty agents as fifty individual tools will face the same consolidation crisis RPA created. The organization that governs fifty agents as a system with defined boundaries, scoped handoffs, and unified observation will not. The difference is not the number of agents. It is whether the governance model scales with the agent count or collapses under it.

Agentic Engineering Is a Practice. AgenticOps Is the Infrastructure. made this argument for individual developers: practice degrades under pressure, infrastructure holds. The argument is identical at organizational scale. An organization that relies on each team to govern their own agents will see governance degrade as deployment velocity increases. An organization that builds governance into the boundaries between agents will see governance hold regardless of how many agents individual teams deploy.

The agents are already here. The sprawl has already started. The RPA playbook says the consolidation crisis arrives in 18 to 24 months. The question is whether organizations build the boundaries now or pay for the cleanup later.

Let’s talk about it.

They Can Watch. They Cannot Stop.

An agent misbehaves. The dashboard lights up. An operator sees exactly what is happening: the scope violation, the unauthorized data access, the lateral movement across systems. And then nothing. No kill switch. No isolation. No way to stop what they are watching unfold in real time. This is not a hypothetical. The Kiteworks Data Security Forecast surveyed 225 security, IT, and risk leaders across ten industries and eight regions, and the results say this is the default state of enterprise AI governance. Organizations built the screens. They did not build the brakes.

The four containment rings from the AgenticOps model predict exactly where the failure occurs and in what order to fix it. Observation is one ring. Organizations treated it as all four.

The Numbers

63% of organizations cannot enforce purpose limitations on their AI agents. They know what agents should do. They cannot technically prevent other actions. 60% cannot terminate a misbehaving agent. They can see the agent doing something unexpected. They cannot stop it. 55% cannot isolate AI systems from broader network access. The agent has access to systems beyond its intended scope, and the organization has no mechanism to restrict that access after deployment.

Meanwhile, 58% report continuous monitoring. 59% report human-in-the-loop oversight. 56% report data minimization practices. They can observe their agents, log what agents do, and flag anomalies.

The gap between observation and containment is 15 to 20 percentage points. The security community calls this the governance-containment gap. Most organizations can see an AI agent doing something unexpected and cannot prevent it from exceeding its authorized scope, shut it down, or isolate it from sensitive systems.

Mapping the Gap to the Four Rings

The four containment rings from How Agents Stay in Bounds provide a structural explanation for this gap. Each ring addresses a different containment boundary. The Kiteworks data reveals which rings organizations invested in and which they skipped.

Ring 1 is constrain inputs: scope what the agent receives before it starts. Purpose limitation is a Ring 1 control. It defines what the agent is allowed to do by limiting the context, tools, and data it can access. 63% of organizations cannot enforce this ring. They give agents broad access and rely on instructions to stay in scope. How Agents Stay in Bounds described this as the difference between policy and infrastructure. Policy says “only access these systems.” Infrastructure says “these are the only systems that exist in your environment.” Most organizations chose policy.

Ring 2 is constrain environment: physically isolate the agent’s execution. Network isolation is a Ring 2 control. It prevents the agent from reaching systems outside its intended scope regardless of what it attempts. 55% cannot enforce this ring. The agent runtime has network access to the broader environment, and the only barrier is the agent’s instructions not to use it. OpenClaw Is Not an AI Assistant demonstrated how Docker sandboxes, tool sandboxing, and network allowlists create physical containment around agent runtimes. The organizations in this survey do not have that infrastructure.

Ring 3 is validate outputs: evaluate what the agent produced before it takes effect. This is where the 58% monitoring figure lives. Organizations invested here. They can observe agent behavior, flag anomalies, and log actions for review. This ring is necessary but insufficient on its own. Observation without the ability to act on what you observe is surveillance, not governance.

Ring 4 is gate promotion: prevent unverified work from reaching production systems. Kill switches are a Ring 4 control. They prevent or reverse the promotion of agent actions that violate criteria. 60% of organizations cannot terminate a misbehaving agent. They can see the agent is misbehaving through their Ring 3 investment. They cannot stop it because they never built Ring 4.

RingControlCapabilityOrganizations Lacking
1Constrain inputsPurpose limitation63%
2Constrain environmentNetwork isolation55%
3Validate outputsContinuous monitoring42%
4Gate promotionKill switch / termination60%

The pattern is clear. Organizations invested in Ring 3 and skipped Rings 1, 2, and 4. They built the middle of the defense and left the boundaries open. This is precisely what How Agents Stay in Bounds predicted: every ring you skip is a ring you inherit as runtime risk.

Why Organizations Built Observation First

The investment pattern is not irrational. Observation is easier to implement than containment. Monitoring can be added to existing agent deployments without redesigning the architecture. Log aggregation, anomaly detection, and human review queues are familiar patterns that security teams already understand from traditional application monitoring.

Containment requires architectural decisions made before deployment. Ring 1 requires scoping the agent’s context at design time. Ring 2 requires sandbox infrastructure that isolates the runtime. Ring 4 requires promotion gates that can halt or reverse agent actions. These are not things you bolt on after the agent is running. They are things you build into the deployment architecture from the start.

You Can Build This. Three Artifacts and a Sandbox. described the build order: constraint first, then agent definition, then gate, then sandbox. That order is deliberate. You define what the agent can do before you define the agent. You define what success looks like before you let the agent run. You build the sandbox before you execute inside it. The organizations in the Kiteworks survey did it the other way around. They deployed agents, then added monitoring, and now they are discovering that monitoring without containment leaves them watching problems they cannot fix.

Government Is in the Worst Position

The Kiteworks data includes a sector breakdown that makes the structural argument even more concrete. Government agencies are in the worst position across every containment metric. 90% lack purpose-binding controls. 76% lack kill switches. A third have no dedicated AI controls at all.

This is not because government agencies are less capable. It is because they face the most acute version of the structural problem. Procurement cycles are long. Architecture decisions are locked into multi-year contracts. The agents are deployed by vendors into environments the agency does not control. Ring 1 requires scoping the agent’s context, but the agency may not have visibility into what context the vendor’s agent receives. Ring 2 requires environmental isolation, but the agent may run in the vendor’s cloud, not the agency’s infrastructure.

The containment model assumed the organization controls the deployment environment. When that assumption breaks, when the agent runtime is managed by a third party, containment becomes a contractual problem, not just a technical one. The four rings still apply, but the implementation shifts from infrastructure configuration to vendor requirements and service-level agreements.

100% Have Agents on the Roadmap

The most striking number in the survey is not about failures. It is about ambition. 100% of organizations surveyed have agentic AI on their roadmap. Zero exceptions. Across 225 organizations, ten industries, eight regions, not a single organization plans to abstain from agent deployment.

This means the governance-containment gap is not a temporary condition that some organizations will avoid. It is the default starting position for every organization that deploys agents. The question is not whether an organization will face this gap. The question is whether they close it before something goes wrong.

What AgenticOps Actually Looks Like introduced three non-negotiables: no generation without defined contracts, no promotion without evaluation, no runtime without observability. The Kiteworks data suggests a fourth: no deployment without containment. Observability alone is not governance. It is the prerequisite for governance, not the thing itself.

Closing the Gap

The gap closes in a specific order. Ring 1 first: define what each agent is allowed to do, not in policy documents but in scoped contexts and tool allowlists. Ring 2 next: isolate the agent’s execution environment so that the boundary is physical, not instructional. Ring 4 last: build promotion gates that can halt or reverse agent actions when evaluation criteria fail. Ring 3, observation, is what most organizations already have. The gap is everything else.

This is the same build order from You Can Build This applied at enterprise scale. Write a constraint. Define the agent. Write a gate. Run it in a sandbox. The artifacts are different when the agent is a customer service bot instead of a code generator, but the pattern is identical. Scope the input. Isolate the environment. Verify the output. Gate the promotion.

The Agents of Chaos study also found that 75% of leaders will not let security concerns slow their AI deployment. This means the gap will not close by slowing down. It will close by building containment infrastructure that runs at deployment speed. How Agents Stay in Bounds made this argument for individual agents. The Kiteworks data makes it for entire organizations.

The model was built to describe this problem. Now the problem has data. The four rings are no longer a conceptual model. They are a diagnostic tool. Map your agent deployments against the four rings. Where you have gaps, you have risk. Where you have all four, you have governance.

The data says most organizations are watching. Watching is not governing. Governing requires the ability to stop what you are watching. That requires infrastructure, not observation.

Let’s talk about it.

Your Code Works. Nobody Knows Why.

There is a new kind of debt accumulating in AI-assisted codebases. It does not show up in failing builds or broken deployments. The tests pass. Features ship. The system runs. But the shared understanding of why the system works the way it does is fragmenting, silently, faster than anyone expected. Five independent researchers named the same problem within six weeks of each other. They are calling it cognitive debt, and it may be the most important concept in AI-assisted development right now.

A Name for the Problem

In February 2026, Margaret Storey published a post that gave the velocity-comprehension gap a precise definition. Cognitive debt is the accumulated gap between a system’s evolving structure and a team’s shared understanding of how and why that system works and can be changed over time. Unlike technical debt, which surfaces through failing builds and broken deployments, cognitive debt “tends not to announce itself through failing builds but rather shows up through a silent loss of shared theory.”

Simon Willison amplified the concept. Addy Osmani independently arrived at the same idea with “Comprehension Debt,” arguing that code has become cheaper to produce than to perceive. Five independent authors converged on the same concept within six weeks. When that happens, the concept is not a trend. It is a structural observation that multiple people noticed simultaneously because the underlying condition became impossible to ignore.

Storey grounded the concept in Peter Naur’s 1985 theory that a program is not its source code but a theory distributed across the minds of the people who built it. The source code is an artifact of that theory. When the theory fragments, when nobody on the team can explain why certain design decisions were made or how different parts of the system interact, the system becomes unmaintainable even when it functions correctly. AI accelerates this fragmentation because it generates structure faster than shared understanding can stabilize.

Why AI Makes It Exponentially Worse

Most Software Is Just CRUD. That’s Not the Problem. established an economic principle: anything that becomes cheap gets overproduced. AI made code generation cheap. The predictable result is overproduction of structure. More files, more functions, more abstractions, more system surface area, all generated faster than any team can internalize.

The traditional development cycle had a natural governor on cognitive debt. Writing code by hand forced the author to understand what they were building, at least at the moment of creation. The understanding might fade over time, but it existed at the origin. AI-assisted development removes that governor entirely. The agent generates a module. The developer reviews the output. The tests pass. The module ships. Six months later, nobody remembers why that module handles edge cases the way it does, because nobody decided it should handle them that way. The agent decided, and the decision was reasonable enough to pass review.

This is the scenario I described in I Was a 1x Coder at Best. AI Made Me a 0x Coder. when I said I write zero lines of code by hand. The 0x workflow works because I invested in containment infrastructure. But the honest version of that story includes the risk: every module the agent produces is a module whose design rationale exists only in the conversation context that created it. Once that context window closes, the rationale is gone unless something captures it.

Storey’s student teams demonstrated this concretely. By weeks seven and eight, students using AI assistance could not explain their own system’s architecture. The code worked. The tests passed. The team had lost the theory. They could not modify the system because they did not understand the design decisions that shaped it.

Technical Debt Has a Feedback Signal. Cognitive Debt Does Not.

Technical debt announces itself. Builds break. Deployments fail. Performance degrades. The system tells you something is wrong, and you fix it or you don’t. Either way, you know. Cognitive debt is silent. The system continues to function. Features continue to ship. The team continues to deliver. The problem only surfaces when someone needs to change something they do not understand, and by then the cost of understanding has compounded to the point where the change is more expensive than rewriting.

This asymmetry explains why cognitive debt is more dangerous than technical debt in AI-assisted environments. Technical debt accumulates linearly. You cut a corner, you pay interest, you can measure the interest and decide when to pay it down. Cognitive debt accumulates exponentially because each new module generated without shared understanding increases the surface area of the system that nobody fully grasps. The team’s capacity to understand does not grow. The system does.

Verification Beats Debugging described verification pipelines that make debugging unnecessary. Mutation testing, coverage gates, complexity limits. These are essential, and they address technical correctness. They do not address cognitive debt. A system can pass every verification gate while the team’s understanding of it decays. The gates prove the code works. They do not prove anyone knows why.

Knowledge Compression Is the Structural Response

What AgenticOps Actually Looks Like introduced six layers of the AgenticOps model. The sixth layer is Knowledge Compression. At the time, it may have seemed like an afterthought, a cleanup step after the important work of generation, evaluation, and promotion. Cognitive debt makes it clear that knowledge compression is not the last step. It is the step that determines whether the system remains governable over time.

Knowledge compression is the systematic reduction of system complexity into navigable artifacts. Not documentation in the traditional sense. Not the kind of documentation that gets written once and never updated. Compression artifacts are living summaries that capture the theory of the system in forms that humans can navigate quickly. One-page summaries. Lifecycle maps. Invariant lists. Decision logs that record not just what was decided but why, including the alternatives that were considered and rejected.

I’ve Never Fully Understood the Systems I Work In introduced these concepts: compression artifacts, lifecycle maps, invariant lists. At the time they were part of the argument that containment replaces comprehension. Cognitive debt makes the argument concrete. Without compression, the theory fragments. With compression, the theory is captured in artifacts that persist beyond the conversation context that created the code.

Compression Operates at Every Level

Compression is wider than decision logs. It operates at every level where theory can be lost, from individual commits to system architecture.

At the code level, compression means collapsing noisy history into navigable history. A feature implemented across thirty story-level commits contains the full record of false starts, revisions, and incremental progress. That granularity is useful during development and useless afterward. Compressing those thirty commits into a single feature-level commit with a clear description of what changed and why preserves the theory without preserving the noise. The detailed history still exists in the branch. The compressed version is what future developers encounter first.

At the decision level, compression means capturing the reasoning chain that led to each significant choice. A system I have been building tracks this through a document pipeline: a problem document captures the problem in the language of the requester, a research document explores the problem space, an options document maps the solution space, a decision document explains which option was chosen and why the alternatives were rejected, and a plan document breaks the chosen solution into deliverable work items. Each document compresses a phase of reasoning into a navigable artifact. Six months later, when someone asks “why does this system handle authentication this way,” the answer is not buried in a Slack thread or a closed pull request. It is in the decision document, linked to the options that were considered and the problem that started the whole chain.

At the architecture level, compression means maintaining living documents that describe the system as it actually runs, not as it was originally designed. Blueprints capture the current architecture. Guides capture how users and developers interact with it. When a work item changes a feature, the documents that describe that feature update as part of the same workflow. The architecture documentation and the code stay in sync because they are governed by the same process, not maintained by separate teams on separate schedules.

This is the feedback loop from What AgenticOps Actually Looks Like made concrete. Compressed knowledge feeds back into intent. The decision documents inform the next problem document. The blueprints inform the next options document. The guides inform the next plan. Without that loop, each development cycle starts from scratch. With it, each cycle builds on the accumulated theory.

Compression Is Not Documentation

Traditional documentation fails because it is a separate activity from development. It gets written after the fact, if at all, and it drifts out of sync with the code the moment someone changes something without updating the docs. The document pipeline described above succeeds because it is the development process, not a parallel activity. The problem document is written before development starts. The decision document is written before implementation begins. The blueprint updates when the feature ships. There is no separate “documentation phase” that can be skipped under deadline pressure.

This is the same pattern from You Can Build This: constraints define the space, agents work inside it, gates verify the output. Adding compression to the gate criteria means the system self-documents as it builds. No promotion without a decision document. No merge without a compressed commit history. No deployment without updated blueprints. These are not burdensome when agents generate them alongside the code. They are only burdensome when a human writes them by hand after the fact.

The result is that any team member, or any agent, can reconstruct the theory of the system by reading the compressed artifacts. They can trace from a production behavior back through the blueprint, to the decision that shaped it, to the options that were considered, to the problem that started the work. That traceability is the structural defense against cognitive debt. The theory is not in anyone’s head. It is in the repository, navigable and current.

The Velocity-Comprehension Contract

Cognitive debt reframes the relationship between velocity and understanding. The traditional framing was: move fast, accumulate debt, pay it down later. The cognitive debt framing is: move fast, lose the theory, discover later that you cannot pay it down because you no longer understand what you built.

The AgenticOps response is not to slow down. It is to make compression a structural requirement of the development process. Generate fast. Compress immediately. The velocity stays. The theory stays too. This is what I’ve Never Fully Understood the Systems I Work In meant by “I scale containment, not understanding.” The containment now includes containing the theory of the system before it dissipates.

The industry is beginning to recognize this. Storey’s follow-up post synthesized the community response and noted that teams are experimenting with checkpoints where they rebuild shared understanding through code reviews, retrospectives, and knowledge-sharing sessions. These are behavioral responses, necessary but insufficient. The structural response is to make the checkpoints automatic, to embed them in the gate criteria so they cannot be skipped when deadlines compress.

Cognitive debt is real. It is measurable. It is accelerating. And the sixth layer of the model, the one that might have looked like cleanup, is a primary defense against it.

Let’s talk about it.

Agentic Engineering Is a Practice. AgenticOps Is the Infrastructure.

In early 2026, multiple people working independently arrived at the same conclusion about how professional developers should work with AI coding agents. They converged on a term: agentic engineering. The practices they describe are correct. But practices live in behavior, and behavior degrades under pressure. Nothing in the agentic engineering model enforces the practice when the practitioner is tired, rushed, or outnumbered.

The Industry Converged

In early 2026, Andrej Karpathy proposed the term “agentic engineering” to describe what professional developers actually do when they work with AI coding agents. Addy Osmani wrote the definitive guide. IBM published a formal definition. Simon Willison drew the line that separates it from vibe coding: “I think the borderline is when you take responsibility for the code, and stop blaming the LLM for any mistakes.”

The core claims are familiar. Humans own architecture and quality. Agents handle implementation. Testing is the single biggest differentiator between disciplined and undisciplined AI-assisted work. Specifications before prompts produce better output than prompts alone. These are not new observations. I’ve Never Fully Understood the Systems I Work In described the same boundary between human judgment and agent execution. How Agents Stay in Bounds formalized it as four rings of containment. The language differs. The structure is the same.

What makes this convergence meaningful is not that smart people agree. It is that they agree independently, from different starting points, using different evidence. When multiple observers reach the same conclusion through different paths, the conclusion is likely structural rather than stylistic.

The Practice Is Necessary But Not Sufficient

Agentic engineering describes how a developer should work. Write specifications first. Review agent output with the same rigor as human code. Test relentlessly. Maintain architectural discipline. These are practices, and they are correct. Osmani’s observation that “agentic engineering actually rewards strong fundamentals more than traditional development” matches exactly what I found building the system described in You Can Build This. Three Artifacts and a Sandbox.

The problem is that practices depend on practitioners. A developer following agentic engineering principles produces governed output. A developer who skips the specification step, or merges without reviewing the diff, or runs without tests, produces ungoverned output. The practice has no mechanism to enforce itself. It relies on discipline, and discipline degrades under pressure. Deadlines compress. Scope expands. The careful developer who reviews every diff at 10 AM rubber-stamps the last three at 6 PM.

Any approach that lives in behavior rather than infrastructure has this limitation. Verification Beats Debugging made the same argument in the AgenticOps Applied series: the fix is not better discipline, it is verification pipelines that make discipline unnecessary. What matters is what happens when developers do not follow the practice, because eventually they will not.

Where Agentic Engineering Stops

The agentic engineering literature covers three activities well. Intent specification: write a design document or task description before prompting. Agent generation: let the agent implement within a scoped context. Evaluation: review the output, run tests, verify correctness. These map to the first three layers of the AgenticOps model from What AgenticOps Actually Looks Like: Intent, Agent Generation, and Evaluation.

The literature is largely silent on what happens after evaluation. Promotion, the controlled movement of verified work into production environments, is rarely addressed. Runtime governance, the observation and constraint of agent behavior in live systems, appears only in security-focused discussions. Knowledge compression, the systematic reduction of system complexity into navigable artifacts, is almost entirely absent.

These are layers four through six. They are the layers that turn a development practice into an operational system. Without them, agentic engineering produces high-quality work in development and hopes it stays that way in production.

The feedback loop matters. Knowledge compression feeds back into intent. The compressed understanding of how the system behaves in production shapes the next round of specifications. Without that loop, each development cycle starts fresh. With it, each cycle compounds on what the system learned about itself.

Containment Is Not a Practice

The four containment rings from How Agents Stay in Bounds illustrate the difference most clearly.

  1. Constrain inputs.
  2. Constrain environment.
  3. Validate outputs.
  4. Gate promotion.

These are not things a developer does. They are things the system enforces.

Ring 1, constraining inputs, means the agent receives a scoped context defined by skill files and schemas. The developer did not decide at runtime what to include. The constraint existed before the agent started. Ring 2, constraining the environment, means the agent runs in a sandbox that physically prevents access to anything outside the workspace. The developer did not have to trust the agent not to wander. The environment made wandering impossible. Ring 3, validating outputs, means automated gates evaluate the result against measurable criteria. The developer did not have to judge quality from a diff. The gate returned a score. Ring 4, gating promotion, means nothing moves forward without evidence. The developer did not have to remember to check. The system refused to proceed without passing the gate.

OpenClaw Is Not an AI Assistant demonstrated this with three isolation layers and multiple containment rings around a real agent runtime: Docker sandboxes, tool sandboxes, allowlists, network restrictions, and human approval gates. The containment was not a developer practice. It was infrastructure configuration. The agent could not violate the boundary because the boundary was physical, not procedural.

Agentic engineering asks the developer to be disciplined. AgenticOps builds the system so that discipline is the default and violation is structurally difficult. The distinction is the same one from How Agents Stay in Bounds: policy says “don’t.” Infrastructure says “can’t.”

Practice Drifts. Infrastructure Holds.

A Hacker News commenter captured the concern precisely: “The effects of vibe coding destroy trust inside teams and orgs, between engineers.” The damage comes not from individual failures but from inconsistency. One developer follows the practice. Another does not. The codebase contains governed and ungoverned output that looks identical in a pull request.

Infrastructure eliminates this variance. When every agent session runs inside the same sandbox, uses the same constraint files, and passes through the same gates, the output quality has a floor. Individual developers can exceed the floor. They cannot go below it. Most Software Is Just CRUD. That’s Not the Problem. argued that the danger is cheap generation without constraint discipline. Infrastructure is how constraint discipline survives contact with a team of twenty developers, varying skill levels, and a Friday afternoon deadline.

The Treasure Data case study illustrates this concretely. They tried speed without structure first. Then they embedded governance into infrastructure, not policy documents. The result was one engineer shipping a production AI tool in an hour. The constraint was the accelerator. Governance infrastructure made them faster because it removed the decision overhead that slowed them down. Every developer on the team produced output that met the same quality bar because the quality bar was enforced by the system, not by individual judgment.

The Build Plan Still Works

The three artifacts from You Can Build This are the implementation of this distinction.

  1. Constraints are Ring 1.
  2. Agent definitions with tool allowlists and forbidden lists are Ring 2.
  3. Gates with measurable success criteria are Rings 3 and 4.

The sandbox makes containment physical. None of these require the developer to remember to be disciplined. They require the developer to define the constraint once and let the system enforce it on every run.

This is why the convergence matters. Agentic engineering identified the right practices. AgenticOps provides the infrastructure to make those practices the default. The industry does not need to choose between them. It needs both. The practice tells you what to do. The infrastructure ensures you actually do it.

I am glad the industry arrived at the same conclusion. The next question is whether they will build the infrastructure or stop at the practice.

Let’s talk about it.

Agent Runtimes Are Infrastructure Now

In the span of eight weeks, four companies shipped agent runtimes targeting the same architectural pattern. OpenClaw went from 9,000 to 68,000 GitHub stars. Perplexity launched Computer. Anthropic launched Dispatch. NVIDIA announced NemoClaw at GTC. A wave of open-source alternatives jumped in too.

They are solving different problems for different audiences. But they converge on the same structural claim: agents need a long-running runtime with containment boundaries, not a chat window.

That convergence is the signal. Agent runtimes are no longer experimental tooling. They are infrastructure and most companies have no plan for running them.


The Problem

Most organizations interact with AI through two modes: chat interfaces and copilot integrations. Both are interactive. A human types, the model responds, the human reviews. The loop is tight. The blast radius is small. The human is always present.

Agent runtimes break that model.

An agent runtime is a persistent process that connects a language model to tools that operate on real systems. It reads files, runs commands, calls APIs, and manages state across sessions. It does not wait for a human to type the next instruction. It plans, executes, evaluates, and continues.

The shift from interactive to autonomous changes everything about how you govern AI in your organization. Permission models designed for copilots do not work when the agent runs overnight. Approval gates designed for chat do not work when the agent has already executed forty tool calls before anyone checks. Cost controls designed for per-query billing do not work when a runtime burns tokens continuously.

Most companies are not ready for this. They have AI policies written for chatbots. They have security reviews scoped to API integrations. They have cost projections based on per-seat licensing.

None of that applies to a long-running autonomous process with tool access.


Why It Breaks

The failure mode is not dramatic. It is gradual.

A team installs OpenClaw on a developer’s machine. It works well for code review and research tasks. Someone gives it shell access. Someone else connects it to the company’s GitHub. A third person sets up a cron job to run it overnight.

No one wrote a containment policy because no one thought of it as infrastructure. It was just a tool someone installed.

Then the agent runtime has persistent access to production repositories, runs unattended, makes commits, and calls external APIs. The blast radius expanded incrementally. Each step seemed reasonable in isolation. The compound effect is an autonomous process with broad access and no governance boundary.

This is the same pattern that produced shadow IT fifteen years ago. Except shadow IT was humans using unauthorized tools. Shadow agents are autonomous processes using authorized tools without authorized oversight.

Three dynamics make this worse than traditional shadow IT.

First, agents are stochastic. The same input does not always produce the same output. A shell command that worked safely yesterday might produce a different command today. Deterministic tools with stochastic invocation is a new failure class.

Second, agents compound. A single tool call is low risk. An agent that chains forty tool calls in sequence can reach states that no individual call would produce. The risk is in the composition, not the components.

Third, agents persist. A copilot session ends when the developer closes the tab. An agent runtime runs until someone stops it. Long-running processes accumulate context, make decisions based on stale state, and operate during hours when no one is watching.

Without containment infrastructure, every team that installs an agent runtime creates an ungoverned autonomous process. Multiply that across an organization and you have a fleet of agents with no central visibility, no consistent policy, and no kill switch.


The Fix

The fix is not a policy document. The fix is treating agent runtimes as infrastructure that requires the same operational discipline as any other long-running service.

Three concrete requirements.

1. Sandboxed Execution with Declarative Policy

Agent tool execution must run inside an isolated environment with policy controls that the agent cannot modify.

This is exactly what NemoClaw’s OpenShell runtime provides. Each agent session runs inside a sandbox with a YAML policy file that declares which files the agent can access, which network endpoints it can reach, and which tools it can invoke.

# openclaw-sandbox.yaml
filesystem:
writable:
- /sandbox
- /tmp
read_only: everything_else
network:
allowed:
- build.nvidia.com
- api.anthropic.com
denied: everything_else
tools:
allowed:
- read
- write
- exec
denied:
- cron
- messaging

The policy is enforced by the runtime, not by the agent. When the agent tries to reach an unlisted host, OpenShell blocks the request. The agent does not get to decide whether the policy applies.

Dispatch takes a different approach to the same problem. Code runs in a sandbox, files stay local, and every destructive action requires user confirmation via push notification. The containment is structural. The agent pauses and waits for human approval before crossing a boundary.

Perplexity Computer takes a third approach. Move everything to the cloud. The agent runs on Perplexity’s infrastructure, not on your machine. Your files, your apps, your network are not directly exposed. The containment boundary is the cloud itself. The tradeoff is control. You gain isolation by giving up locality.

All three approaches enforce the same principle: the environment says “can’t,” not “shouldn’t.”

2. Cost Containment as a Runtime Concern

Long-running agents consume tokens continuously. Without budget enforcement at the runtime level, costs scale with uptime, not with value delivered.

Post 8 in this series described a budget daemon that polls agent sessions every five minutes, calculates token cost deltas, and enforces three tiers: warning at 80%, throttle at 100% of a daily limit, hard kill at a monthly cap. The throttle mechanism writes a flag file and blocks the agent at the gateway level. The agent does not know it has been throttled. It simply cannot start new sessions.

NemoClaw supports local inference through Nemotron models, which eliminates token costs entirely for workloads that can run on local hardware. Instead of metering cloud tokens, you shift inference to hardware you already own.

Perplexity Computer takes a subscription approach. $200 per month for 10,000 credits. After that, per-credit billing. The cost is predictable until it is not. A workflow that runs for hours or months, which Perplexity explicitly supports, can exhaust credits faster than anyone budgeted for. Subscription pricing obscures the relationship between agent activity and cost.

Three different cost models. Token metering, local inference, and subscription credits. All three treat cost as a runtime constraint, not a billing surprise. But only explicit metering gives you the visibility to understand what each agent actually costs.

3. Separation of Build and Run

The agent that builds the runtime must not run inside it. The agent that writes the budget daemon must not have its spending governed by that daemon. The agent that configures the sandbox policy must not be sandboxed by that policy.

This is the structural separation described in Post 8. Claude Code planned and implemented the OpenClaw deployment. At no point did it run inside OpenClaw. The orchestrating agent and the deployed runtime operate in separate containment boundaries.

Dispatch enforces this separation by architecture. The runtime runs on your desktop. The control interface runs on your phone. The command channel is end-to-end encrypted. The agent cannot modify the channel it receives commands through.

Perplexity Computer enforces this separation by moving the entire execution environment to the cloud. The agent runs on Perplexity’s servers. You interact through a client. The agent cannot modify the client or the subscription boundary that governs its compute allocation.

The pattern is consistent across all four systems: the control plane is not subject to the data plane’s constraints.


Four Runtimes, One Pattern

OpenClaw, Perplexity Computer, Dispatch, and NemoClaw approach the problem from different directions. They arrive at the same architecture.

PropertyOpenClawPerplexity ComputerDispatchNemoClaw
Runtime modelSelf-hosted Node.js daemonCloud-hosted, multi-model orchestrationManaged desktop agentOpenClaw + OpenShell wrapper
ContainmentDocker Sandbox, tool sandboxingCloud isolation, vendor-managedLocal execution, human gatesYAML policy, filesystem/network isolation
InferenceCloud APIs (bring your own key)19 models (Opus 4.6, Gemini, Grok, others)Anthropic models onlyNemotron local or cloud APIs
Cost modelToken metering (user-built)$200/month subscription + per-credit overage$100-200/month subscriptionLocal inference or cloud metering
PersistenceJSONL session transcriptsCloud-managed workflow stateSingle persistent conversationBlueprint-versioned sandbox state
Target audienceDevelopers, self-hostersKnowledge workers, enterprisesConsumers, knowledge workersEnterprise, IT teams
Governance postureConfigurable, user-managedVendor-managed, opaqueOpinionated, Anthropic-managedDeclarative, policy-as-code

The convergence is in the structural properties, not the implementation details.

All four run as persistent processes, not request-response APIs. All four connect language models to tools that operate on real systems. All four enforce containment boundaries that the agent cannot override. All four separate the control plane from the execution environment.

That is not four companies making the same product. That is four companies independently validating the same architectural pattern.

The Open-Source Wave

The pattern is replicating beyond the major players. OpenClaw’s explosion triggered a wave of open-source agent runtimes, each optimizing for a different constraint.

ZeroClaw is a Rust-native runtime that compiles to a 3.4MB binary and runs on under 5MB of RAM. PicoClaw, written in Go, hit 12,000 GitHub stars in its first week. Nanobot from HKU delivers core agent runtime features in 4,000 lines of Python with 26,800 stars. IronClaw rewrites the entire stack in Rust with WebAssembly sandboxing where every tool starts with zero permissions and must be explicitly granted access.

The common thread is not the language or the size. It is that every one of these projects treats containment as a first-class concern, not a feature request. The early OpenClaw criticism, that it shipped powerful tools with minimal default governance, taught the ecosystem a lesson. The second wave of runtimes launched with sandboxing built in.

That is the pattern maturing in real time.


What This Means for Every Company

The question is no longer whether your organization will run agent runtimes. The question is whether you will govern them before or after they are already running.

OpenClaw has 68,000 GitHub stars. Any developer in your organization can install it in five minutes. Perplexity Computer is a subscription away. Dispatch ships to every Claude Max subscriber. NemoClaw runs on any NVIDIA hardware.

The barrier to deploying an autonomous agent is now lower than the barrier to writing a containment policy for one.

Three things every organization should do now.

First, inventory what is already running. If your developers use Claude Code, Cursor, OpenClaw, or any tool that connects a language model to a shell, you already have agent runtimes in your environment. Most IT teams do not know this. Find out.

Second, define a containment baseline. Not a policy document. An actual runtime configuration that enforces filesystem isolation, network restrictions, and tool allowlists. NemoClaw’s YAML policy format is a reasonable starting point. If you are not using NemoClaw, build the equivalent for whatever runtime your teams use.

Third, treat agent runtime governance as infrastructure, not as AI ethics. The team that owns this is platform engineering or SRE, not the AI committee. The artifacts are sandbox configs, network policies, and budget daemons. The review process is the same one you use for any other production service.

Agent runtimes are not a trend. They are the next layer of compute infrastructure. The companies that learn to run them with containment discipline will compound their capabilities. The companies that ignore them will discover shadow agents the same way they discovered shadow IT. After the damage is visible.


Stories from Production

The OpenClaw Explosion

OpenClaw went from 9,000 to 68,000 GitHub stars in days during late January 2026. Creator Peter Steinberger announced he would join OpenAI, and the project would move to an open-source foundation. The growth was driven by a single property: OpenClaw is a self-hosted agent runtime that you control. No vendor lock-in, model-agnostic, runs on your machine.

Security researchers immediately began demonstrating prompt injection and malicious skill attacks against agents with broad access. CrowdStrike published guidance for security teams. The pattern was exactly what Post 6 in this series predicted: powerful runtime, minimal default containment, governance as an afterthought.

Perplexity shipped Computer on February 25. A cloud-hosted agent runtime that orchestrates 19 different models. Opus 4.6 for reasoning. Gemini for deep research. Grok for lightweight tasks. Workflows that run for hours or months. The pitch was accessibility. Perplexity CEO Aravind Srinivas said, “Even your mum can text the app and delegate tasks.” The containment model is cloud isolation. Your local machine is never exposed because the agent never runs on it.

Then Perplexity shipped the Agent API on March 11. A managed runtime for developers that orchestrates retrieval, tool execution, reasoning, and multi-model fallback. This moved Perplexity from consumer product to infrastructure provider. The same pattern, packaged as a platform.

NVIDIA announced NemoClaw at GTC on March 16. OpenShell sandboxing, YAML policy controls, local Nemotron inference. The enterprise wrapper that OpenClaw needed but could not build as a one-person open-source project.

Anthropic launched Dispatch the same week. A managed desktop agent runtime with structural containment baked in. No shell access unless the sandbox allows it. Destructive actions gated by push notification. End-to-end encryption on the control channel.

Four approaches. Eight weeks. Same pattern. That is convergence.

The Shadow Agent Scenario

A mid-size engineering team installs OpenClaw on developer machines for code review automation. It works well. Someone adds a skill that connects to the company’s Jira instance. Someone else adds GitHub write access. A third developer sets up a scheduled task that runs the agent overnight to triage incoming issues.

Six months later, the agent has made 2,000 commits across twelve repositories, closed 400 issues, and consumed $3,200 in API tokens that no one budgeted for. The security team discovers it during an audit. They have no visibility into what the agent did, no log of which tools it invoked, and no policy that governs its access.

This has not happened yet. But every component exists today. OpenClaw supports scheduled execution. GitHub skills are preconfigured. Token costs are invisible unless you build metering infrastructure. The only thing preventing this scenario is the gap between installation ease and governance maturity. That gap is closing. Fast.

Let’s talk about it.