Every AI agent needs a blast radius: the case for microVM isolation

Where an agent runs decides what it can damage when prompt injection lands. A walk through the four levels of isolation in common use, and why microVMs have become the production standard.

Levain Labs 5 min read
A row of reinforced vault doors, representing hardware-enforced isolation boundaries around agent execution

When you deploy an AI agent, you’re shipping a process that reads inputs from sources you don’t control, decides what to do based on what it reads, and takes action on your systems. If an attacker can influence what the agent reads, they can influence what it does. The question of where the agent physically runs determines how much damage it can do when something goes wrong.

This post is about that question: where agents execute, the levels of isolation in common use, and why microVMs have become the production standard.


A new kind of security problem

Traditional software keeps a clean line between code (trusted, written by your team) and data (untrusted, coming from users). AI agents erase that line. An agent reads a document, an email, or a web page, then acts on what it found. OWASP’s 2025 Top 10 for LLM applications puts prompt injection (tricking an agent through its inputs) at number one.

The exploits are real. Security researchers at Trail of Bits demonstrated remote code execution against three popular AI agent platforms in October 2025. Palo Alto’s Unit 42 reported the first real-world attack against a live AI ad-review system in March 2026. OpenAI has publicly acknowledged that prompt injection may never be fully solved.

Every agent execution should be treated as potentially adversarial. The blast radius when things go wrong is determined by where the agent runs.


Four levels of isolation

In-process execution. The agent runs inside your application, with access to whatever your app can reach. Fine for prototypes, dangerous in production.

Standard containers (Docker and friends). The agent gets its own filesystem and network, but shares the host operating system with every other container. A bug in that shared layer lets malicious code reach the host or a neighboring container, and runtime vulnerabilities of this kind are disclosed on an ongoing basis.

gVisor and similar user-space approaches. A middle-ground technology that intercepts what the agent tries to do before it reaches the host. Better than containers, but still shares more than you might want.

microVMs. Each agent execution runs in its own miniature virtual machine, with its own operating system, separated from the host by hardware. Breaking out requires defeating the hardware boundary, which is dramatically harder than escaping a container. This is what AWS uses under Lambda.

Level 1
Host
Shared kernel
App process
Agent
In-process
No isolation from your app
Level 2
Host
Shared kernel
Container
Agent
Container
Namespace separation, shared OS
Level 3
Host
Shared kernel
User-space sandbox
Agent
gVisor
Syscalls intercepted in software
Level 4
Host
Hardware boundary
Guest OS
Agent
microVM
Isolation enforced by the CPU
Each level adds a stronger boundary around agent execution. Only microVMs replace the shared kernel with a hardware-enforced one.

Why containers fall short

Containers work well for the problem they were designed for: isolating cooperating services run by a single team. They were not built to contain adversarial code, and the agent threat model is exactly that.

The core issue is the shared operating system. Every container on a host runs on the same OS instance, which means a bug in that layer can let code inside a container reach the host or its neighbors. Linux kernel CVEs that allow privilege escalation from inside a container are disclosed on an ongoing basis; the shared kernel is the reason they matter in a container threat model.

AI agents make this worse in two specific ways. They generate code at runtime that nobody has reviewed, and they accept instructions from external content you don’t control. The combination is effectively an unreviewed code execution service running on your infrastructure, which is the exact threat model containers were not designed for.


The microVM approach

A microVM treats each agent execution as a hostile workload and gives it its own operating system, memory, and virtual hardware. The separation is enforced by the CPU, not by software conventions. An attacker has to break the hardware boundary to affect anything outside their VM.

In a microVM, isolation is enforced by hardware. In a container, it’s enforced by software conventions.

The performance picture has changed enough over the last two years that microVM overhead is no longer a meaningful tradeoff. Agent executions start in about 150 milliseconds inside a microVM, which is noise compared to the LLM call the agent is about to make. There’s no longer a reason to accept weaker isolation in exchange for speed.


How Levain Labs runs agents

Every agent execution on our platform runs in its own microVM, built on AWS Firecracker (the same technology behind AWS Lambda):

  • One microVM per agent run. No reuse across customers or agent versions. When a run finishes, the VM is destroyed.
  • Dedicated operating system per VM. No shared layer between tenants.
  • Network restricted to explicit allowlists. Agents reach the services they’re configured for, and nothing else. This contains the common pattern of an agent being tricked into sending data somewhere it shouldn’t.
  • Ephemeral storage. Anything the agent writes during a run disappears with the VM.
  • Hard limits on time, CPU, and memory. Enforced at the hardware level, so a runaway agent can’t drain shared resources.

This is “Harnessed Intelligence” in practice. Capability without boundaries is a liability, and boundaries enforced at the hardware level are the most durable kind.

— The Levain Labs team

Sources

  1. OWASP LLM01:2025 Prompt Injection

    OWASP Gen AI Security Project, 2025. Ranks prompt injection as the top risk in the 2025 LLM Top 10.

  2. Trail of Bits: Prompt injection to RCE in AI agents

    October 2025. Demonstrates remote code execution against three popular AI agent platforms.

  3. Palo Alto Unit 42: Fooling AI Agents — Web-Based Indirect Prompt Injection Observed in the Wild

    March 2026. First reported real-world attack of this class in a live AI system.

  4. TechCrunch: OpenAI says AI browsers may always be vulnerable to prompt injection attacks

    December 2025. TechCrunch reporting on OpenAI's public acknowledgement.

  5. AWS Firecracker

    The reference design for production microVM isolation.