AI Agents Can Now Code Safely Without Risking Your Systems

March 23, 2026 · 3 min read

Artificial intelligence agents are becoming increasingly capable of writing and running code, which allows them to analyze data, call APIs, and build applications from scratch. This capability makes agents significantly more useful for practical tasks, but it also introduces serious security risks when these AI systems can execute arbitrary code without proper isolation from your infrastructure. The central challenge has been how to give AI agents the power to code while protecting systems from the destructive or malicious actions that unpredictable agent-generated code might attempt.

Researchers have developed a solution called LangSmith Sandboxes, which are secure, scalable environments specifically designed for running untrusted code generated by AI agents. These sandboxes provide ephemeral, locked-down environments where agents can execute code safely, with administrators maintaining control over what resources the code can access and consume. This approach addresses the fundamental problem that traditional containers, designed for known and vetted application code, aren't suitable for the unpredictable nature of agent-generated code that might attempt anything from legitimate operations to malicious commands.

The methodology involves creating isolated environments that can be spun up quickly with a single line of code using the LangSmith SDK. Developers simply add their API key, pull in the SDK, and can immediately deploy these secure sandboxes. The system handles the complex security requirements automatically, including spinning up containers, locking down network access, piping output back to agents, and tearing everything down when execution is complete. This eliminates the need for developers to build secure code execution systems themselves, which typically requires managing containers, implementing resource limits, and handling security configurations manually.
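The announcement describes the lifecycle (spin up, execute, pipe output back, tear down) but does not document the SDK's API, so the sketch below is a hypothetical local stand-in rather than LangSmith code: a minimal Python sandbox loop that runs untrusted code in an ephemeral working directory and isolated subprocess, returns the output, and destroys the environment afterward. The function name and all details here are illustrative assumptions, not part of the LangSmith SDK.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(code: str, timeout_s: int = 5) -> tuple[int, str, str]:
    """Execute untrusted code in a short-lived subprocess.

    A stand-in for a real sandbox: a production system would also
    lock down the filesystem and network, not just wall-clock time.
    """
    with tempfile.TemporaryDirectory() as workdir:  # ephemeral scratch space
        script = Path(workdir) / "agent_code.py"
        script.write_text(code)
        proc = subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated mode, no user site dirs
            cwd=workdir,                          # confine relative paths to the temp dir
            capture_output=True,
            text=True,
            timeout=timeout_s,                    # tear down runaway executions
        )
    # Leaving the `with` block deletes the temp dir: the "sandbox" is gone.
    return proc.returncode, proc.stdout, proc.stderr


rc, out, err = run_in_sandbox("print(sum(range(10)))")
print(rc, out.strip())  # → 0 45
```

The key property mirrored here is ephemerality: nothing the agent-generated code writes survives the call, and a timeout guarantees the environment is always torn down.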

These sandboxes integrate directly with existing LangSmith infrastructure, meaning developers already using the Python or JavaScript client for tracing or deployment can implement sandboxes without adding new tools or frameworks. The system also integrates with LangSmith Deployment, allowing sandboxes to be attached directly to agent threads for seamless operation. Native integrations exist with LangChain's Deep Agents open source framework and with projects like Open SWE, demonstrating practical applications where these secure environments have already been tested internally before being made available to developers.

Resource management represents a critical component of this approach, as agents running code can rapidly consume CPU, memory, and disk resources if left unconstrained. The sandbox system implements automatic resource limits to prevent runaway consumption, addressing a problem that compounds as more AI agents become coding agents capable of generating and executing their own code. This controlled environment allows organizations to safely deploy AI coding agents like Cursor, Claude Code, and OpenClaw without worrying about system stability or security breaches.
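To illustrate what automatic resource limits look like at the operating-system level (the release does not describe LangSmith's actual mechanism, so this is an assumed sketch using the Unix `resource` module), the example below caps a child process's CPU time and memory before agent code runs in it:

```python
import resource
import subprocess
import sys


def _limit_resources():
    """Runs in the child before exec: cap CPU seconds and address space."""
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))             # at most 2s of CPU time
    resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20,) * 2)  # ~512 MiB of memory


def run_limited(code: str) -> int:
    """Execute a snippet under the limits above; return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=_limit_resources,  # Unix-only pre-exec hook
        capture_output=True,
        timeout=10,                   # wall-clock safety net on top of the CPU cap
    )
    return proc.returncode


# A well-behaved snippet exits normally (status 0)...
print(run_limited("print('ok')"))
# ...while a busy loop is killed by the kernel once its CPU allowance
# is exhausted, yielding a nonzero (signal) exit status.
print(run_limited("while True: pass"))
```

This is the "runaway consumption" problem in miniature: without the `RLIMIT_CPU` cap, the infinite loop would burn CPU until the wall-clock timeout; with it, the kernel terminates the process after two seconds regardless of what the code does.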

Applications of this technology extend across multiple domains where AI agents need to interact with code. Researchers can use these sandboxes to safely test agent-generated code without risking their primary systems, while businesses can deploy AI coding assistants that can actually execute their suggestions in secure environments. The technology enables new types of applications where AI agents can build software, analyze sensitive data, or interact with external APIs while maintaining complete isolation from core infrastructure.

Current limitations include the system's availability only in Private Preview, meaning access is restricted while the technology undergoes further development and testing. The researchers acknowledge they are actively exploring additional features beyond what's available today, though specific details about these future developments aren't provided in the current release. As with any new security technology, real-world testing will be necessary to validate the sandboxes' effectiveness against sophisticated attacks and edge cases that may emerge as more developers run AI coding agents in production environments.