Code Execution
Code execution lets an agent run JavaScript or TypeScript source on demand inside an isolated sandbox. The sandbox has no global access to its host — every capability (fs, fetch, console, custom business logic) is explicitly bridged in through imports and globals. Results flow back via the export selected by options.execute or through bridged “report” callbacks.
This page is the caller-facing contract: how to invoke runCode, what options it accepts, what shape the result has, and the limitations agent authors need to know when writing code that will run inside the sandbox. For the runtime-side mechanics (isolate setup, enforcement, implementation approaches), see Runtime §13 Code Execution Runtime.
1. What You Get
When you call state.runCode(source, options):
- The source runs as an ES module in a fresh sandbox with no host capabilities.
- Anything you pass through options.imports is resolvable as a bare-specifier import inside the sandbox.
- Anything you pass through options.globals is resolvable as a free identifier inside the sandbox.
- options.execute chooses which module export to run and which arguments to pass. It defaults to { fn: 'default', args: [] }.
- The call returns a CodeExecution handle: awaitable, terminable, with a live reports view.
- The spec does not mandate a wall-clock timeout; callers impose their own by calling handle.terminate().
Code execution is not a Node.js runtime, a web runtime, or a Worker runtime. It is a plain ECMAScript evaluator. The surface is identical across host platforms.
2. API Surface
Code execution is invoked through ThreadState.runCode(), which returns a CodeExecution handle. The handle is awaitable and exposes a terminate() method for caller-initiated cancellation.
2.1 Signature
runCode(
source: string,
options?: CodeExecutionOptions,
): CodeExecution;
| Parameter | Type | Description |
|---|---|---|
| source | string | JavaScript or TypeScript source text. Treated as an ECMAScript module. |
| options | CodeExecutionOptions | Export selection, imports, globals, memory cap, and language hints. |
2.2 CodeExecution Handle
interface CodeExecution extends PromiseLike<CodeExecutionResult> {
/** Stop the run. Idempotent. Settles the handle with `status: 'terminated'`. */
terminate(reason?: string): void;
/** `true` until the handle settles. */
readonly running: boolean;
/** Snapshot of values emitted via the `report` bridge so far, in call order. */
readonly reports: readonly unknown[];
}
- The handle is a PromiseLike<CodeExecutionResult>; await run yields the final result.
- terminate(reason?) stops the run. Subsequent calls are no-ops. When reason is provided, it surfaces in result.error.message.
- running flips to false once the handle settles.
- reports is a live, append-only view of values emitted via the report bridge. The same values appear in CodeExecutionResult.reports when the run ends.
Callers that need a time budget layer one on themselves:
const run = state.runCode(source, options);
const budget = setTimeout(() => run.terminate('30s budget'), 30_000);
const result = await run;
clearTimeout(budget);
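A reusable form of the same pattern can be sketched as below. The names withBudget and TerminableRun are hypothetical and not part of the API; the interface mirrors only the two members of the CodeExecution handle that the helper needs. The try/finally is a defensive choice so the timer is cleared however the await ends.

```typescript
// Structural subset of the CodeExecution handle from §2.2 (assumed shape).
interface TerminableRun<T> extends PromiseLike<T> {
  terminate(reason?: string): void;
}

// Hypothetical helper: awaits the run and terminates it once `ms` elapses.
async function withBudget<T>(run: TerminableRun<T>, ms: number): Promise<T> {
  const timer = setTimeout(() => run.terminate(`${ms}ms budget`), ms);
  try {
    return await run;
  } finally {
    clearTimeout(timer); // never leave the timer armed after settlement
  }
}
```

Under these assumptions a caller would write something like withBudget(state.runCode(source, options), 30_000) and inspect result.status as usual.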
2.3 Example
const run = state.runCode(
`
import { readFile } from 'fs';
import { report } from 'supervisor';
const message = await getMessage();
if (/username/.test(message)) {
report({ topic: 'username', message });
}
export async function scan() {
return { scanned: true };
}
`,
{
execute: { fn: 'scan', args: [] },
imports: {
fs: {
readFile: async (path: string) => state.readFile(path),
},
supervisor: {
report: (payload: unknown) => {
// host-side sink
},
},
},
globals: {
console: { log: (...args: unknown[]) => {} },
getMessage: async () => 'latest user message',
},
},
);
const result = await run;
if (result.status === 'success') {
// result.result is the resolved scan() return value: { scanned: true }
}
3. CodeExecutionOptions
| Option | Type | Default | Description |
|---|---|---|---|
| execute | { fn?: string; args?: unknown[] } | { fn: 'default', args: [] } | Export to execute and arguments to pass if the export is a function. Use fn: 'default' for the default export. |
| imports | Record<string, Record<string, unknown>> | {} | Map of module specifier → named exports exposed to the sandbox. Resolves bare-specifier imports only. |
| modules | Record<string, string> | {} | Additional relative ES modules available to the sandbox, keyed by relative specifier such as './helpers.js'. |
| globals | Record<string, unknown> | {} | Map of identifier → value installed on the sandbox’s module scope. |
| language | 'javascript' \| 'typescript' | 'typescript' | Source language. TypeScript source is stripped of types before evaluation; no type checking is performed. |
| memoryLimitBytes | number | runtime-defined | Maximum sandbox heap in bytes. Exceeding this settles the handle with status: 'memory'. |
| filename | string | '<runCode>' | Label used in stack traces and error sources. |
| report | (value: unknown) => void | none | Optional host-side sink invoked for every value emitted through the built-in report bridge. Collected values are also exposed on the handle’s live reports array and in the final CodeExecutionResult.reports. |
Unknown keys on options are rejected.
Deadlines are not an option. The caller implements their own budget by calling handle.terminate() (§2.2).
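Because unknown keys are rejected, a caller that assembles options dynamically may want to check them first. The sketch below is a hypothetical pre-flight check, not part of the API: the key list is copied from the table above, and how the runtime itself reports a rejected key is not specified here.

```typescript
// Option names taken from the table in §3.
const KNOWN_OPTION_KEYS = [
  'execute', 'imports', 'modules', 'globals',
  'language', 'memoryLimitBytes', 'filename', 'report',
] as const;

// Hypothetical pre-flight check: returns the keys runCode would not recognise.
function unknownOptionKeys(options: Record<string, unknown>): string[] {
  const known = new Set<string>(KNOWN_OPTION_KEYS);
  return Object.keys(options).filter((key) => !known.has(key));
}

// A deadline-style key is not an option, so it is flagged.
const stray = unknownOptionKeys({ globals: {}, timeoutMs: 1000 });
// stray → ['timeoutMs']
```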
3.1 Execute
execute selects the module export that becomes CodeExecutionResult.result.
const run = state.runCode(
`
export function increment(n: number): number {
return n + 1;
}
export default function fallback() {
return 123;
}
`,
{
execute: { fn: 'increment', args: [100] },
},
);
Rules:
- When execute is omitted, the runtime behaves as if { fn: 'default', args: [] } was supplied.
- fn names the export to execute. Use 'default' for the default export.
- If the selected export is a function, the runtime calls it with args inside the sandbox and awaits any returned thenable.
- If the selected export is not a function, args MUST be omitted or empty and the export value itself becomes the result.
- Missing selected exports fail the run with status: 'link_error'.
- args use the same marshaling rules as other sandbox boundary values.
3.2 Imports
imports maps a bare specifier to an object of named exports.
imports: {
fs: {
readFile: async (path) => { /* ... */ },
writeFile: async (path, data) => { /* ... */ },
},
}
Inside the sandbox:
import { readFile, writeFile } from 'fs';
import * as fs from 'fs'; // namespace import of the same object
The default export of a bridged module is the value at key default:
imports: { greeter: { default: (name) => `hi, ${name}` } }
// inside: import greet from 'greeter';
Rules:
- Only bare specifiers are resolvable from options.imports ('fs', 'supervisor', '@scope/pkg'). Relative specifiers resolve from options.modules; URL ('https://…') specifiers fail at module link time.
- Specifiers are not resolved from any host package resolver, node_modules, CDN, or virtual module registry; they resolve only from options.imports or options.modules.
- Importing an unknown specifier fails linkage.
- Named-export mismatches (import { x } where x is missing) fail linkage.
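The rules above partition specifiers into three kinds. The sketch below shows one way to picture that partition; classifySpecifier and resolutionSource are hypothetical names, and the classification itself (a ./ or ../ prefix means relative, a scheme prefix means URL, anything else is bare) is an assumption rather than the runtime's documented algorithm.

```typescript
type SpecifierKind = 'bare' | 'relative' | 'url';

// Assumed classification: './' or '../' prefix means relative, a 'scheme:'
// prefix means URL, and everything else is treated as bare.
function classifySpecifier(specifier: string): SpecifierKind {
  if (specifier.startsWith('./') || specifier.startsWith('../')) return 'relative';
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(specifier)) return 'url';
  return 'bare';
}

// Which option, if any, could satisfy the import under the rules above.
function resolutionSource(specifier: string): 'imports' | 'modules' | 'link_error' {
  const kind = classifySpecifier(specifier);
  if (kind === 'bare') return 'imports';
  if (kind === 'relative') return 'modules';
  return 'link_error'; // URL specifiers never link
}
```

A specifier that maps to imports or modules still fails linkage when the corresponding key is absent from that option.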
3.3 Modules
modules provides local relative ES modules to the sandbox.
const run = state.runCode(
`
import { add } from './math.js';
export const result = add(1, 2);
`,
{
execute: { fn: 'result' },
modules: {
'./math.js': 'export const add = (a, b) => a + b;',
},
},
);
Rules:
- Keys MUST be relative specifiers from the module graph root, such as './helpers.js' or './lib/math.js'.
- Entry source imports resolve from filename when one is supplied.
- Entry and module source MAY import sibling or parent modules with ./ and ../ specifiers when those imports resolve within the supplied module graph.
- Values are JavaScript or TypeScript module source and are evaluated inside the same sandbox.
- modules are not host filesystem reads and do not grant package, URL, CDN, or node_modules access.
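To illustrate how ./ and ../ imports stay inside the supplied graph, here is a minimal sketch of relative resolution. resolveModuleKey is a hypothetical function, and it assumes the importing module is itself identified by a root-relative key such as './lib/math.js'; the runtime's actual resolver may differ.

```typescript
// Hypothetical resolver: maps (referrer key, relative specifier) to the key
// that would be looked up in options.modules.
function resolveModuleKey(referrer: string, specifier: string): string {
  if (!specifier.startsWith('./') && !specifier.startsWith('../')) {
    throw new Error(`not a relative specifier: ${specifier}`);
  }
  // Directory segments of the referrer: drop the leading '.' and the file name.
  const segments = referrer.split('/').slice(1, -1);
  for (const part of specifier.split('/')) {
    if (part === '.' || part === '') continue;
    if (part === '..') {
      if (segments.length === 0) {
        throw new Error(`escapes the module graph root: ${specifier}`);
      }
      segments.pop();
    } else {
      segments.push(part);
    }
  }
  return './' + segments.join('/');
}

const fromNested = resolveModuleKey('./lib/math.js', '../helpers.js');
// fromNested → './helpers.js'
```

In this sketch an import that climbs above the root throws, which corresponds to a link failure rather than a host filesystem read.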
3.4 Globals
globals installs identifiers at module scope. They are not installed on globalThis.
globals: {
console: { log: (...args) => {} },
getMessage: async () => '...',
}
Inside the sandbox:
console.log('hi'); // works
const m = await getMessage(); // works
globalThis.console; // undefined — globals are not on globalThis
This distinction underpins the limitation in §6.1: sandbox code cannot enumerate bridged capabilities via Object.keys(globalThis).
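One way to picture the difference between a scope binding and a globalThis property is the host-side analogy below. It is not how the runtime installs globals, and it runs on the host rather than in the sandbox; runWithScope is a hypothetical name. Each key becomes a function parameter, so the body can reference it as a free identifier while globalThis stays untouched.

```typescript
// Analogy only: bind each key of `globals` as a parameter of a generated function.
function runWithScope(body: string, globals: Record<string, unknown>): unknown {
  const names = Object.keys(globals);
  const fn = new Function(...names, `"use strict"; ${body}`);
  return fn(...names.map((name) => globals[name]));
}

const out = runWithScope(
  'return [typeof getMessage, typeof globalThis.getMessage];',
  { getMessage: () => 'hi' },
) as string[];
// out → ['function', 'undefined']
```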
3.5 Bridged Value Marshaling
Values cross the sandbox boundary in both directions (host → sandbox via imports/globals, sandbox → host via call arguments and return values).
| Host value | Sandbox view | Notes |
|---|---|---|
| string, number, boolean, null, undefined, bigint | Same primitive | Cloned by value. |
| Plain object / array / Map / Set / Date / typed array | Structured clone | Deep copy; no shared reference. |
| Function | Callable proxy | Calls forward serialized args to host; return value (or resolved promise) marshaled back. |
| Promise | Promise | Resolves/rejects asynchronously inside sandbox. |
| ArrayBuffer / Uint8Array | Same binary view | Cloned. |
| Class instance, WeakMap, WeakRef, symbol with host identity | Not transferable | May throw a SerializationError. |
Bridged functions do not expose host this or host closure state to the sandbox beyond their explicit arguments.
Asynchronous bridges are supported: a bridged function may return a Promise, and the sandbox’s await unwraps it.
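The deep-copy behaviour for plain data can be observed on the host with the standard structuredClone function (available in Node.js 17 and later and in current browsers). This sketch assumes the runtime's marshaling of plain data matches the standard structured clone algorithm; the table above remains the authority for what the sandbox actually sees.

```typescript
const hostValue = { nested: { n: 1 }, tags: new Set(['a']), when: new Date(0) };

// Deep copy: the clone shares no references with the original.
const sandboxView = structuredClone(hostValue);
sandboxView.nested.n = 2;
sandboxView.tags.add('b');

// Mutating the clone leaves the host-side object as it was.
const hostUnchanged = hostValue.nested.n === 1 && hostValue.tags.size === 1;

// Functions are not cloneable, which is consistent with the table giving
// them a callable proxy instead of a copy.
let functionCloneFailed = false;
try {
  structuredClone(() => 'hi');
} catch {
  functionCloneFailed = true;
}
```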
4. CodeExecutionResult
interface CodeExecutionResult {
status: 'success' | 'error' | 'memory' | 'terminated' | 'link_error';
result?: unknown;
reports: unknown[];
logs: CodeExecutionLog[];
error?: CodeExecutionError;
durationMs: number;
memoryUsedBytes?: number;
}
| Field | Type | Description |
|---|---|---|
| status | enum | Outcome classification. See §4.1. |
| result | unknown | Resolved selected export, after optionally calling it with execute.args and awaiting top-level promises. Omitted on non-success outcomes. |
| reports | unknown[] | Values emitted through the report bridge, in call order. Empty array if unused. |
| logs | CodeExecutionLog[] | Captured console.* calls when the runtime installs a capturing console. Empty when the caller supplies its own. |
| error | CodeExecutionError | Present iff status is not 'success'. |
| durationMs | number | Wall-clock execution time in milliseconds, from runCode() invocation to handle settlement. |
| memoryUsedBytes | number | Peak sandbox heap in bytes when the engine exposes heap-usage measurement; omitted otherwise. |
4.1 Status Values
| Status | Meaning |
|---|---|
| success | Module evaluated; selected export resolved. |
| error | Uncaught runtime error inside the sandbox. |
| memory | memoryLimitBytes exceeded. |
| terminated | Caller called handle.terminate(), or the runtime’s own safety cap fired. error.message distinguishes the two (including any reason the caller passed). |
| link_error | Module graph failed to link (unknown specifier, missing named export, parse error). |
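Callers typically branch on status. The sketch below handles every value in the table and uses a never assignment so the compiler flags any status added later; summarize and ResultLike are hypothetical names, and ResultLike is only the subset of CodeExecutionResult this example needs.

```typescript
type Status = 'success' | 'error' | 'memory' | 'terminated' | 'link_error';

// Subset of CodeExecutionResult (§4) needed for this sketch.
interface ResultLike {
  status: Status;
  result?: unknown;
  error?: { name: string; message: string };
}

// Hypothetical helper: turns a settled result into a one-line summary.
function summarize(r: ResultLike): string {
  const detail = r.error?.message ?? 'no error details';
  switch (r.status) {
    case 'success':
      return `ok: ${JSON.stringify(r.result)}`;
    case 'link_error':
      return `module graph failed to link: ${detail}`;
    case 'memory':
      return `memory limit exceeded: ${detail}`;
    case 'terminated':
      return `terminated: ${detail}`;
    case 'error':
      return `uncaught error: ${detail}`;
    default: {
      // Exhaustiveness check: stops type-checking if a new status appears.
      const unreachable: never = r.status;
      return unreachable;
    }
  }
}
```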
4.2 CodeExecutionError
interface CodeExecutionError {
name: string;
message: string;
stack?: string;
/** Module specifier that failed to link, when applicable. */
specifier?: string;
/** Source filename associated with line and column, when available. */
filename?: string;
/** 1-based line in source, when available. */
line?: number;
/** 1-based column in source, when available. */
column?: number;
}
Errors raised by bridged functions are reflected into the sandbox as a regular thrown error. Host-side stack traces do not leak into the sandbox result.
4.3 CodeExecutionLog
interface CodeExecutionLog {
level: 'log' | 'info' | 'warn' | 'error' | 'debug';
args: unknown[];
timestamp: number;
}
CodeExecutionResult.logs is only populated when the runtime installs a capturing console (typically when the caller does not provide one via options.globals).
5. Result Channels
The sandbox has two channels for returning data. They coexist and can both be used in the same run.
5.1 Configured Export
The canonical success result is the module export selected by options.execute. By default this is the module’s default export:
export default { summary: 'ok', matches: 3 };
After module evaluation, the runtime resolves the configured export as follows:
1. Read module.namespace[execute.fn], where execute.fn defaults to 'default'.
2. If it is a function (including an async function), call it with execute.args inside the sandbox. The result of that call is then processed by step 3.
3. If the current candidate is a Promise (or thenable), await it. Repeat this step until the value is no longer a thenable.
4. Marshal the final value into CodeExecutionResult.result.
All four of the following forms yield the same result for callers:
export default 42; // result === 42
export default async () => 42; // result === 42
export default () => Promise.resolve(42); // result === 42
export default Promise.resolve(42); // result === 42
Named exports are selected with execute.fn:
export function increment(n: number) {
return n + 1;
}
// state.runCode(source, { execute: { fn: 'increment', args: [100] } })
// result === 101
If the selected export is missing, the run settles with status: 'link_error'. If the selected export is not a function and execute.args is non-empty, the run settles with status: 'error'. If the selected export function throws (or its returned promise rejects), the run settles with status: 'error' and the thrown value in error.message — identical to any other uncaught runtime error.
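The resolution steps in this section can be modeled on the host as follows. This is a sketch, not the runtime's implementation: resolveExport is a hypothetical name, namespace stands in for the evaluated module's exports, marshaling is omitted, and failures are thrown here whereas the real run settles with the statuses described above.

```typescript
interface ExecuteSpec {
  fn?: string;
  args?: unknown[];
}

// Hypothetical model of the resolution steps for the configured export.
async function resolveExport(
  namespace: Record<string, unknown>,
  execute: ExecuteSpec = {},
): Promise<unknown> {
  const fn = execute.fn ?? 'default';
  const args = execute.args ?? [];
  if (!(fn in namespace)) {
    throw new Error(`link_error: missing export '${fn}'`);
  }
  let candidate = namespace[fn];
  if (typeof candidate === 'function') {
    candidate = candidate(...args); // call the export with the supplied args
  } else if (args.length > 0) {
    throw new Error(`error: export '${fn}' is not a function but args were given`);
  }
  // `await` keeps unwrapping thenables until a non-thenable value remains.
  return await candidate;
}
```

Under this model, a plain value, a sync function, an async function, and a promise exported as default all resolve to the same value.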
5.2 Bridge Reports
For streaming or multi-value output, pass a report callback into the sandbox as a bridged function:
const run = state.runCode(source, {
imports: {
supervisor: {
report: (payload) => telemetry.push(payload),
},
},
});
Each report(x) call inside the sandbox invokes the host sink; if the sink returns a promise, the call returns that promise to the sandbox.
When options.report is supplied, the runtime installs a built-in report global that forwards values to it, and pushes each reported value into CodeExecutionResult.reports in call order.
Use whichever channel fits: configured export for “this code computes one answer”; reports for “this code scans and flags N things.”
6. Limitations
These are the things your sandboxed source code cannot do. They are a design property of code execution, not a configuration knob.
6.1 No Host Capabilities on globalThis
The sandbox’s globalThis contains only ECMAScript intrinsics (Object, Array, Promise, Math, JSON, Map, Set, Date, RegExp, Error, typed arrays, BigInt, Symbol, Reflect, Proxy, structuredClone) plus globalThis, undefined, NaN, Infinity.
It does not contain process, global, window, self, document, require, Deno, Bun, fetch, Request, Response, URL, URLSearchParams, WebSocket, WebAssembly, crypto, setTimeout, setInterval, setImmediate, performance, atob, btoa, TextEncoder, or TextDecoder — unless the caller explicitly installed one via globals.
If your code needs any of those, bridge them in.
6.2 No Dynamic Code Loading
Inside the sandbox:
- eval is absent or throws.
- Function, AsyncFunction, and GeneratorFunction constructors throw when called with source strings.
- Dynamic import(...) only resolves specifiers present in options.imports or options.modules; URL imports reject.
6.3 No Shared Memory With the Host
- No SharedArrayBuffer, no Atomics, no postMessage-style channels.
- Bridged values cross the boundary by deep clone (§3.5). A sandbox mutation of a received object does not affect the host-side source object.
6.4 No Runtime-Imposed Deadline
The spec does not enforce a wall-clock timeout. handle.terminate() signals the sandbox to stop at the next yield point — usually a few milliseconds, but a pure-synchronous tight loop may run until the runtime’s own safety cap fires.
Callers SHOULD:
- wrap untrusted code with their own setTimeout + handle.terminate(), as shown in §2.2; and
- where the run may be CPU-bound, treat the platform safety cap as the worst-case upper bound.
6.5 Memory Is Capped
If you set memoryLimitBytes, exceeding it settles the handle with status: 'memory'. If you omit it, a runtime-defined default applies. Plan allocations accordingly.
6.6 Timers Are Not Bridged Automatically
The sandbox has a working microtask queue — Promise, await, and queueMicrotask behave normally. But setTimeout / setInterval are absent unless the caller bridges them.
6.7 Nondeterministic Intrinsics Are Present
Date.now() and Math.random() work inside the sandbox and are nondeterministic. If you need determinism, bridge deterministic replacements through globals.
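A minimal sketch of such replacements follows. The global names random and now are hypothetical, as are seededRandom and steppedClock; sandbox code would have to call these in place of Math.random() and Date.now(). The generator is a small mulberry32-style PRNG, chosen for brevity rather than statistical quality.

```typescript
// Small seeded generator (mulberry32-style); same seed, same sequence.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Clock that advances by a fixed step per call instead of reading real time.
function steppedClock(startMs: number, stepMs: number): () => number {
  let current = startMs - stepMs;
  return () => (current += stepMs);
}

// Hypothetical global names to pass as options.globals.
const deterministicGlobals = {
  random: seededRandom(42),
  now: steppedClock(0, 10),
};
```

Two runs that receive freshly constructed globals with the same seed and start time then observe identical random and clock values.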
6.8 Each Run Is Fresh
Every call to runCode gets its own isolate. State — module caches, intrinsic mutations, top-level variables — does not carry over between runs. If you need persistence, write to thread storage through a bridged function.
7. TypeScript Support
When language is 'typescript' (the default):
- The runtime accepts TypeScript source.
- Types are erased before evaluation. Type errors are ignored — only syntax errors block linkage.
- import type, export type, satisfies, as, generics, enums, and namespaces are all accepted at the erasure layer.
- tsconfig.json is not consulted. No path aliases, no paths, no baseUrl.
When language is 'javascript', the source is parsed as standard ECMAScript modules with no transformation.
8. Module Semantics
- Source is evaluated as an ES module. import, export, and top-level await are available.
- There is one entry module per runCode call plus any caller-supplied options.modules.
- imports entries are synthetic modules: their namespace is a frozen copy of the host-provided object.
- modules entries are source-backed ES modules available by relative specifier.
- import.meta exposes only { url: string } where url is a synthetic sandbox:<filename> URL. Host paths are not exposed.
9. Interaction With Thread State
runCode is a method on ThreadState. It runs on behalf of a thread, but the sandbox does not receive state implicitly. If you want the sandbox to read thread messages, files, or env, bridge the specific capabilities you want to expose:
await state.runCode(source, {
imports: {
thread: {
readFile: (path) => state.readFile(path),
getMessages: (opts) => state.getMessages(opts),
emit: (event, data) => state.emit(event, data),
},
},
});
This is intentional: bridges are a capability boundary. Leaking state would let sandboxed code invoke arbitrary tools, mutate thread env, or terminate the thread — the opposite of what isolation means.
Bridged functions run on the host side and can await thread operations, schedule effects, emit events, or call tools.
10. Usage Patterns
10.1 Single-Value Computation
const { result } = await state.runCode(
`
const n = input.reduce((a, b) => a + b, 0);
export default n;
`,
{ globals: { input: [1, 2, 3] } },
);
// result === 6
10.2 Flagging Multiple Items
const result = await state.runCode(
`
import { report } from 'supervisor';
for (const row of rows) {
if (row.flagged) report(row.id);
}
`,
{
globals: { rows: await loadRows() },
imports: {
supervisor: { report: (id) => flaggedIds.add(id) },
},
},
);
10.3 Running Model-Authored Code With a Deadline
const userCode = llmResponse.code; // untrusted, produced by an LLM
const run = state.runCode(userCode, {
memoryLimitBytes: 64 * 1024 * 1024,
globals: {
input: await state.readFile('/data/input.json'),
},
});
const deadline = setTimeout(() => run.terminate('2s budget'), 2_000);
const result = await run;
clearTimeout(deadline);
if (result.status !== 'success') {
// surface result.error to the model as a tool error
}
10.4 Composing With Tools
Custom tools can be thin wrappers around runCode:
export default defineTool({
description: 'Evaluate an expression against thread data',
args: z.object({ code: z.string() }),
execute: async (state, { code }) => {
const run = state.runCode(code, {
imports: {
thread: { readFile: (p) => state.readFile(p) },
},
});
const deadline = setTimeout(() => run.terminate('2s budget'), 2_000);
const result = await run;
clearTimeout(deadline);
return result.status === 'success'
? { status: 'success', result: JSON.stringify(result.result) }
: { status: 'error', error: result.error?.message ?? 'run failed' };
},
});
11. Security Considerations
- Untrusted source. Source from an LLM or remote user is untrusted. The sandbox and its limitations (§6) are the barriers against escape. Do not relax them based on source inspection.
- Bridged callables are the attack surface. Once a function crosses the boundary, the sandbox may call it repeatedly with any serializable arguments. Bridged functions MUST validate arguments against an explicit schema and SHOULD enforce their own call-rate budget when they perform expensive work.
- Host-side data exposure. Bridged functions run on the host and can reach any state the host closure captures. They MUST NOT return values that were not deliberately exposed — secrets, other threads’ state, and unrelated environment variables MUST NOT be reachable through a bridge unless the caller explicitly passed them.
- Runaway execution. Callers evaluating untrusted code MUST enforce their own deadline via setTimeout + handle.terminate(). terminate() settles at the next yield point; for CPU-bound loops, the platform safety cap is the effective worst case.
- Memory exhaustion. Callers SHOULD set memoryLimitBytes explicitly when evaluating untrusted code.
- Log redaction. Captured logs may contain values derived from globals. Treat logs with the same confidentiality level as the most sensitive input passed into globals or imports.
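The argument-validation and call-budget requirements for bridged callables can be sketched as a wrapper applied before a function is placed in imports or globals. guardBridge and isDataPath are hypothetical names, the /data/ prefix rule is only an example policy, and the inner function stands in for a real host capability such as state.readFile.

```typescript
// Hypothetical wrapper: validates arguments and caps the number of calls
// before delegating to the real host function.
function guardBridge<A extends unknown[], R>(
  validate: (args: unknown[]) => args is A,
  maxCalls: number,
  fn: (...args: A) => R,
): (...args: unknown[]) => R {
  let calls = 0;
  return (...args: unknown[]): R => {
    if (++calls > maxCalls) throw new Error('bridge call budget exhausted');
    if (!validate(args)) throw new TypeError('bridge arguments rejected');
    return fn(...args);
  };
}

// Example policy: exactly one string path under /data/ with no '..' segments.
function isDataPath(args: unknown[]): args is [string] {
  const [first] = args;
  return (
    args.length === 1 &&
    typeof first === 'string' &&
    first.startsWith('/data/') &&
    !first.split('/').includes('..')
  );
}

// Stand-in for a host capability; allows at most two calls per run.
const readFile = guardBridge(isDataPath, 2, (path) => `contents of ${path}`);
```

In this sketch rejected calls still count against the budget, so sandbox code cannot probe the validator for free.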
12. TypeScript Reference
interface CodeExecutionOptions {
execute?: {
fn?: string;
args?: unknown[];
};
imports?: Record<string, Record<string, unknown>>;
modules?: Record<string, string>;
globals?: Record<string, unknown>;
language?: 'javascript' | 'typescript';
memoryLimitBytes?: number;
filename?: string;
report?: (value: unknown) => void;
}
interface CodeExecution extends PromiseLike<CodeExecutionResult> {
terminate(reason?: string): void;
readonly running: boolean;
readonly reports: readonly unknown[];
}
interface CodeExecutionResult {
status: 'success' | 'error' | 'memory' | 'terminated' | 'link_error';
result?: unknown;
reports: unknown[];
logs: CodeExecutionLog[];
error?: CodeExecutionError;
durationMs: number;
memoryUsedBytes?: number;
}
interface CodeExecutionLog {
level: 'log' | 'info' | 'warn' | 'error' | 'debug';
args: unknown[];
timestamp: number;
}
interface CodeExecutionError {
name: string;
message: string;
stack?: string;
specifier?: string;
filename?: string;
line?: number;
column?: number;
}