Programmatic Tools

What this page covers

Why we offer two capabilities: programmatic Codemode and reusable Skills
How sandbox, codemode, and skills relate to each other
How the toggles behave in the UI/backend
When to use each mode, and when to combine them
Configuration levers (allow_direct_tool_calls, reranker hook) and their tradeoffs

Three distinct concepts

Sandbox, Codemode, and Skills are separate concepts with a clear dependency hierarchy.

Sandbox

A sandbox is an isolated code-execution environment. It runs independently of Codemode and Skills and can be used on its own for:

Pre-hooks — running setup code before an agent starts (e.g. injecting variables)
WebSocket status monitoring — the /configure/sandbox/ws endpoint reports sandbox availability
Standalone code runs — any toolset or hook that needs to execute Python

Two variants are supported:

eval — in-process Python execution (fast, default)
jupyter — dedicated Jupyter kernel per agent (isolated, full kernel state)

Set sandbox_variant in an agent spec to choose the variant. A sandbox is created eagerly at agent-creation time when sandbox_variant is specified, regardless of whether Codemode or Skills are enabled.

Codemode

Codemode lets the agent write Python to compose MCP tools programmatically (control flow, parallelism, error handling). Internally it builds tool bindings from the MCP registry and runs generated code via execute_code.

Codemode requires a sandbox. Without a sandbox there is nowhere to execute code. If no sandbox_variant is specified but Codemode is enabled, an eval sandbox is created automatically.

Skills

Skills are reusable workflows packaged with instructions, resources, and executable scripts. The AgentSkillsToolset exposes list_skills, load_skill, read_skill_resource, and run_skill_script.

Skills require a sandbox. Script execution (run_skill_script) runs inside a sandbox. If no sandbox_variant is specified but Skills are enabled, an eval sandbox is created automatically.

Dependency summary

Sandbox   ───────────────────► exists independently
               ▲
               │  requires
        ┌──────┴──────┐
     Codemode       Skills

Scenario	Sandbox	Codemode	Skills
Pre-hook only	✅ required	—	—
Status monitoring	✅ required	—	—
Codemode only	✅ auto-created (`eval`)	✅	—
Skills only	✅ auto-created (`eval`)	—	✅
Codemode + Skills	✅ shared	✅	✅
Sandbox only (jupyter)	✅ explicit	—	—

When both Codemode and Skills are enabled, they share the same sandbox instance, so variables and installed packages are visible to both.
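The rules in the table above can be read as a small decision function. The following is a minimal sketch of that logic, not the agent-runtimes implementation: the function name resolve_sandbox_variant and its parameters enable_codemode and enable_skills are hypothetical, chosen only to mirror the toggles described on this page.

```python
from typing import Optional


def resolve_sandbox_variant(
    sandbox_variant: Optional[str],
    enable_codemode: bool,
    enable_skills: bool,
) -> Optional[str]:
    """Return the sandbox variant to create, or None when no sandbox is needed.

    Hypothetical helper that mirrors the dependency table, not library code.
    """
    if sandbox_variant is not None:
        if sandbox_variant not in ("eval", "jupyter"):
            raise ValueError(f"unknown sandbox variant: {sandbox_variant!r}")
        # An explicit variant always wins, whatever the toggles say.
        return sandbox_variant
    if enable_codemode or enable_skills:
        # Codemode and Skills both need somewhere to run code.
        return "eval"
    return None


print(resolve_sandbox_variant(None, True, False))
print(resolve_sandbox_variant("jupyter", False, False))
print(resolve_sandbox_variant(None, False, False))
```

The single return value reflects the shared-sandbox rule: when both toggles are on, one variant is resolved and one instance serves both toolsets.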

Goals

Codemode: let the agent write Python to compose tools directly, reducing LLM tool-call overhead and enabling control flow, parallelism, and error handling.
Skills: package reusable workflows (instructions, resources, scripts) so an agent can load, read resources, and run scripts without writing new code each time.
Coexistence: you can enable Skills, Codemode, or both. Codemode provides its own tool discovery/execution; Skills surface curated capabilities.

Toggles and behavior

Enable Codemode: adds CodemodeToolset (list/search tools, execute Python code). When Codemode is on, selected MCP servers are used to build Codemode’s registry (tools are exposed via Codemode meta-tools rather than direct MCP tools).
Enable Skills: adds AgentSkillsToolset (skills discovery, load, resource read, script run). Uses the configured skills/ directory.
Both on: Skills are automatically wired into Codemode via wire_skills_into_codemode(). Skill operations become available as generated bindings (from generated.skills import ...) inside execute_code, alongside MCP tool bindings. The AgentSkillsToolset is still attached for direct tool access, but the primary interaction path is through Codemode's code execution.

Codemode toolset capabilities

list_tool_names (fast names-only, supports server, keywords, limit)
search_tools (returns tool definitions; can be reranked, includes deferred tools by default)
get_tool_details (full schema, output shape, and examples for one tool)
list_servers (connected MCP servers)
execute_code (Python in sandbox; import bindings from generated.mcp.<server>)
call_tool (single tool call) — optional, see below

Direct tool calls: `allow_direct_tool_calls`

When false (default in agent-runtimes codemode setup):
- call_tool is hidden; all execution goes through execute_code.
- Instructions emphasize code-first composition, mirroring the “write code to use tools” pattern.
When true (opt-in):
- call_tool is exposed as a convenience for single calls.
- Useful for simple queries where code would be overkill.
Choose false for stricter discipline and clearer audit of composed workflows; choose true for convenience at the cost of less enforced structure.
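To make the effect of the flag concrete, here is a minimal sketch of how the set of exposed meta-tools changes. The tool names come from the capabilities list above. The helper exposed_meta_tools is hypothetical and only illustrates the behavior.

```python
from typing import List

# Meta-tools that are always available, per the capabilities list above.
BASE_META_TOOLS = [
    "list_tool_names",
    "search_tools",
    "get_tool_details",
    "list_servers",
    "execute_code",
]


def exposed_meta_tools(allow_direct_tool_calls: bool = False) -> List[str]:
    """Return the meta-tool names an agent would see (illustrative only)."""
    tools = list(BASE_META_TOOLS)
    if allow_direct_tool_calls:
        # call_tool is the only tool gated by the flag.
        tools.append("call_tool")
    return tools


print(exposed_meta_tools())
print(exposed_meta_tools(allow_direct_tool_calls=True))
```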

Search quality: reranker hook

search_tools accepts an optional async reranker: tool_reranker(tools, query, server) -> list[ToolDefinition].
Use it to reorder results (e.g., LLM-based relevance, business priority). Failures fall back to registry order.
This keeps discovery flexible without tying Codemode to any specific model provider.
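As an illustration, the sketch below shows a keyword-overlap reranker with the same call shape, plus a wrapper that falls back to registry order on failure. The ToolDefinition dataclass here is a simplified stand-in (the real type is not shown on this page), and search_with_reranker is a hypothetical name for the fallback behavior described above.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolDefinition:
    """Simplified stand-in for the real ToolDefinition type (assumption)."""
    name: str
    description: str


async def keyword_reranker(
    tools: List[ToolDefinition], query: str, server: Optional[str] = None
) -> List[ToolDefinition]:
    """Put tools that mention more of the query words first."""
    words = set(query.lower().split())

    def score(tool: ToolDefinition) -> int:
        text = f"{tool.name} {tool.description}".lower()
        return sum(1 for word in words if word in text)

    # sorted() is stable, so tools with equal scores keep registry order.
    return sorted(tools, key=score, reverse=True)


async def search_with_reranker(tools, query, reranker=None, server=None):
    """Apply the reranker if one is given; fall back to registry order on failure."""
    if reranker is None:
        return list(tools)
    try:
        return await reranker(tools, query, server)
    except Exception:
        return list(tools)


registry = [
    ToolDefinition("read_file", "Read a file from disk"),
    ToolDefinition("web_search", "Search the web"),
    ToolDefinition("list_dir", "List a directory"),
]
ranked = asyncio.run(search_with_reranker(registry, "search web", keyword_reranker))
print([tool.name for tool in ranked])
```

Because the hook only reorders a list it is handed, a failing or slow reranker never prevents discovery from returning results.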

Execution model

Codemode runs Python in an isolated sandbox (code-sandboxes). Agents import tool bindings from generated.mcp.<server_name>.
Use async/await, loops, conditionals, and asyncio.gather for parallelism.
Errors are caught and surfaced in the execute_code result payload, alongside stdout/stderr.
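The sketch below shows the kind of code an agent might write for parallel calls with per-call error handling. It is written to run standalone, so read_file is a local stand-in for a generated binding and asyncio.run replaces the top-level await used inside execute_code. The failure condition is invented for the demo.

```python
import asyncio


async def read_file(args):
    """Stand-in for a generated MCP binding (assumption, for illustration)."""
    await asyncio.sleep(0)
    if args["path"].endswith(".missing"):
        raise FileNotFoundError(args["path"])
    return f"contents of {args['path']}"


async def read_many(paths):
    """Read all paths concurrently and separate successes from failures."""
    results = await asyncio.gather(
        *(read_file({"path": path}) for path in paths),
        return_exceptions=True,  # one failure does not cancel the others
    )
    ok = {p: r for p, r in zip(paths, results) if not isinstance(r, Exception)}
    failed = {p: repr(r) for p, r in zip(paths, results) if isinstance(r, Exception)}
    return ok, failed


ok, failed = asyncio.run(read_many(["/data/a.txt", "/data/b.missing"]))
print("ok:", ok)
print("failed:", failed)
```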

Skills model

Skills live in a skills/ directory with SKILL.md, resources, and scripts.
Tooling provided by AgentSkillsToolset: list_skills, load_skill, read_skill_resource, run_skill_script.
Skills can be authored as files or programmatic callables; they complement Codemode when you want curated, reusable behaviors.

UI & backend flow

UI checkboxes toggle Skills/Codemode. If Codemode is enabled, MCP server selection is still available and is used to scope Codemode discovery.
Backend route /api/v1/agents builds toolsets accordingly:
- sandbox_variant set → sandbox is created eagerly, independent of Codemode/Skills status.
- Skills enabled → eval sandbox auto-created (if no sandbox_variant); AgentSkillsToolset added.
- Codemode enabled → eval sandbox auto-created (if no sandbox_variant); CodemodeToolset added with allow_direct_tool_calls=False by default.
- Both enabled → both toolsets share the same sandbox, then skills are wired into Codemode via wire_skills_into_codemode().

Tool approvals mechanism

Tool approvals have two complementary paths:

Per-tool policy approvals for MCP/Skills (server + tool allowlists reflected in WebSocket snapshots).
Per-call deferred approvals for tools marked requires_approval=True (runtime stop-the-world approval/continue).

Both are designed to keep WebSocket state as the source of truth while minimizing unnecessary round-trips.

1) Policy approvals (MCP/Skills)

The UI tracks approved MCP/Skill tools via explicit allowlists (no implicit auto-approval).
Backend approval state is synced through snapshots/events and reflected in store fields such as approved_tools_by_server.
During execution, guardrails check these allowlists before requesting any new approval.
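A minimal sketch of an explicit allowlist check follows. It assumes approved_tools_by_server maps a server name to a list of approved tool names. That shape is an assumption, as is the helper name is_tool_approved.

```python
from typing import Dict, List


def is_tool_approved(
    approved_tools_by_server: Dict[str, List[str]], server: str, tool: str
) -> bool:
    """Return True only when the tool is explicitly listed for its server."""
    # Unknown servers and unlisted tools are never approved implicitly.
    return tool in approved_tools_by_server.get(server, [])


approved_tools_by_server = {"filesystem": ["read_file"], "github": []}
print(is_tool_approved(approved_tools_by_server, "filesystem", "read_file"))
print(is_tool_approved(approved_tools_by_server, "filesystem", "write_file"))
print(is_tool_approved(approved_tools_by_server, "slack", "post_message"))
```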

2) Deferred per-call approvals (`DeferredToolRequests`)

When a call requires runtime approval, pydantic-ai emits DeferredToolRequests with one or more ToolCallPart entries.

Agent-runtimes resolves these in two stages:

Inline fast path (second-pass behavior)
- ToolsGuardrailCapability.handle_deferred_tool_calls() inspects local approval records first.
- If a matching decision already exists (approved/rejected), it builds DeferredToolResults inline (using build_results() when available).
- Already-approved calls continue immediately in the same run; already-rejected calls map to ToolDenied.
Stop-the-world WebSocket path (unresolved only)
- Any unresolved approvals bubble out as DeferredToolRequests.
- The adapter requests approval via ToolApprovalManager, emits approval events, and waits for the user/reviewer decision.
- On decision, the adapter resumes the run with deferred_tool_results so only the unresolved subset is continued.
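The two stages amount to partitioning deferred calls into those with a recorded decision and those without. The sketch below illustrates that split with plain Python types. PendingCall, split_deferred, and the tool-name keyed decisions mapping are hypothetical stand-ins, not pydantic-ai or agent-runtimes types.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PendingCall:
    """Hypothetical stand-in for one deferred tool call."""
    tool_call_id: str
    tool_name: str


def split_deferred(
    calls: List[PendingCall], decisions: Dict[str, str]
) -> Tuple[Dict[str, bool], List[PendingCall]]:
    """Resolve calls with a recorded decision; return the rest as unresolved.

    decisions maps a tool name to "approved" or "rejected" (assumed shape).
    """
    resolved: Dict[str, bool] = {}
    unresolved: List[PendingCall] = []
    for call in calls:
        decision = decisions.get(call.tool_name)
        if decision == "approved":
            resolved[call.tool_call_id] = True    # continues in the same run
        elif decision == "rejected":
            resolved[call.tool_call_id] = False   # maps to a denial
        else:
            unresolved.append(call)               # needs the WebSocket path
    return resolved, unresolved


calls = [
    PendingCall("c1", "read_file"),
    PendingCall("c2", "delete_file"),
    PendingCall("c3", "send_email"),
]
decisions = {"read_file": "approved", "delete_file": "rejected"}
resolved, unresolved = split_deferred(calls, decisions)
print(resolved)
print([call.tool_name for call in unresolved])
```

Only the unresolved list would trigger a stop-the-world approval request. Everything in resolved can be answered inline.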

Event and state flow

Approval records are created/updated in the local approval store.
If AI-agents integration is enabled, decisions can be mirrored/relayed and then synced back.
WebSocket snapshots/events update frontend state (codemodeStatus / mcpStatus / fullContext) and drive the UI badges/toggles.

Why this hybrid model

Fast for known decisions: avoids repeated stop/resume cycles for calls that are already approved.
Safe for unknown decisions: unresolved calls still require explicit approval through the WebSocket workflow.
Consistent UX: frontend always reflects authoritative snapshot state rather than optimistic local merges.

Skills-in-Codemode wiring

When both Skills and Codemode are enabled, agent-runtimes automatically wires skills into Codemode so that skill operations are available as generated bindings inside execute_code. This is handled by wire_skills_into_codemode() in agent_factory.py.

What the wiring does

Generates skill bindings — Calls generate_skill_bindings() to create generated/mcp/skills/ with wrapper functions for list_skills, load_skill, read_skill_resource, and run_skill.
Sets a skill tool caller — Registers a callback on the executor that routes skills__* tool calls to the real AgentSkillsToolset.
Stores skills metadata — Calls set_skills_metadata() on the executor so that remote sandboxes (Datalayer Runtime, Jupyter kernel) can regenerate skill bindings inline.
Registers a skills proxy caller — For remote sandboxes, registers a proxy caller in mcp_proxy.py so that HTTP-routed skills__* calls reach the AgentSkillsToolset.

Post-init callback pattern

Since CodemodeToolset lazily initializes its executor, the wiring is deferred using a post-init callback:

# In rebuild_codemode (app.py / routes/agents.py)
if skills_enabled:
    codemode_toolset.add_post_init_callback(
        lambda ts: wire_skills_into_codemode(ts, skills_toolset)
    )

The callback fires once when the executor is first created (e.g., on the first execute_code call), ensuring the executor exists before binding generation and caller registration happen.
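The fire-once behavior can be sketched with a small class that creates its executor lazily. LazyToolset below is a hypothetical illustration of the pattern, not the CodemodeToolset implementation. Only the add_post_init_callback name is taken from the snippet above.

```python
class LazyToolset:
    """Illustrative toolset that builds its executor on first use."""

    def __init__(self):
        self._executor = None
        self._post_init_callbacks = []

    def add_post_init_callback(self, callback):
        """Register a callable to run once, right after the executor exists."""
        self._post_init_callbacks.append(callback)

    def _ensure_executor(self):
        if self._executor is None:
            self._executor = object()  # placeholder for a real executor
            # Swap the list out first so each callback runs at most once.
            callbacks, self._post_init_callbacks = self._post_init_callbacks, []
            for callback in callbacks:
                callback(self)
        return self._executor

    def execute_code(self, code):
        self._ensure_executor()
        return f"ran {len(code)} chars"


calls = []
toolset = LazyToolset()
toolset.add_post_init_callback(lambda ts: calls.append("wired"))
print(calls)                      # nothing has run yet
toolset.execute_code("print(1)")
toolset.execute_code("print(2)")
print(calls)                      # the callback ran exactly once
```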

Using skills in execute_code

Once wired, agents can call skill operations from within execute_code:

execute_code(code='''
from generated.skills import list_skills, load_skill, run_skill
from generated.mcp.filesystem import read_file

# Discover available skills
skills = await list_skills({})
print(f"Skills: {[s['name'] for s in skills]}")

# Load a skill's instructions
instructions = await load_skill({"skill_name": "pdf-extractor"})
print(instructions)

# Run a skill script
result = await run_skill({
    "skill_name": "pdf-extractor",
    "script_name": "extract.py",
    "args": {"path": "/data/report.pdf"}
})
print(result)
''')

Remote sandbox support

When the code sandbox is remote (e.g., a Datalayer Runtime or Jupyter kernel), the skill bindings are generated inline inside the sandbox by _generate_tools_in_sandbox(). Tool calls from the remote sandbox make HTTP requests to agent-runtimes, where the skills proxy in mcp_proxy.py intercepts server_name == "skills" and routes the call to the registered AgentSkillsToolset.
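The interception step can be pictured as a routing function that checks the server name before dispatching. The sketch below is a simplified illustration under that assumption. route_tool_call and the two fake callers are hypothetical and do not reproduce mcp_proxy.py.

```python
import asyncio


async def route_tool_call(server_name, tool_name, arguments, skills_caller, mcp_caller):
    """Send skills calls to the skills caller and everything else to MCP."""
    if server_name == "skills":
        return await skills_caller(tool_name, arguments)
    return await mcp_caller(server_name, tool_name, arguments)


async def fake_skills_caller(tool_name, arguments):
    # Stands in for the registered skills toolset.
    return ("skills", tool_name, arguments)


async def fake_mcp_caller(server_name, tool_name, arguments):
    # Stands in for a regular MCP server call.
    return (server_name, tool_name, arguments)


print(asyncio.run(
    route_tool_call("skills", "list_skills", {}, fake_skills_caller, fake_mcp_caller)
))
print(asyncio.run(
    route_tool_call("filesystem", "read_file", {"path": "/data/report.pdf"},
                    fake_skills_caller, fake_mcp_caller)
))
```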

Skills-first, Codemode glue (example)

Use Skills for curated, reusable steps and Codemode for orchestration:

# 1) Load skill instructions
load_skill("pdf-extractor")

# 2) Compose multi-step flow in code
execute_code(code='''
from generated.mcp.filesystem import read_file
from generated.skills import run_skill

doc = await read_file({"path": "/data/report.pdf"})
result = await run_skill({
    "skill_name": "pdf-extractor",
    "script_name": "extract.py",
    "args": {"path": "/data/report.pdf"}
})
print(result)
''')

Choosing a pattern

Use Skills when you want repeatable, reviewed workflows and clearer governance.
Use Codemode when you need ad-hoc composition, control flow, and efficiency.
Use both when you want curated capabilities (Skills) plus the ability to glue them together with code (Codemode). When both are enabled, skills are automatically wired into Codemode as generated bindings (from generated.skills import ...), giving agents a unified import pattern for all tools.

Configuration summary

allow_direct_tool_calls: hide or expose call_tool (defaults to false in agent-runtimes wiring).
tool_reranker: optional async hook to reorder search_tools results.
keywords/limit on list_tool_names: faster, filtered discovery.
include_deferred on search_tools/list_tool_names: control discovery of tools marked defer_loading.
max_tool_calls: optional per-run safety cap on tool invocations inside execute_code.
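To show what a per-run cap such as max_tool_calls protects against, here is a minimal sketch of a call counter that refuses further calls once the limit is reached. The ToolCallBudget class is hypothetical. How agent-runtimes enforces the cap internally is not described on this page.

```python
from typing import Optional


class ToolCallBudget:
    """Counts tool calls in one run and stops at an optional limit."""

    def __init__(self, max_tool_calls: Optional[int] = None):
        self.max_tool_calls = max_tool_calls  # None means no cap
        self.used = 0

    def charge(self) -> None:
        """Record one tool call, or raise if the cap is already reached."""
        if self.max_tool_calls is not None and self.used >= self.max_tool_calls:
            raise RuntimeError(f"tool-call limit of {self.max_tool_calls} reached")
        self.used += 1


budget = ToolCallBudget(max_tool_calls=2)
budget.charge()
budget.charge()
try:
    budget.charge()  # a third call exceeds the cap
except RuntimeError as exc:
    print(exc)
```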

`allow_direct_tool_calls`

What it does: Controls whether the call_tool shortcut is exposed. When off, all tool usage flows through execute_code, which keeps execution auditable and code-first.
Default: false in agent-runtimes wiring (CodemodeToolset is instantiated with direct calls disabled).
When to set true: Simple, single-step calls where the overhead of writing code is unnecessary, or for rapid prototyping.
When to keep false: Shared/production agents, where you want consistent logging, fewer accidental direct calls, and clearer control over how tools are composed.

`tool_reranker`

What it does: Optional async hook tool_reranker(tools, query, server) -> list[ToolDefinition] that reorders search_tools results.
Why use it: Apply model-based relevance, business rules, or safety filters to discovery without changing the registry itself.
Behavior on error: If the hook raises or fails, Codemode falls back to registry order; search still returns results.
Usage tip: Log applied ordering when the hook is enabled to keep discovery transparent.

Safety and clarity tips

Prefer allow_direct_tool_calls=False in shared/production agents to keep a single, auditable execution path via execute_code.
When enabling reranking, log or trace the applied order for transparency.
Keep Skills concise and documented; use Codemode for bespoke multi-step tasks.

Recommended defaults (cookbook)

Production

enable_skills: true
enable_codemode: true
allow_direct_tool_calls: false
enable_tool_reranker: true (if you have a safe reranker configured)
max_tool_calls: set a conservative limit (e.g., 50–200) to prevent runaway loops

Prototyping

enable_skills: optional
enable_codemode: true
allow_direct_tool_calls: true
enable_tool_reranker: optional
max_tool_calls: unset or higher limit for exploration

MCP Servers

Agent Runtimes supports MCP servers, which give agents access to external tools and data sources through a standardized interface.

Overview

MCP servers are external processes that provide tools, resources, and prompts to AI agents. Agent Runtimes supports two types of MCP server configurations:

MCP Config (from mcp.json)

MCP Config servers are user-defined servers configured in ~/.datalayer/mcp.json. These servers:

Start automatically when the agent runtime starts
Are fully customizable with your own commands, arguments, and environment variables
Appear in the agent form where users can select which servers to include as toolsets
Support any MCP-compatible server - if it follows the MCP specification, it will work
Are stored separately from catalog servers, allowing the same ID in both without conflict

MCP Catalog (predefined servers)

MCP Catalog servers are predefined server configurations included with Agent Runtimes. These servers:

Are NOT started automatically - users must explicitly enable them via API
Can be enabled on-demand using the /api/v1/mcp/servers/catalog/{server_name}/enable endpoint
Provide common tools like web search, file system access, etc.
Have their own storage separate from config servers
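As a rough sketch, the enable endpoint can be addressed as shown below. The base URL (host and port) and the use of POST are assumptions made for illustration. Only the path comes from this page. The request object is built but not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def catalog_enable_request(base_url: str, server_name: str) -> Request:
    """Build (but do not send) a request to enable a catalog server.

    The POST method is an assumption; check the API reference for the real verb.
    """
    path = f"/api/v1/mcp/servers/catalog/{quote(server_name, safe='')}/enable"
    return Request(base_url.rstrip("/") + path, method="POST")


# Hypothetical local address; replace it with your agent-runtimes URL.
req = catalog_enable_request("http://localhost:8765", "tavily")
print(req.get_method(), req.full_url)
```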

Which to use?

For most users, MCP Config is recommended. Add your servers to ~/.datalayer/mcp.json and they'll be available automatically when the agent runtime starts.

The same server ID can exist in both config and catalog - they are tracked independently.

Key Features

Automatic Server Lifecycle — Config servers start with the application and stop on shutdown
Retry with Backoff — Transient failures trigger automatic retries with exponential backoff
Sequential Startup — Multiple MCP servers start sequentially to avoid resource conflicts
Status Monitoring — Real-time status of all MCP toolsets via API endpoint
Separate Storage — Config and catalog servers are stored independently, allowing same IDs in both
Runtime Updates — Dynamically add/remove MCP servers from running agents via PATCH API
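The retry and sequential-startup features combine naturally. The sketch below shows one way to express them with asyncio. The function names, the retry count, the delays, and the choice of ConnectionError as the transient failure are all illustrative assumptions rather than the actual agent-runtimes settings.

```python
import asyncio


async def start_with_retry(start, name, attempts=3, base_delay=0.01):
    """Call start(name), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await start(name)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure
            # Delay doubles on each retry: base, 2x base, 4x base, ...
            await asyncio.sleep(base_delay * (2 ** attempt))


async def start_all(start, names):
    """Start servers one after another rather than concurrently."""
    started = []
    for name in names:
        started.append(await start_with_retry(start, name))
    return started


# Demo: a fake starter where "github" fails twice before succeeding.
attempts_seen = {}


async def fake_start(name):
    attempts_seen[name] = attempts_seen.get(name, 0) + 1
    if name == "github" and attempts_seen[name] < 3:
        raise ConnectionError("not ready yet")
    return name


print(asyncio.run(start_all(fake_start, ["filesystem", "github"])))
print(attempts_seen)
```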

MCP Server Examples

Agent Runtimes supports any MCP-compatible server—if it follows the MCP specification, it will work. The table below shows a few popular examples to get you started:

Server	URL	Type	Description
Tavily	docs	Remote	Web search and content extraction
Filesystem	modelcontextprotocol/servers	Local	File system access
GitHub	github/github-mcp-server	Local	GitHub repository access
Google Workspace	taylorwilsdon/google_workspace_mcp	Local	Google Workspace (Gmail, Gdrive, etc.) access
Slack	datalayer/slack-mcp-server	Local	Slack workspace access
Kaggle	docs	Remote	Kaggle datasets, models, competitions, notebooks
AlphaVantage	docs	Local	Financial market data
Chart	antvis/mcp-server-chart	Local	Charting and visualization
Brave Search	modelcontextprotocol/servers	Local	Web search
LinkedIn	stickerdaniel/linkedin-mcp-server	Local	LinkedIn profile, company, and job data

Local vs Remote MCP Servers

Local servers run as child processes on your machine (started via npx or uvx)
Remote servers are hosted externally and accessed over HTTP (e.g., Kaggle MCP)

Both types are configured in the same mcp.json file, but remote servers use mcp-remote as a bridge.
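Since remote servers are bridged through mcp-remote, a config entry can be classified by looking at its args. The following sketch assumes that convention holds for every remote entry. The helper name server_kind is hypothetical.

```python
import json


def server_kind(entry: dict) -> str:
    """Classify an mcp.json entry as "remote" when it uses the mcp-remote bridge."""
    return "remote" if "mcp-remote" in entry.get("args", []) else "local"


config = json.loads("""
{
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
        },
        "kaggle": {
            "command": "npx",
            "args": ["-y", "mcp-remote", "https://www.kaggle.com/mcp"]
        }
    }
}
""")

kinds = {name: server_kind(entry) for name, entry in config["mcpServers"].items()}
print(kinds)
```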

See the MCP Servers Directory for more options.

Quick Start

Configuring MCP Config Servers

MCP Config servers are configured in ~/.datalayer/mcp.json. These servers start automatically when the agent runtime starts and appear in the agent creation form.

{
    "mcpServers": {
        "tavily-remote": {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://mcp.tavily.com/mcp/?tavilyApiKey=<your-api-key>"
            ]
        }
    }
}

Environment variables are automatically expanded using ${VAR_NAME} syntax.
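For intuition, the expansion can be pictured as a recursive substitution over the parsed JSON. The sketch below is an assumption about how such expansion might work, not the agent-runtimes code. In particular, leaving unset variables untouched is a choice made here for the example.

```python
import os
import re

# Matches ${VAR_NAME} where VAR_NAME is a conventional environment variable name.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${VAR_NAME} in strings, lists, and dicts."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Unset variables are left as-is (an assumption of this sketch).
        return _VAR_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    return value


entry = {
    "command": "docker",
    "env": {"GITHUB_TOKEN": "${GH_TOKEN}"},
    "args": ["run", "${UNSET_VAR}"],
}
print(expand_env(entry, {"GH_TOKEN": "example-token"}))
```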

Using MCP Tools in Agents

from pydantic_ai import Agent
from agent_runtimes.mcp import get_mcp_toolsets

# Get pre-loaded MCP toolsets
mcp_toolsets = get_mcp_toolsets()

# Create agent with MCP tools
agent = Agent(
    "anthropic:claude-sonnet-4-20250514",
    system_prompt="You are a helpful assistant.",
    toolsets=mcp_toolsets,
)

Full Configuration Example

{
    "mcpServers": {
        "tavily-remote": {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://mcp.tavily.com/mcp/?tavilyApiKey=<your-api-key>"
            ]
        },
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
        },
        "linkedin": {
            "command": "uvx",
            "args": [
                "--from",
                "git+https://github.com/stickerdaniel/linkedin-mcp-server",
                "linkedin-mcp-server"
            ]
        },
        "kaggle": {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://www.kaggle.com/mcp",
                "--header",
                "Authorization: Bearer <KAGGLE_TOKEN>"
            ]
        }
    }
}

Server-Specific Setup

Tavily MCP Server

The Tavily MCP Server provides web search and content extraction tools.

{
    "mcpServers": {
        "tavily-remote": {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://mcp.tavily.com/mcp/?tavilyApiKey=<your-api-key>"
            ]
        }
    }
}

Filesystem MCP Server

The Filesystem MCP Server provides tools for interacting with the local filesystem.

{
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
        }
    }
}

GitHub MCP Server

The GitHub MCP Server provides tools for interacting with GitHub repositories.

{
    "mcpServers": {
        "github": {
            "command": "docker",
            "args": [
                "run",
                "-i",
                "--rm",
                "-e",
                "GITHUB_TOKEN",
                "ghcr.io/github/github-mcp-server"
            ],
            "env": {
                "GITHUB_TOKEN": "<GITHUB_TOKEN>"
            }
        }
    }
}

Google Workspace MCP Server

The Google Workspace MCP Server provides tools for interacting with Google Workspace services like Gmail and Google Drive.

{
    "mcpServers": {
        "google-workspace": {
            "command": "uvx",
            "args": ["workspace-mcp"],
            "env": {
                "GOOGLE_OAUTH_CLIENT_ID": "<your-client-id>",
                "GOOGLE_OAUTH_CLIENT_SECRET": "<your-client-secret>"
            }
        }
    }
}

Slack MCP Server

The Slack MCP Server provides tools for interacting with Slack workspaces.

{
    "mcpServers": {
        "slack": {
            "command": "npx",
            "args": ["-y", "@datalayer/slack-mcp-server"],
            "env": {
                "SLACK_BOT_TOKEN": "<your-slack-bot-token>",
                "SLACK_TEAM_ID": "<your-slack-team-id>",
                "SLACK_CHANNEL_IDS": "<your-slack-channel-ids>"
            }
        }
    }
}

Kaggle MCP Server

The Kaggle MCP Server is a remote HTTP server that provides access to Kaggle datasets, models, competitions, notebooks, and benchmarks.

Option 1: Token Authentication (recommended for Agent Runtimes)

{
    "mcpServers": {
        "kaggle": {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://www.kaggle.com/mcp",
                "--header",
                "Authorization: Bearer <KAGGLE_TOKEN>"
            ]
        }
    }
}

Option 2: Browser OAuth (auto-login)

{
    "mcpServers": {
        "kaggle": {
            "command": "npx",
            "args": ["-y", "mcp-remote", "https://www.kaggle.com/mcp"]
        }
    }
}

AlphaVantage MCP Server

The AlphaVantage MCP Server provides financial market data tools.

{
    "mcpServers": {
        "alphavantage": {
            "command": "uvx",
            "args": ["av-mcp==0.2.1", "<YOUR_API_KEY>"],
            "env": {"MAX_RESPONSE_TOKENS": "100000"}
        }
    }
}

Chart MCP Server

The Chart MCP Server provides charting and visualization tools.

{
    "mcpServers": {
        "chart": {
            "command": "npx",
            "args": ["-y", "@antv/mcp-server-chart"]
        }
    }
}

LinkedIn MCP Server

The LinkedIn MCP server requires browser automation via Playwright. Install the Chromium browser for Playwright first:

uvx --from playwright playwright install chromium

Then run the server with the --get-session flag:

uvx --from git+https://github.com/stickerdaniel/linkedin-mcp-server linkedin-mcp-server --get-session
