SimpleLLMFunc includes built-in PyRepl support, which lets LLMs execute Python code inside a persistent runtime context. Unlike one-shot execution, PyRepl keeps variables and state alive across calls, so the model can solve tasks step by step.
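Conceptually, a persistent REPL keeps a single namespace alive across executions, so a later call can read variables defined by an earlier one. A toy sketch of that idea (illustrative only, not PyRepl's implementation):

```python
class MiniRepl:
    """Toy persistent REPL: one shared namespace reused across calls."""

    def __init__(self):
        self.namespace: dict = {}

    def execute(self, code: str) -> None:
        # exec() reads and writes names in the shared namespace,
        # so state survives from one call to the next.
        exec(code, self.namespace)


repl = MiniRepl()
repl.execute("x = [1, 2, 3]")
repl.execute("total = sum(x)")  # sees `x` from the previous call
```

PyRepl provides the same persistence, plus async execution, timeouts, and event streaming on top.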

Features

Persistent Context

Variables persist across calls, which is useful for iterative analysis and coding tasks.

Async, Non-blocking

Execution runs off the main event loop so UI and event streaming stay responsive.

Real-time Event Output

Receive stdout, stderr, and input requests via event_emitter.

Session Isolation

Different PyRepl instances do not share state.

Long Output Truncation

Large outputs can be stored to a temporary file and returned in truncated form.

Runtime Primitives

Expose controlled runtime capabilities through runtime.selfref.* and related primitives.

Quick Start

1. Create a PyRepl instance

from SimpleLLMFunc.builtin import PyRepl

repl = PyRepl()
tools = repl.toolset

2. Attach it to llm_chat

from SimpleLLMFunc import llm_chat

@llm_chat(
    llm_interface=llm,
    toolkit=tools,
    enable_event=True,
)
async def python_assistant(message: str, history=None):
    """You are a Python coding assistant. Use code execution when it helps."""

3. Consume the output

async for output in python_assistant("Create a list and compute its mean"):
    pass  # render or log each yielded chunk/event here

Multiple isolated sessions

repl1 = PyRepl()
repl2 = PyRepl()


@llm_chat(toolkit=repl1.toolset, ...)
async def chat1(message: str, history=None):
    """Assistant backed by repl1."""


@llm_chat(toolkit=repl2.toolset, ...)
async def chat2(message: str, history=None):
    """Assistant backed by repl2."""

Tool Details

execute_code

Execute Python code and return the execution result.
execute_code has a default 600-second active execution timeout. Time spent waiting for input() does not count toward that limit. Each input() request also has its own default 300-second idle timeout.
When output exceeds 20,000 tokens, execute_code stores the full result in a temporary file and returns only a truncated preview plus a <system-reminder> note with the file path.
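The truncation behavior can be sketched in plain Python. This is an illustrative stand-in only: the real tool measures tokens (20,000), not characters, and its file naming and reminder format may differ.

```python
import tempfile


def truncate_output(text: str, limit_chars: int = 200) -> str:
    """Spill oversized output to a temp file and return a truncated preview.

    Sketch only: a character limit stands in for the tool's token limit.
    """
    if len(text) <= limit_chars:
        return text
    # Persist the full output so nothing is lost.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(text)
        path = f.name
    # Return a preview plus a reminder pointing at the full result.
    return (
        text[:limit_chars]
        + f"\n<system-reminder>Output truncated; full result saved to {path}</system-reminder>"
    )
```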
The guidance sent to the model encourages it to inspect runtime primitives with runtime.list_primitives() and runtime.get_primitive_spec(...), and reminds it that reset_repl clears REPL variables but keeps registered runtime backends.
Parameters
| Parameter | Type | Description |
| --- | --- | --- |
| code | `str` | Python code to execute |
| timeout_seconds | `float` | Optional per-call timeout override |
| event_emitter | `ToolEventEmitter` | Optional emitter for realtime output |
Tool output for the model

The model receives a natural-language summary that includes execution status, elapsed time, stdout, stderr, return value, and error information. If you need structured data in Python code, call PyRepl.execute() directly.

Python API return value
{
    "success": bool,
    "stdout": str,
    "stderr": str,
    "return_value": Any,
    "error": str | None,
    "error_details": dict | None,
    "execution_time_ms": float,
}

Improved error localization

execute_code tries to return information about the actual user code location rather than only the internal framework stack. Typical error_details fields:
  • error_type
  • message
  • line / column
  • snippet
  • pointer
  • summary
  • user_traceback
Example:
repl = PyRepl()
result = await repl.execute(code="for i in range(2)\n    print(i)")
if not result["success"]:
    print(result["error"])
    print(result["error_details"])

reset_repl

Reset the REPL state and clear all variables.
Model-facing tool descriptions make it explicit that reset_repl clears REPL variables but keeps the registered runtime backend.
result = await repl.reset()
# Returns: "REPL reset successfully. All variables have been cleared."

Streaming Events

When enable_event=True, execute_code can emit these realtime events:
| Event name | Data shape | Description |
| --- | --- | --- |
| kernel_stdout | `{text: str}` | Standard output |
| kernel_stderr | `{text: str}` | Standard error |
| kernel_input_request | `{request_id: str, prompt: str, idle_timeout_seconds: float}` | An `input()` request waiting for user input |

Consuming streaming events

from SimpleLLMFunc.hooks import CustomEvent, is_event_yield

async for output in llm_chat_function(message):
    if is_event_yield(output):
        event = output.event
        if isinstance(event, CustomEvent):
            if event.event_name == "kernel_stdout":
                print(f"[stdout] {event.data['text']}", end="")
            elif event.event_name == "kernel_stderr":
                print(f"[stderr] {event.data['text']}", end="")
If the event name is kernel_input_request, you can reply by calling PyRepl.submit_input(request_id, value). When you consume event streams from @llm_chat(enable_event=True), output.origin can also be used to distinguish the main chain from forked sub-chains.
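A dispatcher covering all three event kinds might look like the sketch below. The wiring is illustrative: the reply value would normally come from your UI, and whether submit_input must be awaited depends on your library version.

```python
async def handle_event(repl, event_name: str, data: dict) -> None:
    """Dispatch PyRepl streaming events (sketch; event shapes follow the table above)."""
    if event_name == "kernel_stdout":
        print(data["text"], end="")
    elif event_name == "kernel_stderr":
        print(data["text"], end="")
    elif event_name == "kernel_input_request":
        # Collect a reply from your UI here; a placeholder stands in for it.
        reply = f"reply to: {data['prompt']}"
        # Drop the await if submit_input is synchronous in your version.
        await repl.submit_input(data["request_id"], reply)
```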

Usage Example

Data analysis assistant

import sys

from SimpleLLMFunc import llm_chat
from SimpleLLMFunc.builtin import PyRepl
from SimpleLLMFunc.hooks import CustomEvent, is_event_yield

repl = PyRepl()


@llm_chat(
    llm_interface=llm,
    toolkit=repl.toolset,
    enable_event=True,
)
async def data_helper(message: str, history=None):
    """You are a data analysis assistant. Use Python code in small steps and print useful results."""


async for output in data_helper("Create 100 random numbers and compute the mean and standard deviation"):
    if is_event_yield(output):
        event = output.event
        if isinstance(event, CustomEvent) and event.event_name == "kernel_stdout":
            print(event.data["text"], end="")

Persistent programming context

repl = PyRepl()

result1 = await repl.execute(code="""
import random
data = [random.randint(1, 100) for _ in range(10)]
print(f"Generated {len(data)} random values")
print(f"Data: {data}")
""")

result2 = await repl.execute(code="""
mean = sum(data) / len(data)
print(f"Mean: {mean}")
""")

print(result2["stdout"])

Configuration Options

# Default execution timeout is 600 seconds
repl = PyRepl()

# Override execution timeout
repl = PyRepl(execution_timeout_seconds=180)

# Override timeout per call
result = await repl.execute("import time\ntime.sleep(2)", timeout_seconds=5)

# Configure input idle timeout
repl = PyRepl(input_idle_timeout_seconds=300)

# Set working directory
repl = PyRepl(working_directory="./sandbox")

Runtime and Self-Reference

PyRepl installs the built-in selfref pack by default.
from SimpleLLMFunc import llm_chat
from SimpleLLMFunc.builtin import PyRepl, SelfReference

repl = PyRepl()
self_reference = repl.get_runtime_backend("selfref")
assert isinstance(self_reference, SelfReference)


@llm_chat(
    llm_interface=llm,
    toolkit=repl.toolset,
    self_reference_key="agent_main",
)
async def agent(message: str, history=None):
    ...
For custom runtime primitive design, see Runtime Primitives. Useful selfref primitives:
  • runtime.selfref.context.inspect()
  • runtime.selfref.context.remember(...)
  • runtime.selfref.context.forget(...)
  • runtime.selfref.context.compact(...)
  • runtime.selfref.fork.spawn(...)
  • runtime.selfref.fork.gather_all(...)
Key fork rules:
  • Child forks inherit the pre-fork context snapshot, not the parent’s pending fork tool-call scene.
  • runtime.selfref.fork.gather_all(...) returns dict[fork_id -> ForkResult].
  • Compact results expose status, response, result, memory_key, history_count, and history_included.
  • Check status first, then read response or result. Use include_history=True only when full child history is actually needed.
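The "check status first" rule can be expressed as a small defensive helper. This sketch treats the result as a plain dict; the real ForkResult may expose attributes instead, and the exact status strings are assumptions.

```python
def read_fork_result(fork_result: dict) -> str:
    """Read a fork result defensively: status first, then response/result.

    Sketch only: field names follow the rules above, but "success" as a
    status value is an assumption.
    """
    status = fork_result.get("status")
    if status != "success":
        return f"fork failed: {status}"
    # Prefer the natural-language response; fall back to the structured result.
    return fork_result.get("response") or str(fork_result.get("result"))
```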

Best Practices

  • Use different PyRepl instances for different task scopes
  • Keep execution snippets small and inspect output between steps
  • Use event streaming for better UX in TUI or custom interfaces
  • Reset the REPL when you want a clean state without removing runtime backends