The Primitive System

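A Primitive is a runtime built-in capability of SimpleLLMFunc (a builtin tool under the CodeAct paradigm). It is not a @tool exposed directly to the model; instead, the model reaches it indirectly through PyRepl's execute_code: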

LLM -> execute_code -> runtime.<pack>.<primitive>(...)

Role and Purpose

In CodeAct mode, the model uses execute_code to run a piece of Python code. Primitives are the runtime APIs that this code can call directly, for example:

runtime.selfref.context.inspect()
runtime.github_repo.list_open_issues("owner/repo")

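Runtime capability entry point

No import is needed; primitives are accessed directly from the runtime namespace.

CodeAct built-in capability

Hosted by PyRepl and invoked indirectly through execute_code.

Discoverable and understandable

Primitive contracts can be exposed to the model through runtime.get_primitive_spec(...) and runtime.list_primitive_specs(...).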

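Key Concepts

PrimitivePack

A PrimitivePack is a namespace for a group of runtime primitives:

The pack name determines the runtime.<pack_name>.<primitive_name> prefix
A pack can bind a default backend
A pack can carry guidance that describes the mental model and boundaries of the whole pack
The inputs and outputs of each individual primitive are still defined by runtime.get_primitive_spec(...)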

Primitive

A Primitive is a runtime function entry point under a pack, in the form:

runtime.<pack_name>.<primitive_name>(...)

It is essentially an ordinary Python function, but it additionally receives a PrimitiveCallContext.
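To make the dotted naming concrete, here is a minimal sketch of how a runtime-style object could resolve `runtime.<pack_name>.<primitive_name>` from a flat table of registered names. `RuntimeNamespace` and the `handlers` table are hypothetical stand-ins written for this illustration, not SimpleLLMFunc's implementation, and the toy handlers omit the context argument for brevity.

```python
from typing import Any, Callable


class RuntimeNamespace:
    """Hypothetical stand-in for `runtime`: resolves dotted primitive names."""

    def __init__(self, handlers: dict[str, Callable[..., Any]], prefix: str = "") -> None:
        self._handlers = handlers
        self._prefix = prefix

    def __getattr__(self, name: str) -> Any:
        # Only called for attributes that normal lookup did not find.
        full_name = f"{self._prefix}.{name}" if self._prefix else name
        if full_name in self._handlers:
            # Exact match: this is a primitive, so return the callable.
            return self._handlers[full_name]
        if any(key.startswith(full_name + ".") for key in self._handlers):
            # Prefix match: this is a pack or sub-namespace, so descend.
            return RuntimeNamespace(self._handlers, full_name)
        raise AttributeError(f"unknown primitive or pack: {full_name}")


# Flat table keyed by "<pack_name>.<primitive_name>" (illustrative data only).
handlers = {
    "github_repo.list_open_issues": lambda repo: [{"id": "42", "repo": repo}],
    "selfref.context.inspect": lambda: {"messages": 0},
}

runtime = RuntimeNamespace(handlers)
issues = runtime.github_repo.list_open_issues("owner/repo")
snapshot = runtime.selfref.context.inspect()
```

The point of the sketch is only that the pack name acts as a prefix: a name that matches exactly is callable, and a name that is merely a prefix yields another namespace level.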

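Backend

A backend is the Python object that carries the real capability, such as a dict, a service, a client, or an instance of a custom class. It is usually injected into the primitive handler through ctx.backend or ctx.get_backend(...).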

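Development Workflow (Recommended)

Create a pack

Use repl.pack(name, backend=..., guidance="...") to create the namespace and bind a backend.

Use @pack.primitive(...) to register one or more runtime primitives.

Install into PyRepl

Use repl.install_pack(pack) to mount this group of capabilities into the runtime environment.

This path is best suited to scenarios that need a namespace, a shared backend, pack guidance, and control over the fork lifecycle.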

from SimpleLLMFunc.builtin import PyRepl

class GitHubRepoAPI:
    def list_open_issues(self, repo: str) -> list[dict[str, str]]:
        # In production, call GitHub REST/GraphQL here.
        return [
            {"id": "42", "title": "Bug: tool timeout", "repo": repo},
            {"id": "57", "title": "Docs: update primitive guide", "repo": repo},
        ]

    def get_issue(self, repo: str, issue_id: str) -> dict[str, str]:
        return {"id": issue_id, "title": "Example issue", "repo": repo}


repl = PyRepl()

github_repo = repl.pack(
    "github_repo",
    backend=GitHubRepoAPI(),
    guidance="github_repo = repository issue/query primitives backed by GitHubRepoAPI.",
)


@github_repo.primitive(
    "list_open_issues",
    description="List open issues from a GitHub repository.",
)
def list_open_issues(ctx, repo: str) -> list[dict[str, str]]:
    backend = ctx.backend
    if not isinstance(backend, GitHubRepoAPI):
        raise RuntimeError("backend must be a GitHubRepoAPI")
    return backend.list_open_issues(repo)


@github_repo.primitive(
    "get_issue",
    description="Read one issue by id from a GitHub repository.",
)
def get_issue(ctx, repo: str, issue_id: str) -> dict[str, str]:
    backend = ctx.backend
    if not isinstance(backend, GitHubRepoAPI):
        raise RuntimeError("backend must be a GitHubRepoAPI")
    return backend.get_issue(repo, issue_id)


repl.install_pack(github_repo)

# Call inside execute_code:
# runtime.github_repo.list_open_issues("owner/repo")
# runtime.github_repo.get_issue("owner/repo", "42")

For a lightweight extension, you can also use @repl.primitive(...):

repo_backend = GitHubRepoAPI()

repl.register_runtime_backend("github_repo", repo_backend, replace=True)


@repl.primitive("github_repo.list_open_issues", backend="github_repo", replace=True)
def list_open_issues(ctx, repo: str) -> list[dict[str, str]]:
    backend = ctx.get_backend("github_repo")
    if not isinstance(backend, GitHubRepoAPI):
        raise RuntimeError("backend must be a GitHubRepoAPI")
    return backend.list_open_issues(repo)

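If you are doing lower-level host integration, you can also call PrimitiveRegistry.register(...) directly. For most PyRepl use cases, however, the PrimitivePack path is preferred, because it keeps the namespace, backend, guidance, and installation lifecycle in a single abstraction.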

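Where Primitives Are Used

PyRepl.execute_code

Provides the runtime.* namespace inside the runtime environment.

llm_chat(toolkit=repl.toolset)

The model must call execute_code first before it can actually access primitives.

Built-in selfref pack

runtime.selfref.context.* and runtime.selfref.fork.* are built-in examples of the same mechanism.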

Primitive Context Injection

Every primitive handler receives a PrimitiveCallContext. Commonly used fields include:

primitive_name
call_id / execution_id
event_emitter
repl / registry
backend_name
backend

The context injection flow:

worker -> PyRepl._execute_primitive_call -> PrimitiveRegistry.call
-> context.backend_name / context.backend 填充

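Primitive handlers should preferably reach their capabilities through ctx.backend or ctx.get_backend(...), rather than bypassing the context to look up dependencies on their own. This works better with fork, clone, and lifecycle management.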
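The injection flow can be pictured with a small self-contained model. Everything here is an assumption made for illustration: `CallContext` is a trimmed-down stand-in for PrimitiveCallContext, and `MiniRegistry` is a hypothetical registry, not the library's PrimitiveRegistry. The sketch only shows the idea that the registry fills backend_name and backend before the handler runs, so the handler never has to locate its own dependencies.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class CallContext:
    """Hypothetical, trimmed-down stand-in for PrimitiveCallContext."""

    primitive_name: str
    backend_name: Optional[str] = None
    backend: Any = None


class MiniRegistry:
    """Hypothetical registry that fills the context before calling a handler."""

    def __init__(self) -> None:
        self._backends: dict[str, Any] = {}
        self._primitives: dict[str, tuple[Callable[..., Any], Optional[str]]] = {}

    def register_backend(self, name: str, backend: Any) -> None:
        self._backends[name] = backend

    def register(self, name: str, handler: Callable[..., Any],
                 backend: Optional[str] = None) -> None:
        self._primitives[name] = (handler, backend)

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        handler, backend_name = self._primitives[name]
        ctx = CallContext(primitive_name=name)
        if backend_name is not None:
            # Fill backend_name / backend so the handler can rely on ctx alone.
            ctx.backend_name = backend_name
            ctx.backend = self._backends[backend_name]
        return handler(ctx, *args, **kwargs)


def count_issues(ctx: CallContext, repo: str) -> int:
    # The handler reads its dependency from the context, not from a global.
    return len(ctx.backend.get(repo, []))


registry = MiniRegistry()
registry.register_backend("issue_store", {"owner/repo": ["42", "57"]})
registry.register("github_repo.count_issues", count_issues, backend="issue_store")
issue_count = registry.call("github_repo.count_issues", "owner/repo")
```

Because the handler only touches `ctx.backend`, swapping in a cloned or per-fork backend requires no change to the handler itself.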

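Contract and Discoverability

How contracts are exposed to the model

Primitives support structured contracts:

runtime.get_primitive_spec(name): read a single contract
runtime.list_primitive_specs(...): read contracts in bulk

The default format is XML; pass format='dict' to get structured fields instead.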

Where contract information comes from

The handler's docstring: Use, Input, Output, Parse, Parameters, Best Practices
PrimitiveContract
Explicit arguments to @primitive(...)

PrimitiveContract vs PrimitiveSpec

PrimitiveContract: the author-side contract definition. It describes the primitive's inputs and outputs, parameters, parsing approach, and next steps.
PrimitiveSpec: the final spec seen on the runtime/public side. It already carries the registered name, the backend binding, and the normalized contract fields.

You can think of them this way:

PrimitiveContract = the structured declaration you write when defining a primitive
PrimitiveSpec = the final public description that the model and the runtime see when reading a primitive

Fields are resolved in this order:

description / input_type / output_type / output_parsing: explicit arguments > PrimitiveContract > docstring
parameters: explicit arguments > PrimitiveContract > docstring > signature inference
next_steps: explicit arguments > PrimitiveContract > docstring
best_practices: taken from the docstring, and required
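The precedence rules above amount to "take the first source that provides a value". A minimal sketch of that idea follows; `resolve_field` and the sample dictionaries are hypothetical, and the sketch assumes that only `None` counts as "not provided" (how the library treats empty strings or empty lists is not stated here).

```python
from typing import Any, Optional


def resolve_field(*candidates: Optional[Any]) -> Optional[Any]:
    """Return the first candidate that is not None, in priority order."""
    for value in candidates:
        if value is not None:
            return value
    return None


# Illustrative sources for one primitive (hypothetical data).
explicit = {"description": None, "parameters": None}
contract = {"description": "List open issues.", "parameters": None}
docstring = {"description": "Docstring description.", "parameters": None}
inferred_parameters = [{"name": "repo", "type": "str"}]

# description: explicit arguments > PrimitiveContract > docstring
description = resolve_field(
    explicit["description"], contract["description"], docstring["description"]
)

# parameters: explicit arguments > PrimitiveContract > docstring > signature inference
parameters = resolve_field(
    explicit["parameters"],
    contract["parameters"],
    docstring["parameters"],
    inferred_parameters,
)
```

In this example the contract wins for `description` because no explicit argument was given, while `parameters` falls all the way through to signature inference.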

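The safest approach is therefore:

Write out Use / Input / Output / Parse / Parameters / Best Practices in full in the docstring
Use PrimitiveContract or explicit @primitive(...) arguments only when you really need to override fields or generate them programmatically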

Best Practices and Docstrings

A primitive's best practices come only from the docstring and are required. A missing Best Practices section raises an error at registration time.

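Docstring Structure

Docstrings use a concise "section name + colon" format, and section names are case-insensitive:

Use: / Input: / Output: / Parse:
Parameters:
Best Practices:

A leading paragraph without a heading is automatically treated as Use. The recommended full template: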

@pack.primitive("list_open_issues")
def list_open_issues(ctx, repo: str) -> list[dict[str, str]]:
    """
    Use: List open issues from a GitHub repository.
    Input: `repo: str` (format: owner/repo).
    Output: `list[dict]` with keys `id`, `title`, `repo`.
    Parse: Read only `id` + `title` unless you need full details.
    Parameters:
    - repo: Repository in owner/repo form.
    Best Practices:
    - If the list is long, call get_issue for only the top 3 IDs.
    - Avoid dumping full issue bodies into chat.
    """
    ...
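To illustrate the rules described for this format (case-insensitive section names, an untitled leading paragraph treated as Use, and Best Practices being required), here is a rough sketch of a parser. `parse_sections` is a hypothetical helper written for this page; the library's real parser may differ in details such as multi-paragraph handling or the exception type it raises.

```python
import re

SECTION_NAMES = ("use", "input", "output", "parse", "parameters", "best practices")
HEADING = re.compile(
    r"^\s*(" + "|".join(SECTION_NAMES) + r")\s*:\s*(.*)$", re.IGNORECASE
)


def parse_sections(docstring: str) -> dict[str, str]:
    """Split a docstring into named sections (hypothetical sketch)."""
    sections: dict[str, list[str]] = {}
    current = "use"  # text before any heading is treated as Use
    for line in docstring.strip().splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
            if match.group(2):
                # Inline text after the colon belongs to the new section.
                sections[current].append(match.group(2).strip())
        elif line.strip():
            sections.setdefault(current, []).append(line.strip())
    parsed = {name: "\n".join(lines) for name, lines in sections.items()}
    if not parsed.get("best practices"):
        # Mirrors the "required at registration" rule; error type is assumed.
        raise ValueError("docstring is missing a Best Practices section")
    return parsed


sample = """
List open issues.
INPUT: repo
best practices:
- Keep it short.
"""
parsed = parse_sections(sample)
```

Note that a bullet such as `- repo: Repository in owner/repo form.` is not mistaken for a heading, because a heading must start the line with one of the known section names.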

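Best Practices depend on docstring parsing, so state the rules explicitly in the docstring rather than only in comments or external documentation.

These best practices become part of the primitive contract, which can be read with:

runtime.get_primitive_spec("github_repo.list_open_issues")
runtime.list_primitive_specs(contains="github_repo.")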

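How Primitives Reach the Prompt

Primitives themselves are not injected directly into the system prompt. What the system actually injects is the best practices of tools, for example those of execute_code. Primitive rules therefore usually enter the model's context in one of two ways:

The best practices of the execute_code tool prompt the model to query primitive contracts
The model proactively calls runtime.get_primitive_spec(...) or runtime.selfref.guide() at runtime

If you want a rule to be force-injected into the prompt, write it in the tool's best_practices or prompt_injection_builder.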

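Backend Lifecycle and fork/clone

If the backend is stateful, consider implementing RuntimePrimitiveBackend:

clone_for_fork(context=...): how the backend is copied or shared when a fork child is created (shared by default)
on_install(repl): callback on installation
on_close(repl): callback when the REPL closes (useful for releasing resources)

This makes backend behavior more controllable when forking child agents, and ensures state is cleaned up.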
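As a rough sketch of what a stateful backend with these hooks might look like, the class below implements the three methods as plain duck-typed methods. `IssueCacheBackend` is hypothetical, and the sketch deliberately does not import or subclass RuntimePrimitiveBackend, since whether that is a base class or a protocol is not stated here. The hooks are invoked by hand with `None` in place of a real repl and fork context; in practice the runtime would be expected to call them.

```python
from typing import Any, Optional


class IssueCacheBackend:
    """Hypothetical stateful backend; hook names follow the list above."""

    def __init__(self, cache: Optional[dict[str, Any]] = None) -> None:
        self.cache: dict[str, Any] = dict(cache or {})
        self.installed = False
        self.closed = False

    def clone_for_fork(self, context: Any = None) -> "IssueCacheBackend":
        # Copy instead of sharing, so a fork child's writes stay isolated.
        return IssueCacheBackend(cache=self.cache)

    def on_install(self, repl: Any) -> None:
        # A real backend might open connections or warm caches here.
        self.installed = True

    def on_close(self, repl: Any) -> None:
        # Release resources and drop state when the REPL shuts down.
        self.cache.clear()
        self.closed = True


parent = IssueCacheBackend({"42": "Bug: tool timeout"})
parent.on_install(repl=None)

child = parent.clone_for_fork(context=None)
child.cache["57"] = "Docs: update primitive guide"  # does not leak to parent

parent.on_close(repl=None)
```

Copying on fork is a design choice made for this example; returning `self` from `clone_for_fork` would instead give the shared behavior that the list above describes as the default.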

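Primitive vs Tool

Tool: a function call exposed to the model through @tool (OpenAI tool calling)
Primitive: a CodeAct runtime built-in API, invoked indirectly through execute_code

You can think of a Primitive as a "builtin tool" embedded in the runtime.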
