跳转到主要内容

Documentation Index

Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt

Use this file to discover all available pages before exploring further.

编排模式

“Harness”是围绕 @llm_chat Agent 的代码,负责管理状态、上下文规划和编排。框架为你提供 ReAct 循环;你来构建让它达到生产级别的 Harness。

核心理念

Agent 不是人。它是一种为每个推理步骤构建正确上下文的方法。
Harness 工程师的职责:
  1. 确保模型在每一步都看到最短且完整的上下文
  2. 持久化状态,使会话可以确定性地恢复
  3. 将经验编码到环境中,而不是操作者的记忆中

模式 1:TUI 通用 Agent

来自 examples/tui_general_agent_example.py 的标准模式:
repl = PyRepl(working_directory=workspace)
file_tools = FileToolset(workspace).toolset

@llm_chat(
    llm_interface=llm,
    toolkit=[*repl.toolset, *file_tools],
    stream=True,
    self_reference_key=MEMORY_KEY,
    temperature=1.0,
)
async def core_agent(message: str, history: HistoryList):
    """
    {system_prompt_here}
    {environment_block}
    """
    pass


@tui(custom_event_hook=[debug_hook])
async def agent(message: str, history=None, _abort_signal=None):
    prepared_message = inject_compaction_if_needed(message)
    prepared_history = prepare_history(history)
    template_params = build_template_params()

    async for output in core_agent(
        message=prepared_message,
        history=prepared_history,
        _template_params=template_params,
        _abort_signal=_abort_signal,
    ):
        yield output
Harness 层(agent)包装核心 Agent 以实现:
  • 当上下文过大时注入压缩指令
  • 构建动态模板参数(环境检测、工作区信息)
  • 准备历史格式
  • 路由中止信号
  • 连接到 TUI

模式 2:上下文窗口管理

COMPACTION_THRESHOLD = 0.2  # 当使用量达到上下文窗口的 20% 时进行压缩

def _should_request_compaction() -> bool:
    total_tokens = llm.input_token_count + llm.output_token_count
    context_window = llm.context_window
    return total_tokens > context_window * COMPACTION_THRESHOLD


def prepare_user_message(message: str) -> str:
    if not _should_request_compaction():
        return message

    compaction_instruction = (
        "After finishing this task, call runtime.selfref.context.compact(...) "
        "to checkpoint your context."
    )
    return f"{message}\n\n{compaction_instruction}"
该模式无需手动干预即可保持上下文精简。

模式 3:动态环境块

通过模板参数注入运行时上下文:
def build_environment_block(workspace: Path) -> str:
    git_branch = run_command(["git", "branch", "--show-current"], workspace)
    git_status = run_command(["git", "status", "--porcelain"], workspace)

    return f"""
# Environment
- Workspace: {workspace}
- Git branch: {git_branch}
- Modified files: {len(git_status.splitlines())}
- Platform: {sys.platform}
- Python: {sys.version_info.major}.{sys.version_info.minor}
"""


TEMPLATE_PARAMS = {"environment_block": build_environment_block(workspace)}

模式 4:Supervisor Agent

由外层 Agent 将任务分派给专门的内层 Agent:
@llm_chat(llm_interface=llm, toolkit=[delegate_to_coder, delegate_to_reviewer])
async def supervisor(task: str, history: list | None = None):
    """
    Route tasks to the appropriate specialist.
    Use delegate_to_coder for implementation work.
    Use delegate_to_reviewer for code review.
    Synthesize results before responding.
    """
    pass


@tool
async def delegate_to_coder(task: str, context: str) -> str:
    """Delegate a coding task to the implementation specialist."""
    result = []
    async for output in coder_agent(f"{task}\n\nContext: {context}", []):
        if is_response_yield(output):
            result.append(output.response)
    return "\n".join(result)
Supervisor 模式的优势:
  • 不同任务使用不同模型(廉价模型负责路由,高性能模型负责实现)
  • 每个专家有独立的上下文(各自从零开始)
  • 清晰的委托边界

模式 5:会话持久化

将历史记录持久化到外部存储以便恢复:
import json
from pathlib import Path

SESSION_DIR = Path("~/.myagent/sessions").expanduser()


def save_session(session_id: str, history: list, metadata: dict):
    SESSION_DIR.mkdir(parents=True, exist_ok=True)
    path = SESSION_DIR / f"{session_id}.json"
    path.write_text(json.dumps({
        "history": history,
        "metadata": metadata,
    }, ensure_ascii=False))


def load_session(session_id: str) -> tuple[list, dict] | None:
    path = SESSION_DIR / f"{session_id}.json"
    if not path.exists():
        return None
    data = json.loads(path.read_text())
    return data["history"], data["metadata"]

模式 6:自定义工具运行时覆盖

根据上下文动态调整 Agent 可见的工具集:
from SimpleLLMFunc.runtime.selfref import SELF_REFERENCE_TOOLKIT_OVERRIDE_TEMPLATE_PARAM


def build_runtime_toolkit(workspace: Path) -> list:
    """Build toolkit dynamically based on workspace state."""
    tools = [*repl.toolset]

    if (workspace / "package.json").exists():
        tools.extend(node_tools)
    elif (workspace / "pyproject.toml").exists():
        tools.extend(python_tools)

    tools.extend(FileToolset(workspace).toolset)
    return tools


template_params = {
    SELF_REFERENCE_TOOLKIT_OVERRIDE_TEMPLATE_PARAM: build_runtime_toolkit(workspace),
}

核心洞察

Harness 是上下文工程发生的地方:
  • 模型看到的 = 你放入模板参数、历史记录和工具结果中的内容
  • 模型遗忘的时刻 = 你未能进行压缩或持久化的时刻
  • 模型失败的原因 = 通常是缺少上下文,而非缺少能力
构建让正确上下文成为必然而非可选的 Harness。