Harness Patterns

A “harness” is the code around your @llm_chat agent that manages state, context planning, and orchestration. The framework gives you the ReAct loop; you build the harness that makes it production-ready.

Core Philosophy

An agent is not a person. It is a method for constructing the right context for each reasoning step.
The harness engineer’s job:
  1. Ensure the model sees the shortest complete context at every step
  2. Persist state so sessions can be resumed deterministically
  3. Encode lessons into the environment, not into operator memory

Pattern 1: TUI General Agent

The canonical pattern from examples/tui_general_agent_example.py:
repl = PyRepl(working_directory=workspace)
file_tools = FileToolset(workspace).toolset

@llm_chat(
    llm_interface=llm,
    toolkit=[*repl.toolset, *file_tools],
    stream=True,
    self_reference_key=MEMORY_KEY,
    temperature=1.0,
)
async def core_agent(message: str, history: HistoryList):
    """
    {system_prompt_here}
    {environment_block}
    """
    pass


@tui(custom_event_hook=[debug_hook])
async def agent(message: str, history=None, _abort_signal=None):
    prepared_message = inject_compaction_if_needed(message)
    prepared_history = prepare_history(history)
    template_params = build_template_params()

    async for output in core_agent(
        message=prepared_message,
        history=prepared_history,
        _template_params=template_params,
        _abort_signal=_abort_signal,
    ):
        yield output
The harness layer (agent) wraps the core agent to:
  • Inject compaction instructions when context is large
  • Build dynamic template parameters (environment detection, workspace info)
  • Prepare history format
  • Route abort signals
  • Connect to the TUI
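
The wrapping relationship can be illustrated without the framework. The following is a minimal sketch under stated assumptions: `stub_core_agent`, `harness`, and `run_once` are hypothetical names, the stub stands in for the decorated core agent, and an `asyncio.Event` stands in for whatever abort signal type the framework actually passes.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def stub_core_agent(
    message: str, history: list, abort: asyncio.Event
) -> AsyncIterator[str]:
    """Hypothetical stand-in for the decorated core agent; yields text chunks."""
    for chunk in ("echo:", message):
        if abort.is_set():  # stop producing output once the caller aborts
            return
        yield chunk
        await asyncio.sleep(0)  # give other tasks a chance to run


async def harness(
    message: str, history: Optional[list], abort: asyncio.Event
) -> AsyncIterator[str]:
    """Prepare inputs, then forward every output of the inner agent unchanged."""
    prepared_history = list(history or [])  # normalize None to an empty list
    prepared_message = message.strip()      # placeholder for real message preparation
    async for output in stub_core_agent(prepared_message, prepared_history, abort):
        yield output


async def run_once(message: str) -> List[str]:
    """Drive the harness once and collect everything it yields."""
    abort = asyncio.Event()
    return [chunk async for chunk in harness(message, None, abort)]


if __name__ == "__main__":
    print(asyncio.run(run_once("  hello  ")))  # ['echo:', 'hello']
```

The harness owns input preparation and signal routing, while the inner agent only sees already-prepared arguments. That separation is what lets the preparation steps change without touching the agent definition.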

Pattern 2: Context Window Management

COMPACTION_THRESHOLD = 0.2  # Request compaction once token usage exceeds 20% of the context window

def _should_request_compaction() -> bool:
    total_tokens = llm.input_token_count + llm.output_token_count
    context_window = llm.context_window
    return total_tokens > context_window * COMPACTION_THRESHOLD


def prepare_user_message(message: str) -> str:
    if not _should_request_compaction():
        return message

    compaction_instruction = (
        "After finishing this task, call runtime.selfref.context.compact(...) "
        "to checkpoint your context."
    )
    return f"{message}\n\n{compaction_instruction}"
With this pattern the agent is prompted to checkpoint its own context once usage crosses the threshold, so no operator has to intervene.
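
To make the threshold arithmetic concrete, here is a small framework-free sketch. The function name `should_request_compaction` and the 128,000-token window are assumptions for illustration only; the real counters come from the `llm` interface shown above.

```python
def should_request_compaction(
    input_tokens: int,
    output_tokens: int,
    context_window: int,
    threshold: float = 0.2,
) -> bool:
    """Return True once total usage exceeds threshold * context_window."""
    return (input_tokens + output_tokens) > context_window * threshold


if __name__ == "__main__":
    window = 128_000  # assumed context window; 20% of it is 25,600 tokens
    print(should_request_compaction(20_000, 5_000, window))  # False: 25,000 tokens used
    print(should_request_compaction(22_000, 4_000, window))  # True: 26,000 tokens used
```

A low threshold such as 0.2 asks for compaction early, which trades some extra checkpointing work for a consistently short context.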

Pattern 3: Dynamic Environment Block

Inject runtime context via template parameters:
import sys
from pathlib import Path


def build_environment_block(workspace: Path) -> str:
    # run_command is assumed to be a small subprocess wrapper that returns stdout as text.
    git_branch = run_command(["git", "branch", "--show-current"], workspace)
    git_status = run_command(["git", "status", "--porcelain"], workspace)

    return f"""
# Environment
- Workspace: {workspace}
- Git branch: {git_branch}
- Modified files: {len(git_status.splitlines())}
- Platform: {sys.platform}
- Python: {sys.version_info.major}.{sys.version_info.minor}
"""


TEMPLATE_PARAMS = {"environment_block": build_environment_block(workspace)}
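
The snippet above calls `run_command` without defining it. One possible shape for that helper is sketched below, using only the standard library. The signature, the timeout, and the choice to return an empty string on failure are assumptions, not the example file's actual implementation.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def run_command(args: List[str], cwd: Path, timeout: float = 5.0) -> str:
    """Run a command in cwd and return stripped stdout, or "" on any failure."""
    try:
        completed = subprocess.run(
            args,
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # a non-zero exit raises CalledProcessError
        )
    except (OSError, subprocess.SubprocessError):
        # Missing executable, non-zero exit, or timeout: degrade to an empty value
        # so that building the environment block never crashes the agent.
        return ""
    return completed.stdout.strip()


if __name__ == "__main__":
    print(run_command([sys.executable, "-c", "print('ok')"], Path(".")))  # ok
```

Returning an empty string means a workspace that is not a git repository reports an empty branch and zero modified files instead of raising.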

Pattern 4: Supervisor Agent

An outer agent that delegates to specialized inner agents:
@tool
async def delegate_to_coder(task: str, context: str) -> str:
    """Delegate a coding task to the implementation specialist."""
    result = []
    # The specialist starts from an empty history, so its context stays isolated.
    async for output in coder_agent(f"{task}\n\nContext: {context}", []):
        if is_response_yield(output):
            result.append(output.response)
    return "\n".join(result)


# delegate_to_reviewer follows the same shape, calling the reviewer agent instead.
# Both tools must be defined before supervisor, because the decorator arguments
# are evaluated when supervisor is defined.
@llm_chat(llm_interface=llm, toolkit=[delegate_to_coder, delegate_to_reviewer])
async def supervisor(task: str, history: list | None = None):
    """
    Route tasks to the appropriate specialist.
    Use delegate_to_coder for implementation work.
    Use delegate_to_reviewer for code review.
    Synthesize results before responding.
    """
    pass
The supervisor pattern enables:
  • Different models for different tasks (cheap model routes, expensive model implements)
  • Isolated context per specialist (each starts fresh)
  • Clear delegation boundaries
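
The "isolated context" property can be checked with a framework-free sketch. Everything here is hypothetical: `coder_stub` stands in for a specialist agent, and `delegate` mirrors the shape of the delegation tool above by passing a fresh history on every call.

```python
import asyncio
from typing import AsyncIterator, Callable, List

# A specialist takes a prompt and a history list, and yields text chunks.
Specialist = Callable[[str, list], AsyncIterator[str]]


async def coder_stub(prompt: str, history: list) -> AsyncIterator[str]:
    """Hypothetical specialist that reports how much prior history it was given."""
    seen_before = len(history)
    history.append(prompt)  # mutate only the list handed in for this call
    yield f"coder saw {seen_before} earlier messages"


async def delegate(specialist: Specialist, task: str, context: str) -> str:
    """Run one specialist with a fresh, empty history and join its output."""
    parts: List[str] = []
    async for chunk in specialist(f"{task}\n\nContext: {context}", []):
        parts.append(chunk)
    return "\n".join(parts)


if __name__ == "__main__":
    print(asyncio.run(delegate(coder_stub, "add tests", "repo uses pytest")))
    print(asyncio.run(delegate(coder_stub, "fix bug", "see issue tracker")))
```

Both calls report zero earlier messages, because nothing from the first delegation leaks into the second. Any context a specialist needs has to arrive through the `context` argument.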

Pattern 5: Session Persistence

Persist history externally for resume:
import json
from pathlib import Path

SESSION_DIR = Path("~/.myagent/sessions").expanduser()


def save_session(session_id: str, history: list, metadata: dict):
    SESSION_DIR.mkdir(parents=True, exist_ok=True)
    path = SESSION_DIR / f"{session_id}.json"
    # Write UTF-8 explicitly: ensure_ascii=False keeps non-ASCII text as-is,
    # which the platform default encoding may not be able to represent.
    path.write_text(json.dumps({
        "history": history,
        "metadata": metadata,
    }, ensure_ascii=False), encoding="utf-8")


def load_session(session_id: str) -> tuple[list, dict] | None:
    path = SESSION_DIR / f"{session_id}.json"
    if not path.exists():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    return data["history"], data["metadata"]
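
If a session may be saved while the process can be interrupted, a partially written file would break deterministic resume. One way to guard against that, sketched here as an optional variant rather than part of the original example, is to write to a temporary file and rename it into place. The name `save_session_atomic` and the explicit `session_dir` parameter are assumptions for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def save_session_atomic(
    session_dir: Path, session_id: str, history: list, metadata: dict
) -> Path:
    """Write the session to a temp file, then atomically rename it into place."""
    session_dir.mkdir(parents=True, exist_ok=True)
    target = session_dir / f"{session_id}.json"
    payload = json.dumps({"history": history, "metadata": metadata}, ensure_ascii=False)

    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=session_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(payload)
        os.replace(tmp_name, target)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # do not leave stray temp files behind
        raise
    return target
```

A reader of the session directory then sees either the previous complete file or the new complete file, never a half-written one.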

Pattern 6: Custom Tool Runtime Override

Override which tools the agent sees based on context:
from SimpleLLMFunc.runtime.selfref import SELF_REFERENCE_TOOLKIT_OVERRIDE_TEMPLATE_PARAM


def build_runtime_toolkit(workspace: Path) -> list:
    """Build toolkit dynamically based on workspace state."""
    tools = [*repl.toolset]

    if (workspace / "package.json").exists():
        tools.extend(node_tools)
    elif (workspace / "pyproject.toml").exists():
        tools.extend(python_tools)

    tools.extend(FileToolset(workspace).toolset)
    return tools


template_params = {
    SELF_REFERENCE_TOOLKIT_OVERRIDE_TEMPLATE_PARAM: build_runtime_toolkit(workspace),
}

The Key Insight

The harness is where context engineering happens:
  • What the model sees = what you put in the template params + history + tool results
  • When the model forgets = when you fail to compact or persist
  • Why the model fails = usually missing context, not missing capability
Build harnesses that make the right context inevitable, not optional.