Skip to main content

Documentation Index

Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt

Use this file to discover all available pages before exploring further.

SelfRef

SelfRef is SimpleLLMFunc’s system for durable, self-modifying context — letting agents remember across turns, compress their history, and delegate work to child agents.

Two Components, Sharp Separation

SelfReference (durable backend)

The stateful storage layer. Lives across invocations. Holds:
  • History per memory key — the full conversation transcript
  • Experiences — durable remembered facts/lessons
  • Summaries — compaction checkpoints
  • Fork state — child agent handles and results
Since 0.8.1, SelfReference is implemented as a small public facade over focused internal components:
ModuleResponsibility
state.pyPublic facade, lifecycle, compatibility exports
store.pyDurable history/source storage
active_turn.pyActive memory key, fork context, runtime toolkit, template params, active ReAct state
mutations.pyPending compaction/context/destructive mutation queues
memory_api.pyself_reference.memory[...] proxy and handle API
context_memory.pyContext snapshots, experience CRUD, compaction commit, direct memory editing
agent_binding.pyBound recursive agent callable state
fork_manager.pyFork/spawn/gather lifecycle and result materialization
fork_utils.pyFork helper functions and compatibility constants
This split does not change the public SelfReference API; it makes the durable backend easier to maintain and reason about.
from SimpleLLMFunc.builtin import SelfReference

selfref = SelfReference()
selfref.bind_history("agent_main", initial_history)

SelfRefSession (invocation-scoped plugin)

The per-call lifecycle adapter. Created fresh for each @llm_chat invocation. Implements ReAct hooks:
  • collect_context_mutations() — provides selfref-originated internal transcript patches before each compile
  • finalize() — persists final state back to the SelfReference backend after the turn ends
The session bridges the gap between the stateless ReAct loop and the stateful backend.

How They Connect

@llm_chat(self_reference_key="agent_main")


        ┌─ SelfRefSession (per invocation) ─────────────┐
        │                                                │
        │  Before each compile:                          │
        │    → collect_context_mutations()               │
        │    → may emit Experience/Summary patches       │
        │                                                │
        │  After turn ends:                              │
        │    → finalize()                                │
        │    → persist updated history to backend        │
        │                                                │
        └────────────────────────────────────────────────┘


        ┌─ SelfReference (durable backend) ─────────────┐
        │                                                │
        │  Memory["agent_main"]:                         │
        │    - history: [...]                            │
        │    - experiences: [{id, text}, ...]            │
        │    - summary: {...}                            │
        │                                                │
        └────────────────────────────────────────────────┘

DataFromSelfRef: The Snapshot

When SelfRef is active, the compile pipeline receives a DataFromSelfRef snapshot:
@dataclass(frozen=True)
class DataFromSelfRef:
    base_system_prompt: str          # System prompt (may include experience markers)
    experiences: List[Dict[str, str]]  # [{id: "...", text: "..."}, ...]
    summary: Optional[Dict[str, Any]]  # Compaction metadata
    summary_message: Optional[Dict[str, Any]]  # Summary as a message
    working_messages: NormalizedMessageList  # Post-compaction working transcript
This snapshot determines:
  • The system prompt (base + rendered experiences)
  • What messages the LLM sees (working_messages after compaction)
  • What experiences are active

The 6 Runtime Primitives

When a @llm_chat agent has SelfRef enabled and uses PyRepl, these primitives are available inside execute_code:

Context Primitives

PrimitiveWhat It Does
runtime.selfref.context.inspect()Returns read-only snapshot: active key, experiences, summary, messages, has_pending_compaction
runtime.selfref.context.remember(text)Stores a durable experience through an internal experience patch
runtime.selfref.context.forget(experience_id)Removes an experience through an internal experience patch
runtime.selfref.context.compact(goal, instruction, discoveries, completed, current_status, likely_next_work, relevant_files_directories, remember=[])Queues context compaction → later applies an internal summary patch

Fork Primitives

PrimitiveWhat It Does
runtime.selfref.fork.spawn(task, instruction, ...)Spawns a child agent with pre-fork context snapshot. Returns {fork_id, status}
runtime.selfref.fork.gather_all(include_history=False)Waits for all spawned children. Returns dict[fork_id → ForkResult]

Experience Lifecycle

Experiences are durable facts stored in the system prompt:
System Prompt:
  You are a coding agent...

  <experiences>
    <exp id="exp_001">User prefers dark mode terminal output</exp>
    <exp id="exp_002">Always run tests before committing</exp>
  </experiences>
  • remember("...") → records an experience through the runtime patch boundary
  • forget("exp_001") → removes it
  • Experiences survive compaction — they’re stored in the system prompt, not the working transcript
  • They’re rendered by render_system_prompt_with_experiences() during compile Stage 2

Compaction Lifecycle

When context grows too large, the agent can compact:
# Inside execute_code:
payload = runtime.selfref.context.compact(
    goal="Build the authentication module",
    instruction="Continue implementing OAuth flow",
    discoveries=["API requires PKCE", "Token refresh interval is 1h"],
    completed=["Set up project structure", "Added OAuth dependency"],
    current_status="Implementing token exchange endpoint",
    likely_next_work="Add refresh token logic, then write tests",
    relevant_files_directories=["src/auth/", "tests/test_auth.py"],
    remember=["OAuth endpoint requires PKCE challenge"]
)
What happens:
  1. Compaction is queued (not applied immediately)
  2. After the current tool batch completes, the runtime applies a summary patch
  3. The system prompt is preserved
  4. Working transcript is replaced with the summary message
  5. Items in remember become durable experiences
  6. SelfReference backend stores the new state

Fork Lifecycle

Forks let an agent delegate work to child agents that inherit context:
# Spawn (inside execute_code)
handle = runtime.selfref.fork.spawn(
    task="Review the auth module for security issues",
    instruction="Check for: SQL injection, XSS, auth bypass. Return findings as a list.",
)
print(handle["fork_id"])  # "fork_abc123"

# ... do other work ...

# Gather (when you need results)
results = runtime.selfref.fork.gather_all()
for fork_id, result in results.items():
    print(f"{fork_id}: {result['status']} - {result['response']}")
Key rules:
  • Children inherit the pre-fork context snapshot (not the parent’s in-flight state)
  • Children cannot modify the parent’s context
  • gather_all() blocks until all children complete
  • Results contain status, response, and optionally history

Activation

SelfRef is activated on @llm_chat via self_reference_key:
@llm_chat(
    llm_interface=llm,
    toolkit=[*repl.toolset, *file_tools],
    self_reference_key="agent_main",
)
async def agent(message: str, history: list):
    """Your agent prompt here."""
    pass
The framework automatically:
  1. Creates/retrieves the SelfReference backend for this key
  2. Wraps each invocation in a SelfRefSession
  3. Injects selfref primitives into the PyRepl runtime
  4. Handles finalization (persist updated history) after each turn

Building Guide: llm_chat

How to use @llm_chat with SelfRef in practice.

Advanced: SelfRef Engineering

Advanced patterns: multi-key memory, compaction strategies, fork orchestration.