Documentation Index
Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt
Use this file to discover all available pages before exploring further.
Compile Pipeline
The compile pipeline transforms invocation configuration, a base transcript, and pending runtime patches into the final messages sent to the LLM provider. It’s a two-stage process with a clean boundary between transcript patching and request rendering.The Two Stages
Stage 1: Reduce Turn Context
reduce_turn_context takes the current base transcript plus all pending runtime patches and produces a clean, patch-applied transcript.
-
Apply runtime patches —
apply_mutations(transcript, pending_mutations)processes each mutation in order:AssistantMessageMutation→ appends assistant messageToolResultMutation→ appends tool resultContextReplaceMutation→ replaces entire listContextSummaryMutation→ replaces with summary, stores experiencesExperienceRemember/Forget→ accumulated, committed before next non-experience mutation- etc.
-
Refresh selfref snapshot — If the transcript (after mutations) contains selfref markers (experiences, summaries), re-parse
DataFromSelfReffrom the transcript content. This ensures the snapshot reflects any compaction or experience mutations that just applied. - Clone — The result is a deep clone. No mutation of shared state.
Stage 2: Convert to LLM Request
convert_to_llm_request takes the reduced transcript plus the invocation’s prompt contract and produces the final messages for the provider.
-
Resolve system prompt — Priority order:
- If
selfref_snapshotexists → render base prompt + experiences block - Else if
prompt_contract.system_promptis set → use it directly - Else if transcript has a system message → extract its content
- Else if
prompt_contract.base_instructionexists → use docstring fallback
- If
- Place system message — Either replace the existing system message in the transcript, or prepend a new one. If no system prompt was resolved, remove any existing system message.
-
Render LLM messages —
render_llm_input_messages()finalizes the messages:- Prepends
<tool_best_practices>block if tools are mounted - Appends
<must_principles>block if required (tells model to use native tool calls) - Returns the final message list ready for the provider
- Prepends
Where This Runs in the ReAct Loop
The Single Entry Point
All compilation flows through one function:@llm_function and @llm_chat use this same entry point. There is no separate compilation path for different decorator modes.
Why Two Stages?
The split enables:- Stage 1 alone for internal state management (e.g.,
compile_contextin the ReAct loop updates state without rendering for the LLM) - Stage 2 adds provider-specific rendering (tool injection, system prompt placement) only when actually calling the LLM
- Testing — you can test mutation application separately from prompt rendering
- SelfRef refresh — happens between stages, ensuring the snapshot is current before prompt resolution
Practical Implications
For most users, this is invisible. You write a function, pass history, mount tools, and consume events. But when you need to debug internals:- Debug transcript issues → Check what runtime patches were produced and in what order
- Understand system prompt behavior → Know the priority order in Stage 2
- Build framework extensions → Use the compile boundary instead of mutating live messages directly
Next: SelfRef
How durable context (experiences, compaction, forking) works through the SelfReference system.