Documentation Index
Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt
Use this file to discover all available pages before exploring further.
Custom Interface
SimpleLLMFunc ships with OpenAICompatible and OpenAIResponsesCompatible. For providers that don’t follow either protocol, implement LLM_Interface directly.
The Abstract Base
from SimpleLLMFunc.interface import LLM_Interface, APIKeyPool
class LLM_Interface(ABC):
def __init__(self, api_key_pool, model_name, base_url=None, context_window=200_000):
self.api_key_pool = api_key_pool
self.model_name = model_name
self.base_url = base_url
self.context_window = context_window
@abstractmethod
async def chat(self, *, trace_id, stream=False, messages, timeout=None, **kwargs):
"""Non-streaming call. Returns ChatCompletion-like object."""
...
@abstractmethod
async def chat_stream(self, *, trace_id, stream=True, messages, timeout=None, **kwargs):
"""Streaming call. Returns AsyncGenerator of ChatCompletionChunk-like objects."""
...
Minimal Implementation
from SimpleLLMFunc.interface import LLM_Interface, APIKeyPool
class MyProviderInterface(LLM_Interface):
def __init__(self, api_key_pool: APIKeyPool, model_name: str, base_url: str):
super().__init__(api_key_pool, model_name, base_url)
# Initialize your client here
async def chat(self, *, trace_id, stream=False, messages, timeout=None, **kwargs):
api_key = self.api_key_pool.get_key()
# Make your API call
response = await my_client.complete(
model=self.model_name,
messages=messages,
api_key=api_key,
)
# Return in ChatCompletion-compatible shape
return self._to_chat_completion(response)
async def chat_stream(self, *, trace_id, stream=True, messages, timeout=None, **kwargs):
api_key = self.api_key_pool.get_key()
async for chunk in my_client.stream(
model=self.model_name,
messages=messages,
api_key=api_key,
):
yield self._to_chunk(chunk)
Response Shape Requirements
The framework expects responses compatible with OpenAI’s types:
Non-streaming (chat)
Must return an object with:
.choices[0].message.content — text response
.choices[0].message.tool_calls — list of tool calls (optional)
.usage.prompt_tokens, .usage.completion_tokens — token counts
Streaming (chat_stream)
Must yield objects with:
.choices[0].delta.content — text delta (may be None)
.choices[0].delta.tool_calls — tool call deltas (optional)
.choices[0].finish_reason — “stop”, “tool_calls”, etc. (on final chunk)
If your provider supports tool calling, tool calls must be in this shape:
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "tool_name",
"arguments": '{"param": "value"}' # JSON string
}
}
Using Your Interface
llm = MyProviderInterface(
api_key_pool=APIKeyPool(api_keys=["key1"], provider_id="my-provider"),
model_name="my-model",
base_url="https://api.myprovider.com/v1",
)
@llm_function(llm_interface=llm)
async def my_function(text: str) -> str:
"""Works with any LLM_Interface implementation."""
pass
OpenAIResponsesCompatible
For providers implementing OpenAI’s Responses API (not the standard Chat Completions API):
from SimpleLLMFunc import OpenAIResponsesCompatible, APIKeyPool
llm = OpenAIResponsesCompatible(
api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai-responses"),
model_name="gpt-4o",
base_url="https://api.openai.com/v1",
)
The Responses adapter:
- Maps system prompts to Responses
instructions
- Handles Responses-specific streaming format
- Supports
reasoning={...} kwargs for reasoning effort control
- Keeps all wire-format differences in the adapter — your decorator code stays the same
→ API Reference: Interfaces