Documentation Index
Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt
Use this file to discover all available pages before exploring further.
Interfaces API Reference
LLM_Interface (ABC)
from SimpleLLMFunc.interface import LLM_Interface
Abstract base class for all model interfaces.
Constructor
LLM_Interface(
api_key_pool: APIKeyPool,
model_name: str,
base_url: Optional[str] = None,
context_window: int = 200_000,
)
Abstract Methods
| Method | Signature | Description |
|---|
chat | async def chat(*, trace_id, stream=False, messages, timeout=None, **kwargs) | Non-streaming call |
chat_stream | async def chat_stream(*, trace_id, stream=True, messages, timeout=None, **kwargs) -> AsyncGenerator | Streaming call |
Properties
| Property | Type | Description |
|---|
model_name | str | Model identifier |
base_url | str | None | API endpoint |
context_window | int | Context window size in tokens |
api_key_pool | APIKeyPool | Key rotation pool |
OpenAICompatible
from SimpleLLMFunc import OpenAICompatible
Adapter for OpenAI Chat Completions API and compatible endpoints.
Constructor
OpenAICompatible(
api_key_pool: APIKeyPool,
model_name: str,
base_url: str,
max_retries: int = 5,
retry_delay: float = 1.0,
rate_limit_capacity: int = 10,
rate_limit_refill_rate: float = 1.0,
context_window: int = 200_000,
)
Class Methods
| Method | Returns | Description |
|---|
load_from_json_file(path) | Dict[str, Dict[str, OpenAICompatible]] | Load all models from provider.json |
Instance Methods
| Method | Returns | Description |
|---|
get_rate_limit_status() | Dict[str, Any] | Current rate limiter state |
reset_rate_limit() | None | Reset the token bucket |
Example
from SimpleLLMFunc import OpenAICompatible, APIKeyPool
# From file
models = OpenAICompatible.load_from_json_file("provider.json")
llm = models["openrouter"]["openai/gpt-4o"]
# Direct
llm = OpenAICompatible(
api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai"),
model_name="gpt-4o",
base_url="https://api.openai.com/v1",
rate_limit_capacity=20,
rate_limit_refill_rate=5.0,
)
OpenAIResponsesCompatible
from SimpleLLMFunc import OpenAIResponsesCompatible
Adapter for OpenAI Responses API. Same constructor and loading pattern as OpenAICompatible.
Differences from OpenAICompatible
| Aspect | OpenAICompatible | OpenAIResponsesCompatible |
|---|
| System prompt | messages[0].role="system" | instructions field |
| Streaming format | Chat Completion chunks | Responses stream events |
| Reasoning support | N/A | reasoning={...} kwargs |
| Wire protocol | Chat Completions API | Responses API |
Example
from SimpleLLMFunc import OpenAIResponsesCompatible, APIKeyPool
llm = OpenAIResponsesCompatible(
api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai"),
model_name="gpt-4o",
base_url="https://api.openai.com/v1",
)
@llm_chat(llm_interface=llm, reasoning_effort="high")
async def reasoning_agent(message: str, history: list | None = None):
"""Reason carefully about the question."""
pass
APIKeyPool
from SimpleLLMFunc import APIKeyPool
Round-robin key rotation for load distribution.
Constructor
APIKeyPool(
api_keys: List[str],
provider_id: str,
)
Example
pool = APIKeyPool(
api_keys=["sk-key-1", "sk-key-2", "sk-key-3"],
provider_id="openrouter-gpt4o",
)
# Keys are rotated on each call automatically
TokenBucket
from SimpleLLMFunc import TokenBucket
Token bucket rate limiter.
Constructor
TokenBucket(
capacity: int,
refill_rate: float,
)
| Parameter | Description |
|---|
capacity | Maximum tokens in the bucket |
refill_rate | Tokens added per second |
Used internally by OpenAICompatible. Exposed for custom implementations.