Skip to main content

Documentation Index

Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt

Use this file to discover all available pages before exploring further.

Interfaces API Reference

LLM_Interface (ABC)

from SimpleLLMFunc.interface import LLM_Interface
Abstract base class for all model interfaces.

Constructor

LLM_Interface(
    api_key_pool: APIKeyPool,
    model_name: str,
    base_url: Optional[str] = None,
    context_window: int = 200_000,
)

Abstract Methods

MethodSignatureDescription
chatasync def chat(*, trace_id, stream=False, messages, timeout=None, **kwargs)Non-streaming call
chat_streamasync def chat_stream(*, trace_id, stream=True, messages, timeout=None, **kwargs) -> AsyncGeneratorStreaming call

Properties

PropertyTypeDescription
model_namestrModel identifier
base_urlstr | NoneAPI endpoint
context_windowintContext window size in tokens
api_key_poolAPIKeyPoolKey rotation pool

OpenAICompatible

from SimpleLLMFunc import OpenAICompatible
Adapter for OpenAI Chat Completions API and compatible endpoints.

Constructor

OpenAICompatible(
    api_key_pool: APIKeyPool,
    model_name: str,
    base_url: str,
    max_retries: int = 5,
    retry_delay: float = 1.0,
    rate_limit_capacity: int = 10,
    rate_limit_refill_rate: float = 1.0,
    context_window: int = 200_000,
)

Class Methods

MethodReturnsDescription
load_from_json_file(path)Dict[str, Dict[str, OpenAICompatible]]Load all models from provider.json

Instance Methods

MethodReturnsDescription
get_rate_limit_status()Dict[str, Any]Current rate limiter state
reset_rate_limit()NoneReset the token bucket

Example

from SimpleLLMFunc import OpenAICompatible, APIKeyPool

# From file
models = OpenAICompatible.load_from_json_file("provider.json")
llm = models["openrouter"]["openai/gpt-4o"]

# Direct
llm = OpenAICompatible(
    api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai"),
    model_name="gpt-4o",
    base_url="https://api.openai.com/v1",
    rate_limit_capacity=20,
    rate_limit_refill_rate=5.0,
)

OpenAIResponsesCompatible

from SimpleLLMFunc import OpenAIResponsesCompatible
Adapter for OpenAI Responses API. Same constructor and loading pattern as OpenAICompatible.

Differences from OpenAICompatible

AspectOpenAICompatibleOpenAIResponsesCompatible
System promptmessages[0].role="system"instructions field
Streaming formatChat Completion chunksResponses stream events
Reasoning supportN/Areasoning={...} kwargs
Wire protocolChat Completions APIResponses API

Example

from SimpleLLMFunc import OpenAIResponsesCompatible, APIKeyPool

llm = OpenAIResponsesCompatible(
    api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai"),
    model_name="gpt-4o",
    base_url="https://api.openai.com/v1",
)

@llm_chat(llm_interface=llm, reasoning_effort="high")
async def reasoning_agent(message: str, history: list | None = None):
    """Reason carefully about the question."""
    pass

APIKeyPool

from SimpleLLMFunc import APIKeyPool
Round-robin key rotation for load distribution.

Constructor

APIKeyPool(
    api_keys: List[str],
    provider_id: str,
)

Example

pool = APIKeyPool(
    api_keys=["sk-key-1", "sk-key-2", "sk-key-3"],
    provider_id="openrouter-gpt4o",
)
# Keys are rotated on each call automatically

TokenBucket

from SimpleLLMFunc import TokenBucket
Token bucket rate limiter.

Constructor

TokenBucket(
    capacity: int,
    refill_rate: float,
)
ParameterDescription
capacityMaximum tokens in the bucket
refill_rateTokens added per second
Used internally by OpenAICompatible. Exposed for custom implementations.