Skip to main content

Documentation Index

Fetch the complete documentation index at: https://simplellmfunc.cn/llms.txt

Use this file to discover all available pages before exploring further.

Custom Interface

SimpleLLMFunc ships with OpenAICompatible and OpenAIResponsesCompatible. For providers that don’t follow either protocol, implement LLM_Interface directly.

The Abstract Base

from SimpleLLMFunc.interface import LLM_Interface, APIKeyPool

class LLM_Interface(ABC):
    def __init__(self, api_key_pool, model_name, base_url=None, context_window=200_000):
        self.api_key_pool = api_key_pool
        self.model_name = model_name
        self.base_url = base_url
        self.context_window = context_window

    @abstractmethod
    async def chat(self, *, trace_id, stream=False, messages, timeout=None, **kwargs):
        """Non-streaming call. Returns ChatCompletion-like object."""
        ...

    @abstractmethod
    async def chat_stream(self, *, trace_id, stream=True, messages, timeout=None, **kwargs):
        """Streaming call. Returns AsyncGenerator of ChatCompletionChunk-like objects."""
        ...

Minimal Implementation

from SimpleLLMFunc.interface import LLM_Interface, APIKeyPool


class MyProviderInterface(LLM_Interface):
    def __init__(self, api_key_pool: APIKeyPool, model_name: str, base_url: str):
        super().__init__(api_key_pool, model_name, base_url)
        # Initialize your client here

    async def chat(self, *, trace_id, stream=False, messages, timeout=None, **kwargs):
        api_key = self.api_key_pool.get_key()
        # Make your API call
        response = await my_client.complete(
            model=self.model_name,
            messages=messages,
            api_key=api_key,
        )
        # Return in ChatCompletion-compatible shape
        return self._to_chat_completion(response)

    async def chat_stream(self, *, trace_id, stream=True, messages, timeout=None, **kwargs):
        api_key = self.api_key_pool.get_key()
        async for chunk in my_client.stream(
            model=self.model_name,
            messages=messages,
            api_key=api_key,
        ):
            yield self._to_chunk(chunk)

Response Shape Requirements

The framework expects responses compatible with OpenAI’s types:

Non-streaming (chat)

Must return an object with:
  • .choices[0].message.content — text response
  • .choices[0].message.tool_calls — list of tool calls (optional)
  • .usage.prompt_tokens, .usage.completion_tokens — token counts

Streaming (chat_stream)

Must yield objects with:
  • .choices[0].delta.content — text delta (may be None)
  • .choices[0].delta.tool_calls — tool call deltas (optional)
  • .choices[0].finish_reason — “stop”, “tool_calls”, etc. (on final chunk)

Tool Call Format

If your provider supports tool calling, tool calls must be in this shape:
{
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "tool_name",
        "arguments": '{"param": "value"}'  # JSON string
    }
}

Using Your Interface

llm = MyProviderInterface(
    api_key_pool=APIKeyPool(api_keys=["key1"], provider_id="my-provider"),
    model_name="my-model",
    base_url="https://api.myprovider.com/v1",
)

@llm_function(llm_interface=llm)
async def my_function(text: str) -> str:
    """Works with any LLM_Interface implementation."""
    pass

OpenAIResponsesCompatible

For providers implementing OpenAI’s Responses API (not the standard Chat Completions API):
from SimpleLLMFunc import OpenAIResponsesCompatible, APIKeyPool

llm = OpenAIResponsesCompatible(
    api_key_pool=APIKeyPool(api_keys=["sk-..."], provider_id="openai-responses"),
    model_name="gpt-4o",
    base_url="https://api.openai.com/v1",
)
The Responses adapter:
  • Maps system prompts to Responses instructions
  • Handles Responses-specific streaming format
  • Supports reasoning={...} kwargs for reasoning effort control
  • Keeps all wire-format differences in the adapter — your decorator code stays the same
API Reference: Interfaces