Configuration

provider.json

provider.json is the primary model configuration file. It defines the available providers and the models each one exposes:
{
  "openrouter": [
    {
      "model_name": "openai/gpt-4o",
      "api_keys": ["sk-key-1", "sk-key-2"],
      "base_url": "https://openrouter.ai/api/v1",
      "api_params": {"reasoning_effort": "high"},
      "max_retries": 5,
      "retry_delay": 1.0,
      "rate_limit_capacity": 20,
      "rate_limit_refill_rate": 3.0
    },
    {
      "model_name": "anthropic/claude-3.5-sonnet",
      "api_keys": ["sk-key-1"],
      "base_url": "https://openrouter.ai/api/v1",
      "max_retries": 3,
      "retry_delay": 2.0,
      "rate_limit_capacity": 10,
      "rate_limit_refill_rate": 2.0
    }
  ],
  "local": [
    {
      "model_name": "llama-3.1-70b",
      "api_keys": ["not-needed"],
      "base_url": "http://localhost:8000/v1",
      "max_retries": 2,
      "retry_delay": 0.5,
      "rate_limit_capacity": 50,
      "rate_limit_refill_rate": 10.0
    }
  ]
}

Structure

  • Top level = provider ID → array of model configs
  • Lookup = providers[provider_id][model_name]
  • Multiple keys = load-balanced across keys in api_keys
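The lookup shape described above can be pictured with plain dictionaries. This is a minimal standard-library sketch, not SimpleLLMFunc's actual loader: `build_index` is a hypothetical helper, and the round-robin rotation via `itertools.cycle` is only an assumption about how keys in api_keys might be balanced.

```python
import itertools
import json

# Hypothetical, trimmed-down provider.json content for illustration.
RAW = """
{
  "openrouter": [
    {"model_name": "openai/gpt-4o", "api_keys": ["sk-key-1", "sk-key-2"]},
    {"model_name": "anthropic/claude-3.5-sonnet", "api_keys": ["sk-key-1"]}
  ]
}
"""

def build_index(raw: str) -> dict:
    """Turn provider -> [model configs] into provider -> model_name -> config."""
    providers = json.loads(raw)
    return {
        provider_id: {cfg["model_name"]: cfg for cfg in configs}
        for provider_id, configs in providers.items()
    }

index = build_index(RAW)
config = index["openrouter"]["openai/gpt-4o"]

# Simple round-robin over the keys; the library's real strategy may differ.
key_cycle = itertools.cycle(config["api_keys"])
first_three = [next(key_cycle) for _ in range(3)]
print(first_three)
```

With two keys configured, successive requests in this sketch alternate between them, which spreads load without any per-key bookkeeping.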

Fields

Field | Type | Required | Description
model_name | string | yes | Model identifier for the provider
api_keys | string[] | yes | One or more API keys (rotated)
base_url | string | yes | API endpoint base URL
api_params | object | no | Extra default API kwargs for this model (for example reasoning_effort). Call-time kwargs override these values.
max_retries | int | no | Retry count on transient failures. Default: 5
retry_delay | float | no | Seconds between retries. Default: 1.0
rate_limit_capacity | int | no | Token bucket capacity. Default: 10
rate_limit_refill_rate | float | no | Tokens per second refill. Default: 1.0
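To make the required fields, the documented defaults, and the api_params override rule concrete, here is a small sketch in plain Python. The helpers `normalize` and `effective_kwargs` are hypothetical names introduced for illustration. The library may validate and merge differently, so treat this as a model of the documented behavior, not its implementation.

```python
# Defaults as listed in the Fields table.
DEFAULTS = {
    "max_retries": 5,
    "retry_delay": 1.0,
    "rate_limit_capacity": 10,
    "rate_limit_refill_rate": 1.0,
}
REQUIRED = ("model_name", "api_keys", "base_url")

def normalize(entry: dict) -> dict:
    """Check required fields and fill in documented defaults (hypothetical helper)."""
    missing = [name for name in REQUIRED if name not in entry]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Later mappings win, so explicit values in `entry` override the defaults.
    return {**DEFAULTS, "api_params": {}, **entry}

def effective_kwargs(entry: dict, **call_kwargs) -> dict:
    """Merge config defaults with call-time kwargs; call-time values win."""
    return {**entry["api_params"], **call_kwargs}

entry = normalize({
    "model_name": "openai/gpt-4o",
    "api_keys": ["sk-key-1"],
    "base_url": "https://openrouter.ai/api/v1",
    "api_params": {"reasoning_effort": "high"},
})

print(entry["max_retries"])                             # default filled in
print(effective_kwargs(entry))                          # config default applies
print(effective_kwargs(entry, reasoning_effort="low"))  # one-off override
```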

Loading

from SimpleLLMFunc import OpenAICompatible

models = OpenAICompatible.load_from_json_file("provider.json")
llm = models["openrouter"]["openai/gpt-4o"]
Or, for the Responses API:
from SimpleLLMFunc import OpenAIResponsesCompatible

models = OpenAIResponsesCompatible.load_from_json_file("provider.json")
llm = models["openrouter"]["openai/gpt-4o"]

Best Practices

  • Keep one provider.json per project, and commit it to version control without the real API keys
  • Put multiple keys in api_keys for hot models (automatic rotation)
  • Use api_params for stable per-model defaults such as reasoning_effort; pass call-time kwargs when you need a one-off override
  • Tune rate_limit_capacity and rate_limit_refill_rate per model based on your tier
  • Use max_retries=2 for local models (fast failure), max_retries=5 for cloud (transient errors)
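When tuning the two rate-limit fields, it helps to see how they interact. The sketch below assumes a standard token bucket, where rate_limit_capacity bounds the burst size and rate_limit_refill_rate sets the sustained requests per second. `TokenBucket` is an illustrative class, not the framework's own, and the injectable clock is only there to make the demo deterministic.

```python
import time

class TokenBucket:
    """Illustrative token bucket; the framework's implementation may differ."""

    def __init__(self, capacity: int, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available, refilling for the time elapsed first."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# A fake clock keeps the demo deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_rate=1.0, clock=lambda: fake_now[0])

burst = [bucket.try_acquire() for _ in range(3)]  # two succeed, the third is refused
fake_now[0] += 1.0                                # one second later, one token is back
after_wait = bucket.try_acquire()
print(burst, after_wait)
```

Under this model, a larger capacity tolerates short bursts, while the refill rate should stay at or below the sustained request rate your provider tier allows.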

Environment Variables (.env)

The framework reads .env for logging and observability:
# Logging
LOG_LEVEL=WARNING          # DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_DIR=logs               # Directory for log files

# Langfuse (optional)
LANGFUSE_PUBLIC_KEY=your_public_key
LANGFUSE_SECRET_KEY=your_secret_key
LANGFUSE_BASE_URL=https://cloud.langfuse.com
LANGFUSE_EXPORT_ALL_SPANS=true
LANGFUSE_ENABLED=true

Precedence

Values are resolved in this order, highest precedence first: runtime environment → .env file → framework defaults
  • LOG_LEVEL=WARNING — reduces framework noise during normal use
  • LOG_DIR=logs — keeps logs out of your project root
  • Langfuse disabled by default — enable only when you need trace collection
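The precedence chain can be sketched as a lookup across three mappings. This is an illustration only: `parse_dotenv` and `resolve` are hypothetical helpers, the default values shown are assumptions based on the list above, and the parser is deliberately simplified (it would mishandle values that contain "#" or quotes).

```python
# Assumed framework defaults, for illustration only.
FRAMEWORK_DEFAULTS = {
    "LOG_LEVEL": "WARNING",
    "LOG_DIR": "logs",
    "LANGFUSE_ENABLED": "false",
}

def parse_dotenv(text: str) -> dict:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            values[key.strip()] = value.strip()
    return values

def resolve(name: str, environ: dict, dotenv: dict, defaults: dict):
    """Return the first value found: runtime environment, then .env, then defaults."""
    for source in (environ, dotenv, defaults):
        if name in source:
            return source[name]
    return None

dotenv = parse_dotenv("LOG_LEVEL=INFO   # set in .env\nLOG_DIR=logs\n")
environ = {"LOG_LEVEL": "DEBUG"}  # stands in for os.environ

log_level = resolve("LOG_LEVEL", environ, dotenv, FRAMEWORK_DEFAULTS)
log_dir = resolve("LOG_DIR", environ, dotenv, FRAMEWORK_DEFAULTS)
langfuse = resolve("LANGFUSE_ENABLED", environ, dotenv, FRAMEWORK_DEFAULTS)
print(log_level, log_dir, langfuse)
```

Here the runtime value of LOG_LEVEL shadows the one in .env, LOG_DIR comes from .env, and LANGFUSE_ENABLED falls through to the assumed default.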

Direct Construction (No Files)

For scripts and one-offs, skip provider.json entirely:
from SimpleLLMFunc import APIKeyPool, OpenAICompatible

llm = OpenAICompatible(
    api_key_pool=APIKeyPool(
        api_keys=["sk-your-key"],
        provider_id="openai",
    ),
    model_name="gpt-4o",
    base_url="https://api.openai.com/v1",
)
This is ideal for short scripts, demos, and quick experiments.