This page explains the basic configuration of SimpleLLMFunc: the .env environment-variable file and the provider.json model configuration file.

Configuration Overview

.env environment variables

Manages the log directory, log level, and runtime integrations such as Langfuse.

provider.json

Manages providers, models, API keys, and retry/rate-limiting parameters.

.env file

.env stores environment variables. The framework mainly reads log configuration and Langfuse configuration from it. Create .env in the project working directory, or override its values directly with system environment variables.
# Log level (default: DEBUG)
# Supported values: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=DEBUG

# Log directory (default: logs)
LOG_DIR=logs

Supported environment variables

| Environment variable | Description | Allowed values | Default |
| --- | --- | --- | --- |
| LOG_LEVEL | Console and file log level | DEBUG, INFO, WARNING, ERROR, CRITICAL | DEBUG |
| LOG_DIR | Log output directory | any path | logs |
Environment variable priority, from highest to lowest: runtime environment variables, .env file, framework default values.
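The precedence rule can be sketched as follows. This is a minimal illustration, not SimpleLLMFunc's actual loader; the `resolve` helper and the `dotenv_values` dict are hypothetical names used only for this example:

```python
import os

# Sketch of the precedence rule: runtime env var > .env value > framework default.
def resolve(name: str, dotenv_values: dict, default: str) -> str:
    if name in os.environ:        # highest priority: runtime environment variable
        return os.environ[name]
    if name in dotenv_values:     # next: value parsed from the .env file
        return dotenv_values[name]
    return default                # lowest: framework default

dotenv = {"LOG_LEVEL": "INFO"}   # pretend this was parsed from .env
print(resolve("LOG_LEVEL", dotenv, "DEBUG"))  # "INFO" unless LOG_LEVEL is set in os.environ
```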

provider.json file

provider.json is used to configure model providers, API keys, and rate limiting parameters. Both OpenAICompatible.load_from_json_file(...) and OpenAIResponsesCompatible.load_from_json_file(...) read this file and return a two-dimensional dictionary: providers[provider_id][model_name].
provider.json only describes runtime access concerns such as provider/model/key/base_url/retry/rate-limiting. Choosing OpenAICompatible vs OpenAIResponsesCompatible depends on whether your upstream endpoint is chat/completions-compatible or a Responses API endpoint.

Configuration file structure

provider.json uses a provider -> list of model configs structure. Each model config contains model_name, API keys, retry settings, and rate limiting parameters.
{
  "openai": [
    {
      "model_name": "gpt-3.5-turbo",
      "api_keys": ["sk-test-key-1", "sk-test-key-2"],
      "base_url": "https://api.openai.com/v1",
      "max_retries": 5,
      "retry_delay": 1.0,
      "rate_limit_capacity": 20,
      "rate_limit_refill_rate": 3.0
    },
    {
      "model_name": "gpt-4",
      "api_keys": ["sk-test-key-3"],
      "base_url": "https://api.openai.com/v1",
      "max_retries": 5,
      "retry_delay": 1.0,
      "rate_limit_capacity": 10,
      "rate_limit_refill_rate": 1.0
    }
  ],
  "zhipu": [
    {
      "model_name": "glm-4",
      "api_keys": ["zhipu-test-key-1", "zhipu-test-key-2"],
      "base_url": "https://open.bigmodel.cn/api/paas/v4/",
      "max_retries": 3,
      "retry_delay": 0.5,
      "rate_limit_capacity": 15,
      "rate_limit_refill_rate": 2.0
    }
  ]
}

Configuration parameter description

| Parameter | Type | Description | Example |
| --- | --- | --- | --- |
| model_name | str | Model name used as the lookup key | gpt-3.5-turbo |
| api_keys | list[str] | API key list with load-balancing support | ["key1", "key2"] |
| base_url | str | API server URL | https://api.openai.com/v1 |
| max_retries | int | Maximum retry count | 5 |
| retry_delay | float | Retry interval in seconds | 1.0 |
| rate_limit_capacity | int | Token bucket capacity | 20 |
| rate_limit_refill_rate | float | Token refill rate in tokens per second | 3.0 |
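The two rate-limiting parameters describe a token bucket: rate_limit_capacity bounds the burst size, and rate_limit_refill_rate sets the sustained request rate. A minimal sketch of the mechanics (illustrative only; SimpleLLMFunc's internal limiter may be implemented differently):

```python
import time

class TokenBucket:
    """Illustrative token bucket; not SimpleLLMFunc's internal implementation."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity        # rate_limit_capacity: maximum burst size
        self.refill_rate = refill_rate  # rate_limit_refill_rate: tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# capacity=20, refill_rate=3.0: up to 20 requests can burst at once,
# after which roughly 3 new requests per second are allowed.
bucket = TokenBucket(capacity=20, refill_rate=3.0)
```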

Loading and Usage

from SimpleLLMFunc import OpenAICompatible, OpenAIResponsesCompatible, llm_function

models = OpenAICompatible.load_from_json_file("provider.json")
gpt35 = models["openai"]["gpt-3.5-turbo"]
# OpenAIResponsesCompatible reads the same file; the provider/model keys used
# here ("openrouter" / "gpt-5.4") must exist in your provider.json.
responses_models = OpenAIResponsesCompatible.load_from_json_file("provider.json")
responses_model = responses_models["openrouter"]["gpt-5.4"]

@llm_function(llm_interface=gpt35)
async def my_task(text: str) -> str:
    """Process a text task."""
    pass

Best Practices

- Configure multiple API keys for the same model to reduce the risk of hitting a single key's rate limit and to let APIKeyPool distribute requests more evenly.
- For high-cost or strictly rate-limited models, set more conservative rate_limit_capacity and rate_limit_refill_rate values.
- Within the same provider, model_name is the index key; if it is duplicated, the later entry overrides the earlier one.
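The multi-key recommendation can be pictured as simple rotation over the api_keys list. This is a hedged sketch; APIKeyPool's actual selection strategy may be more sophisticated (e.g. load-aware):

```python
from itertools import cycle

# Hypothetical illustration of spreading requests across the configured keys.
api_keys = ["sk-test-key-1", "sk-test-key-2"]
key_iter = cycle(api_keys)

# Each request takes the next key in round-robin order.
picked = [next(key_iter) for _ in range(4)]
print(picked)  # alternates between the two keys
```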