Configure the framework through .env environment variables and the provider.json model configuration file.
Configuration Overview
.env environment variables
Manages the log directory, log level, and runtime configuration such as Langfuse.
provider.json
Manages providers, models, API keys, and retry and rate-limiting parameters.
.env file
.env is used to store environment variables. This framework mainly reads log configuration and Langfuse configuration from it. You can create a .env file in the project working directory, or override its values directly through system environment variables.
Log-related environment variables
Supported environment variables
| Environment variable | Description | Allowed values | Default |
|---|---|---|---|
| LOG_LEVEL | Console and file log level | DEBUG, INFO, WARNING, ERROR, CRITICAL | DEBUG |
| LOG_DIR | Log output directory | Any path | logs |
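For example, a minimal .env file in the project working directory might look like this (values illustrative):

```
# Log messages at INFO and above to the ./logs directory
LOG_LEVEL=INFO
LOG_DIR=logs
```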
Environment variable priority, from highest to lowest: runtime environment variables, the .env file, framework default values.
provider.json file
provider.json is used to configure model providers, API keys, and rate-limiting parameters. Both OpenAICompatible.load_from_json_file(...) and OpenAIResponsesCompatible.load_from_json_file(...) read this file and return a nested dictionary indexed as providers[provider_id][model_name].
provider.json only describes runtime access concerns such as provider/model/key/base_url/retry/rate-limiting. Choosing OpenAICompatible vs OpenAIResponsesCompatible depends on whether your upstream endpoint is chat/completions-compatible or a Responses API endpoint.
Configuration file structure
provider.json uses a provider -> list of model configs structure. Each model config contains model_name, API keys, retry settings, and rate limiting parameters.
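The exact schema depends on the framework; a sketch consistent with the parameters described below might look like this (the provider id "openai" and the top-level mapping shape are assumptions):

```json
{
  "openai": [
    {
      "model_name": "gpt-3.5-turbo",
      "api_keys": ["key1", "key2"],
      "base_url": "https://api.openai.com/v1",
      "max_retries": 5,
      "retry_delay": 1.0,
      "rate_limit_capacity": 20,
      "rate_limit_refill_rate": 3.0
    }
  ]
}
```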
Configuration parameter description
| Parameter | Type | Description | Example |
|---|---|---|---|
| model_name | str | Model name used as the lookup key | gpt-3.5-turbo |
| api_keys | list[str] | API key list with load-balancing support | ["key1", "key2"] |
| base_url | str | API server URL | https://api.openai.com/v1 |
| max_retries | int | Maximum retry count | 5 |
| retry_delay | float | Retry interval in seconds | 1.0 |
| rate_limit_capacity | int | Token bucket capacity | 20 |
| rate_limit_refill_rate | float | Token refill rate, in tokens per second | 3.0 |
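The two rate-limit parameters describe a standard token bucket. A minimal sketch of that mechanism (not the framework's actual implementation) shows how capacity and refill rate interact:

```python
import time

class TokenBucket:
    """Minimal token bucket: `capacity` caps the burst size, and
    `refill_rate` adds tokens per second back up to that cap."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def try_acquire(self, n: int = 1) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

With rate_limit_capacity=20 and rate_limit_refill_rate=3.0, a client can burst up to 20 requests, then sustain roughly 3 requests per second.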
Loading and Usage
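Assuming the nested-dictionary shape described above, loading and looking up a model config conceptually works like this (a self-contained sketch; the real load_from_json_file reads the file from disk and may differ in detail):

```python
import json

# Illustrative provider.json content; in practice this comes from the file on disk.
PROVIDER_JSON = """
{
  "openai": [
    {"model_name": "gpt-3.5-turbo", "api_keys": ["key1"],
     "base_url": "https://api.openai.com/v1"}
  ]
}
"""

def load_providers(text: str) -> dict:
    """Index the provider -> list-of-model-configs structure as
    providers[provider_id][model_name], the shape described above."""
    raw = json.loads(text)
    return {
        provider_id: {cfg["model_name"]: cfg for cfg in model_configs}
        for provider_id, model_configs in raw.items()
    }

providers = load_providers(PROVIDER_JSON)
config = providers["openai"]["gpt-3.5-turbo"]
```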
Best Practices
Multi-Key Load Balancing
Configure multiple keys for the same model to reduce the risk of hitting a single key's rate limit and to let APIKeyPool distribute requests more evenly.
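APIKeyPool's internals are not shown here; a round-robin sketch conveys the general idea (hypothetical class, not the framework's API):

```python
from itertools import cycle

class RoundRobinKeyPool:
    """Hypothetical stand-in for a key pool: hands out the configured
    API keys in rotation so load spreads evenly across them."""

    def __init__(self, api_keys: list[str]):
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(api_keys)

    def next_key(self) -> str:
        return next(self._keys)

pool = RoundRobinKeyPool(["key1", "key2"])
```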
Adjust rate limit parameters by model
Models with high cost or strict upstream rate limits should use more conservative rate_limit_capacity and rate_limit_refill_rate values.
Avoid duplicate model_name
Under the same provider, model_name is used as the index key; if two configs share a model_name, the later entry overrides the earlier one.
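The override behavior follows from how the list of model configs is indexed into a dict, as in this sketch (illustrative values):

```python
# Two configs with the same model_name under one provider.
model_configs = [
    {"model_name": "gpt-3.5-turbo", "max_retries": 3},
    {"model_name": "gpt-3.5-turbo", "max_retries": 5},
]

# Indexing by model_name: the later entry silently replaces the earlier one.
models = {cfg["model_name"]: cfg for cfg in model_configs}
```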