Provider Routing Beta
ModelFaucet 0.3.0 hardens the Gateway path used for cloud provider routing through LiteLLM and OpenAI-compatible endpoints.
What Changed
- Provider requests have configurable timeouts and retries.
- Retry attempts are recorded as sanitized metadata without Authorization headers or raw keys.
- Provider usage is reconciled when prompt_tokens, completion_tokens, or total_tokens are missing or inconsistent.
- The Gateway exposes /health/providers for operator checks.
- BYOK and developer-key routes can fall back to platform routing when the stored credential or feature policy explicitly allows it.
- stream: true requests are rejected with a clear response until streaming ledger accounting is implemented.
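As an illustration of the usage reconciliation item, here is a minimal TypeScript sketch. The function name `reconcileUsage`, the `reconciled` flag, and the policy of trusting the prompt and completion counts over a conflicting total are assumptions for illustration; these notes do not specify the Gateway's actual reconciliation rules.

```typescript
// Field names follow the OpenAI-compatible usage object named above.
interface Usage {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
}

interface ReconciledUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  reconciled: boolean; // true when any field was filled in or corrected
}

// Returns the value only if it is a usable non-negative integer count.
function validCount(n: number | undefined): number | undefined {
  return typeof n === "number" && Number.isInteger(n) && n >= 0 ? n : undefined;
}

// Assumed policy (hypothetical): when both parts are present they win over
// the total; otherwise a missing part is derived from the total if possible.
function reconcileUsage(usage: Usage): ReconciledUsage {
  const p = validCount(usage.prompt_tokens);
  const c = validCount(usage.completion_tokens);
  const t = validCount(usage.total_tokens);

  let prompt = p ?? 0;
  let completion = c ?? 0;
  if (p !== undefined && c === undefined && t !== undefined && t >= p) {
    completion = t - p;
  } else if (c !== undefined && p === undefined && t !== undefined && t >= c) {
    prompt = t - c;
  }

  // If only the total is known, keep it; otherwise recompute it from the parts.
  const onlyTotalKnown = p === undefined && c === undefined && t !== undefined;
  const total = onlyTotalKnown ? (t as number) : prompt + completion;

  const reconciled = p !== prompt || c !== completion || t !== total;
  return {
    prompt_tokens: prompt,
    completion_tokens: completion,
    total_tokens: total,
    reconciled,
  };
}
```

Under this assumed policy, a response reporting 10 prompt tokens, 5 completion tokens, and a total of 99 would be recorded with a total of 15 and flagged as reconciled.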
Environment
LITELLM_BASE_URL=https://your-litellm.example
LITELLM_MASTER_KEY=<server-side-litellm-key>
PROVIDER_TIMEOUT_MS=30000
PROVIDER_MAX_RETRIES=1
PROVIDER_RETRY_DELAY_MS=250
Provider keys must stay in server-side environment variables, a secret manager, or encrypted credential storage. They must not be passed through SDK options, React props, browser markup, or dashboard hidden inputs.
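For the PROVIDER_* variables above, a server-side loader might look roughly like the following TypeScript sketch. The names `loadProviderConfig` and `ProviderConfig` are hypothetical, and treating the example values (30000, 1, 250) as defaults is an assumption; these notes do not state what the Gateway does when a variable is unset or invalid.

```typescript
interface ProviderConfig {
  timeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
}

type Env = Record<string, string | undefined>;

// Reads a non-negative integer from the environment, or returns the fallback
// when the variable is unset or blank. Throws on anything else.
function readNonNegativeInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Assumption: the example values from this section double as defaults.
function loadProviderConfig(env: Env): ProviderConfig {
  return {
    timeoutMs: readNonNegativeInt(env, "PROVIDER_TIMEOUT_MS", 30000),
    maxRetries: readNonNegativeInt(env, "PROVIDER_MAX_RETRIES", 1),
    retryDelayMs: readNonNegativeInt(env, "PROVIDER_RETRY_DELAY_MS", 250),
  };
}
```

In a Node process this would be called as `loadProviderConfig(process.env)`. Failing fast on a malformed value is a design choice in this sketch, made so that a typo does not silently disable retries.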
Health Check
curl http://localhost:3002/health/providers
Example response:
{
  "ok": true,
  "providers": [
    {
      "ok": true,
      "provider": "litellm",
      "statusCode": 200,
      "latencyMs": 12
    }
  ]
}
Real Provider Smoke
Use pnpm smoke:provider only when a real LiteLLM route is configured. This command starts the local Control API and Gateway, but it uses your server-side LITELLM_BASE_URL instead of the local mock provider.
export DATABASE_URL=postgresql://modelfaucet:modelfaucet@localhost:5432/modelfaucet
export SECRET_ENCRYPTION_KEY=dev_32_bytes_replace_me_replace_me
export LITELLM_BASE_URL=https://your-litellm.example
export LITELLM_MASTER_KEY=<server-side-litellm-key>
pnpm smoke:provider
The command refuses localhost or private-network LITELLM_BASE_URL values unless SMOKE_ALLOW_PRIVATE_PROVIDER=1 is set for a disposable local-only test.
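The refusal rule can be approximated with a hostname check like the TypeScript sketch below. This is not the smoke command's actual implementation: the function names are hypothetical, the address ranges (loopback, RFC 1918, link-local, IPv6 unique-local) are an assumed reading of "localhost or private-network", and the check looks only at the literal hostname without resolving DNS.

```typescript
// Returns true for localhost names and literal private or loopback addresses.
// Assumption: DNS names that resolve to private addresses are NOT detected.
function isLocalOrPrivateHost(hostname: string): boolean {
  // Normalize case and strip the brackets used around IPv6 literals in URLs.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");

  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "::1") return true; // IPv6 loopback
  if (/^f[cd][0-9a-f]{2}:/.test(host)) return true; // fc00::/7 unique-local
  if (/^fe[89ab][0-9a-f]:/.test(host)) return true; // fe80::/10 link-local

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 || // 127.0.0.0/8 loopback
    a === 10 || // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // 169.254.0.0/16 link-local
  );
}

// Mirrors the documented override: private targets pass only when
// SMOKE_ALLOW_PRIVATE_PROVIDER is exactly "1".
function smokeTargetAllowed(
  hostname: string,
  env: Record<string, string | undefined>,
): boolean {
  return !isLocalOrPrivateHost(hostname) || env["SMOKE_ALLOW_PRIVATE_PROVIDER"] === "1";
}
```

A caller would typically pass `new URL(baseUrl).hostname` as the first argument.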
Fallback Rules
Fallback to platform routing is allowed only when one of these is true:
- The selected BYOK or developer credential has fallback_to_platform=true.
- The feature policy sets fallback_to_platform=true.
- The feature policy sets provider_fallback to platform or platform_pool.
Fallback is used only for provider failures. It does not bypass private-network URL protections, invalid sessions, expired sessions, insufficient wallet balance, or developer-key budget limits.
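The rules above can be condensed into a single predicate. The TypeScript sketch below uses the field names from this section (`fallback_to_platform`, `provider_fallback`); the `FailureKind` labels and the function name `mayFallBackToPlatform` are hypothetical and exist only to illustrate the decision.

```typescript
interface Credential {
  fallback_to_platform?: boolean;
}

interface FeaturePolicy {
  fallback_to_platform?: boolean;
  provider_fallback?: string;
}

// Hypothetical failure labels; only "provider_failure" is eligible for fallback.
type FailureKind =
  | "provider_failure"
  | "private_network_url"
  | "invalid_session"
  | "expired_session"
  | "insufficient_balance"
  | "developer_key_budget_exceeded";

function mayFallBackToPlatform(
  failure: FailureKind,
  credential: Credential,
  policy: FeaturePolicy,
): boolean {
  // Fallback never bypasses URL protections, session checks, or spend limits.
  if (failure !== "provider_failure") return false;

  return (
    credential.fallback_to_platform === true ||
    policy.fallback_to_platform === true ||
    policy.provider_fallback === "platform" ||
    policy.provider_fallback === "platform_pool"
  );
}
```

Checking the failure kind first keeps the opt-in flags from ever masking a session or budget rejection, which matches the ordering described above.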
Streaming
stream: true currently returns:
{
  "error": {
    "code": "invalid_request",
    "message": "Streaming responses are not enabled in this gateway release.",
    "details": {
      "streaming_supported": false
    }
  }
}
This is intentional for 0.3.0; streaming requires first-class partial usage accounting and cancellation-safe ledger behavior.
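A client that wants to detect this rejection and retry without streaming could check the response body as in the TypeScript sketch below. The function name `isStreamingUnsupported` is hypothetical; the sketch relies only on the body fields shown above and makes no assumption about the HTTP status code, which these notes do not state.

```typescript
// Returns true only when the parsed body matches the documented rejection:
// error.code === "invalid_request" and error.details.streaming_supported === false.
function isStreamingUnsupported(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;

  const error = (body as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;

  const { code, details } = error as { code?: unknown; details?: unknown };
  if (code !== "invalid_request") return false;
  if (typeof details !== "object" || details === null) return false;

  return (details as { streaming_supported?: unknown }).streaming_supported === false;
}
```

Keying on `details.streaming_supported` rather than the message text keeps the check stable if the wording of the message changes in a later release.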
