Operations and Observability
ModelFaucet 0.6.0 adds lightweight source-beta operations hooks for local and pilot deployments.
Request IDs
The API and Gateway return x-request-id on every response. If a caller sends x-request-id, ModelFaucet preserves it; otherwise the service generates one.
Error responses include the same ID:
{
  "error": {
    "code": "invalid_request",
    "message": "Invalid session request.",
    "request_id": "req_example"
  }
}
Use this ID when tracing SDK calls, Gateway requests, provider failures, wallet errors, and ledger writes.
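As an illustration of carrying the ID into client-side logs, here is a minimal TypeScript sketch. It assumes only the error envelope shown above; the names `ErrorEnvelope`, `isErrorEnvelope`, and `describeFailure` are hypothetical and not part of the ModelFaucet SDK.

```typescript
// Shape of the error envelope shown above.
interface ErrorEnvelope {
  error: { code: string; message: string; request_id: string };
}

// Type guard: checks that an unknown JSON value matches the envelope.
function isErrorEnvelope(body: unknown): body is ErrorEnvelope {
  if (typeof body !== "object" || body === null) return false;
  const error = (body as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  const fields = error as Record<string, unknown>;
  return (
    typeof fields["code"] === "string" &&
    typeof fields["message"] === "string" &&
    typeof fields["request_id"] === "string"
  );
}

// Hypothetical helper: builds one log line that carries the request ID.
function describeFailure(body: unknown, headerRequestId: string | null): string {
  if (isErrorEnvelope(body)) {
    const { code, message, request_id } = body.error;
    return `[${request_id}] ${code}: ${message}`;
  }
  // Fall back to the x-request-id response header when the body is not an envelope.
  return `[${headerRequestId ?? "unknown"}] unrecognized error body`;
}
```

With `fetch`, the second argument could be `response.headers.get("x-request-id")`, so the log line still carries an ID when the body cannot be parsed.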
Readiness
GET /ready
The API reports database configuration and Gateway base URL. The Gateway reports repository configuration. These endpoints are intentionally shallow and safe for container readiness probes.
Metrics
GET /metrics
The API and Gateway expose Prometheus-style text metrics:
modelfaucet_http_requests_total{service="@modelfaucet/gateway",method="POST",route="/v1/chat/completions",status="200"} 1
modelfaucet_http_request_duration_ms_sum{service="@modelfaucet/gateway",method="POST",route="/v1/chat/completions",status="200"} 25.000
modelfaucet_rate_limited_total{service="@modelfaucet/gateway",route="/v1/chat/completions"} 1
These in-process metrics reset on restart. Production deployments should scrape them into Prometheus, OpenTelemetry Collector, or the platform's metrics backend.
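For quick local checks without a scraper, sample lines like the ones above can be parsed directly. The TypeScript sketch below is a minimal, hypothetical helper (`parseMetricLine` is not part of ModelFaucet). It handles only the simple shape shown here and assumes no escaped quotes in label values and no trailing timestamps.

```typescript
interface MetricSample {
  name: string;
  labels: Record<string, string>;
  value: number;
}

// Hypothetical helper: parses one sample line of Prometheus-style text.
// Returns null for blank lines, comment lines, and lines it cannot parse.
function parseMetricLine(line: string): MetricSample | null {
  const trimmed = line.trim();
  // Skip blank lines and "# HELP" / "# TYPE" comment lines.
  if (trimmed === "" || trimmed.startsWith("#")) return null;

  // name, optional {labels}, whitespace, value
  const match = /^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$/.exec(trimmed);
  if (match === null) return null;
  const [, name = "", rawLabels = "", rawValue = ""] = match;

  const labels: Record<string, string> = {};
  const labelPattern = /([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"/g;
  let pair: RegExpExecArray | null;
  while ((pair = labelPattern.exec(rawLabels)) !== null) {
    const [, key = "", value = ""] = pair;
    labels[key] = value;
  }
  return { name, labels, value: Number(rawValue) };
}
```

Calling `parseMetricLine` on each line of a `GET /metrics` response body yields samples that can be filtered by `name` or by a label such as `route`.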
Rate Limits
The API and Gateway include an in-memory IP+route rate limiter. Defaults are wide enough for local smoke tests:
API_RATE_LIMIT_MAX_REQUESTS=1200
API_RATE_LIMIT_WINDOW_MS=60000
GATEWAY_RATE_LIMIT_MAX_REQUESTS=1200
GATEWAY_RATE_LIMIT_WINDOW_MS=60000
Set *_MAX_REQUESTS=0 to disable the limiter in a trusted local environment. Hosted deployments should replace this with Redis or an edge/service-mesh rate limiter for multi-instance consistency.
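To make the multi-instance caveat concrete, here is a minimal TypeScript sketch of an in-memory limiter keyed by IP and route. The fixed-window strategy and the class name `FixedWindowRateLimiter` are assumptions for illustration. The source does not say which windowing algorithm ModelFaucet actually uses.

```typescript
interface WindowState {
  windowStart: number; // timestamp (ms) when the current window opened
  count: number; // requests seen in the current window
}

// Hypothetical sketch: fixed-window limiter held entirely in process memory.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true when the request is allowed, false when it should be rejected.
  allow(ip: string, route: string, now: number = Date.now()): boolean {
    // Mirrors the documented behavior: a maximum of 0 disables the limiter.
    if (this.maxRequests === 0) return true;

    const key = `${ip} ${route}`;
    const state = this.windows.get(key);
    // Open a new window on first sight of the key or after the old one expires.
    if (state === undefined || now - state.windowStart >= this.windowMs) {
      this.windows.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= this.maxRequests) return false;
    state.count += 1;
    return true;
  }
}
```

Because the `Map` lives in a single process, two instances behind a load balancer would each allow the full quota, which is why a shared store or an edge limiter is the better fit for hosted deployments.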
Migration Rollback
The current source tree uses idempotent SQL in infra/db/schema.sql.
Rollback procedure for a failed migration attempt:
- Stop API, Gateway, workers, Dashboard, and demo traffic.
- Take a fresh database backup before changing anything.
- Restore the last known-good backup into a staging database.
- Run pnpm db:migrate and pnpm db:seed against staging.
- Run pnpm smoke:local against staging.
- Promote the restored database or apply a forward-only fix.
Avoid destructive ad hoc SQL on production. Prefer forward-only corrective migrations with a tested restore path.
Backup and Restore
Development backup:
pg_dump "$DATABASE_URL" > modelfaucet-dev-backup.sql
Development restore:
dropdb --if-exists modelfaucet_restore
createdb modelfaucet_restore
psql postgresql://localhost/modelfaucet_restore < modelfaucet-dev-backup.sql
DATABASE_URL=postgresql://localhost/modelfaucet_restore pnpm smoke:local
Production deployments should use managed automated backups, point-in-time recovery, encrypted snapshots, restore drills, and retention settings that match the deployment's compliance requirements.
