Usage Tracking
Every API request is tracked with detailed usage metrics including tokens, cost, and latency.
Per-Request Tracking
Every response includes a usage object:
    {
      "usage": {
        "tokens_prompt": 25,
        "tokens_completion": 150,
        "tokens_total": 175,
        "cost": 0.0052,
        "latency": 1200.5
      }
    }
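As a minimal sketch of reading these values on the client side, the snippet below parses a response body and pulls out the usage object. The body is the sample shown above; the helper name `read_usage` is hypothetical and not part of any documented SDK.

```python
import json

# Sample response body, copied from the usage object in this section.
raw = (
    '{"usage": {"tokens_prompt": 25, "tokens_completion": 150, '
    '"tokens_total": 175, "cost": 0.0052, "latency": 1200.5}}'
)


def read_usage(body: str) -> dict:
    """Return the usage object from a JSON response body (hypothetical helper)."""
    return json.loads(body)["usage"]


usage = read_usage(raw)

# tokens_total is documented as the sum of prompt and completion tokens.
assert usage["tokens_total"] == usage["tokens_prompt"] + usage["tokens_completion"]

print(f"{usage['tokens_total']} tokens, ${usage['cost']:.4f}, {usage['latency']:.1f} ms")
```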
What's Tracked
| Metric | Description |
|---|---|
| `tokens_prompt` | Input tokens sent to the model |
| `tokens_completion` | Output tokens generated by the model |
| `tokens_total` | Sum of prompt + completion tokens |
| `cost` | Cost in USD for this request |
| `latency` | Processing time in milliseconds |
| `cache_read_tokens` | Tokens served from response cache |
| `cache_write_tokens` | Tokens written to response cache |
| `reasoning_tokens` | Internal reasoning tokens (supported models) |
| `web_search_count` | Web searches triggered |
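One way to hold these metrics in client code is a small dataclass. This is only a sketch: the `Usage` class and `parse_usage` function are hypothetical names, and it assumes that the cache, reasoning, and web-search metrics may be missing from a response (the sample usage object does not show them), in which case they default to 0.

```python
from dataclasses import dataclass, fields


@dataclass
class Usage:
    tokens_prompt: int
    tokens_completion: int
    tokens_total: int
    cost: float      # USD
    latency: float   # milliseconds
    # Assumption: these metrics may be absent from a response, so default to 0.
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0
    reasoning_tokens: int = 0
    web_search_count: int = 0


def parse_usage(obj: dict) -> Usage:
    """Build a Usage from a usage dict, ignoring keys this class does not know."""
    known = {f.name for f in fields(Usage)}
    return Usage(**{k: v for k, v in obj.items() if k in known})


u = parse_usage({
    "tokens_prompt": 25,
    "tokens_completion": 150,
    "tokens_total": 175,
    "cost": 0.0052,
    "latency": 1200.5,
})
print(u)
```

Ignoring unknown keys keeps the parser tolerant if further metrics are added later.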
Credit System
- Each account has a credit balance in USD
- Credits are deducted based on the `cost` of each request
- Minimum credit requirements vary by endpoint type
- BYOK requests do not consume credits
- Check your balance through the IndoxHub dashboard
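The rules above can be sketched as simple client-side bookkeeping. This illustrates the stated rules only and is not the service's billing code: the function names are hypothetical, and the minimum passed to `meets_minimum` is a made-up placeholder, since the actual per-endpoint minimums are not listed here.

```python
from decimal import Decimal


def apply_request(balance: Decimal, cost: Decimal, byok: bool = False) -> Decimal:
    """Return the balance after one request (hypothetical helper)."""
    if byok:
        # BYOK requests do not consume credits.
        return balance
    # Otherwise the request's cost (USD) is deducted from the balance.
    return balance - cost


def meets_minimum(balance: Decimal, minimum: Decimal) -> bool:
    """Check a balance against an endpoint's minimum credit requirement."""
    return balance >= minimum


balance = Decimal("10.00")
balance = apply_request(balance, Decimal("0.0052"))             # normal request
balance = apply_request(balance, Decimal("0.0052"), byok=True)  # BYOK: unchanged
print(balance)

# Placeholder minimum, for illustration only.
print(meets_minimum(balance, Decimal("0.01")))
```

`Decimal` is used instead of `float` so that repeated small deductions do not accumulate rounding error.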
In-stream usage events
Streaming responses emit the same totals inside the stream as a named
`usage_final` SSE event, so clients don't need a second API call to retrieve
billing data after a stream ends:
    event: usage_final
    data: {"type":"usage_final","input_tokens":15,"output_tokens":1,"cost_usd":2.85e-06,"latency_ms":4693}
The fields map 1:1 to the per-request usage object documented above
(`input_tokens` ≡ `tokens_prompt`, `output_tokens` ≡ `tokens_completion`,
`cost_usd` ≡ `cost`, `latency_ms` ≡ `latency`). See
SSE Events for the full per-event reference.
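A client could pick this event out of the stream roughly as follows. The sketch assumes the stream has already been split into text lines and that the `data:` payload fits on a single line, as in the sample above. `usage_from_sse` and `FIELD_MAP` are hypothetical names.

```python
import json

# Field mapping taken from the correspondence described above.
FIELD_MAP = {
    "input_tokens": "tokens_prompt",
    "output_tokens": "tokens_completion",
    "cost_usd": "cost",
    "latency_ms": "latency",
}


def usage_from_sse(lines):
    """Return the usage_final payload renamed to usage-object keys, or None."""
    event = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            event = None  # a blank line terminates the current SSE event
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:") and event == "usage_final":
            payload = json.loads(line[len("data:"):].strip())
            return {FIELD_MAP[k]: v for k, v in payload.items() if k in FIELD_MAP}
    return None


stream = [
    "event: usage_final",
    'data: {"type":"usage_final","input_tokens":15,"output_tokens":1,'
    '"cost_usd":2.85e-06,"latency_ms":4693}',
    "",
]
usage = usage_from_sse(stream)
print(usage)
```

The sample event carries no total, so a client that needs `tokens_total` can add `tokens_prompt` and `tokens_completion` itself.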
Storage
Usage data is stored in two layers:
- MongoDB: Detailed per-request logs with full request/response data
- PostgreSQL: Aggregated daily summaries for efficient reporting
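To illustrate the relationship between the two layers, the sketch below rolls per-request records up into one summary per day, in plain Python. It does not reflect the actual MongoDB or PostgreSQL schemas, which are not documented here: the `timestamp` field, the summary keys, and the sample records are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_summaries(logs):
    """Aggregate per-request usage records into one summary per UTC day."""
    totals = defaultdict(lambda: {"requests": 0, "tokens_total": 0, "cost": 0.0})
    for entry in logs:
        day = entry["timestamp"].date().isoformat()  # hypothetical field name
        totals[day]["requests"] += 1
        totals[day]["tokens_total"] += entry["tokens_total"]
        totals[day]["cost"] += entry["cost"]
    return dict(totals)


# Made-up per-request records, for illustration only.
logs = [
    {"timestamp": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), "tokens_total": 175, "cost": 0.5},
    {"timestamp": datetime(2024, 1, 1, 17, 30, tzinfo=timezone.utc), "tokens_total": 25, "cost": 0.25},
    {"timestamp": datetime(2024, 1, 2, 8, 15, tzinfo=timezone.utc), "tokens_total": 100, "cost": 0.125},
]
summary = daily_summaries(logs)
print(summary)
```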