Chat Completions
Generate conversational responses from AI models.
Endpoint: POST /api/v1/chat/completions
Auth: Required
Request Body

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `messages` | `list[ChatMessage]` | Yes | — | Conversation messages (min 1) |
| `model` | `string` | No | — | Model ID (e.g. `openai/gpt-4o-mini`) |
| `provider` | `string` | No | — | Provider name (inferred from model) |
| `temperature` | `float` | No | — | Sampling temperature |
| `max_tokens` | `integer` | No | — | Max tokens in response |
| `top_p` | `float` | No | — | Nucleus sampling |
| `frequency_penalty` | `float` | No | — | Frequency penalty |
| `presence_penalty` | `float` | No | — | Presence penalty |
| `stream` | `boolean` | No | `false` | Enable streaming |
| `additional_params` | `object` | No | `{}` | Provider-specific parameters |
| `byok_api_key` | `string` | No | — | Your own provider API key |
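As a rough illustration of how these fields fit together, the sketch below assembles a request body on the client side. `build_chat_payload` and `OPTIONAL_FIELDS` are hypothetical names introduced here, not part of any SDK, and the choice to omit unset optional fields (rather than send `null`) is an assumption about what the API prefers.

```python
from typing import Any

# Optional field names taken from the request body table.
OPTIONAL_FIELDS = (
    "model", "provider", "temperature", "max_tokens", "top_p",
    "frequency_penalty", "presence_penalty", "stream",
    "additional_params", "byok_api_key",
)


def build_chat_payload(messages: list[dict[str, Any]], **options: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble a request body, omitting unset optional fields."""
    # The table requires at least one message.
    if not messages:
        raise ValueError("messages must contain at least one entry")
    unknown = set(options) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload: dict[str, Any] = {"messages": messages}
    # Assumption: leaving a field out is equivalent to not setting it.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_chat_payload(
    [{"role": "user", "content": "Hello"}],
    model="openai/gpt-4o-mini",
    temperature=0.7,
    max_tokens=None,  # dropped from the payload because it is unset
)
print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`.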
ChatMessage

| Field | Type | Description |
|---|---|---|
| `role` | `string` | `system`, `user`, or `assistant` |
| `content` | `string` or `list` | Text content or multimodal content array |
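A minimal client-side check for this shape might look like the sketch below. `validate_message` is a hypothetical helper, and the part layout used in the multimodal example (`{"type": "text", "text": ...}`) is an assumption borrowed from OpenAI-style APIs; this page does not specify the structure of the content array.

```python
from typing import Any

# Roles listed in the ChatMessage table.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_message(message: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical check that a message matches the ChatMessage table."""
    role = message.get("role")
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}, got {role!r}")
    content = message.get("content")
    # Content is either plain text or a list of multimodal parts.
    if not isinstance(content, (str, list)):
        raise TypeError("content must be a string or a list of content parts")
    return message


text_message = validate_message({"role": "user", "content": "Describe this image."})

# Assumed part layout; check the provider's documentation before relying on it.
multimodal_message = validate_message(
    {"role": "user", "content": [{"type": "text", "text": "Describe this image."}]}
)
```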
Examples

Python:

    import requests

    response = requests.post(
        "https://api.indoxhub.com/api/v1/chat/completions",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={
            "model": "openai/gpt-4o-mini",
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "Explain quantum computing in simple terms."}
            ],
            "temperature": 0.7,
            "max_tokens": 500
        }
    )
    print(response.json()["data"])

JavaScript:

    const response = await fetch("https://api.indoxhub.com/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        model: "openai/gpt-4o-mini",
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: "Explain quantum computing in simple terms." }
        ],
        temperature: 0.7,
        max_tokens: 500
      })
    });
    const data = await response.json();
    console.log(data.data);

cURL:

    curl https://api.indoxhub.com/api/v1/chat/completions \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "openai/gpt-4o-mini",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        "temperature": 0.7,
        "max_tokens": 500
      }'

Python (OpenAI SDK):

    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_API_KEY",
        base_url="https://api.indoxhub.com/v1"
    )
    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum computing in simple terms."}
        ],
        temperature=0.7,
        max_tokens=500
    )
    print(response.choices[0].message.content)

Response

    {
      "request_id": "550e8400-e29b-41d4-a716-446655440000",
      "created_at": "2026-04-07T12:00:00Z",
      "duration_ms": 1200.5,
      "provider": "openai",
      "model": "gpt-4o-mini",
      "success": true,
      "message": "",
      "data": "Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously...",
      "finish_reason": "stop",
      "reasoning_content": null,
      "images": null,
      "usage": {
        "tokens_prompt": 25,
        "tokens_completion": 150,
        "tokens_total": 175,
        "cost": 0.0052,
        "latency": 1200.5,
        "cache_read_tokens": 0,
        "cache_write_tokens": 0,
        "reasoning_tokens": 0,
        "web_search_count": 0
      }
    }
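
One way to consume a response of this shape is sketched below. `extract_completion` is a hypothetical helper, and the error handling rests on an assumption: that a failed request sets `success` to `false` and puts an explanation in `message`. Only the success case is documented on this page.

```python
from typing import Any


def extract_completion(body: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: pull the commonly used fields out of a response body."""
    # Assumption: failures set "success" to false and describe the error in "message".
    if not body.get("success", False):
        raise RuntimeError(body.get("message") or "chat completion failed")
    usage = body.get("usage") or {}
    return {
        "text": body["data"],
        "finish_reason": body.get("finish_reason"),
        "total_tokens": usage.get("tokens_total"),
        "cost": usage.get("cost"),
    }


# Trimmed copy of the documented sample response.
sample = {
    "success": True,
    "message": "",
    "data": "Quantum computing uses quantum bits (qubits)...",
    "finish_reason": "stop",
    "usage": {"tokens_prompt": 25, "tokens_completion": 150, "tokens_total": 175, "cost": 0.0052},
}
result = extract_completion(sample)
print(result["text"], result["total_tokens"])
```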

Streaming

Set stream: true to receive Server-Sent Events (SSE):

    import requests

    response = requests.post(
        "https://api.indoxhub.com/api/v1/chat/completions",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={
            "model": "openai/gpt-4o-mini",
            "messages": [{"role": "user", "content": "Tell me a story"}],
            "stream": True
        },
        stream=True  # keep the HTTP connection open and read the body incrementally
    )
    for line in response.iter_lines():
        if line:  # skip the blank lines that separate SSE events
            text = line.decode("utf-8")
            # Print each event payload, ignoring the final [DONE] marker
            if text.startswith("data: ") and text != "data: [DONE]":
                print(text[6:], end="", flush=True)

See Streaming for full details.
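
The line handling in the loop above can be factored into a small generator, which makes it testable without a network call. This is a sketch only: `iter_sse_data` is a hypothetical name, payloads are returned as raw strings because the chunk format is not specified on this page, and stopping at `[DONE]` (rather than just skipping it) is a design choice made here.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical helper: yield the payload of each SSE "data:" line until [DONE]."""
    prefix = "data: "
    for raw in lines:
        if not raw:
            continue  # blank lines separate SSE events
        text = raw.decode("utf-8")
        if not text.startswith(prefix):
            continue  # ignore comments and other SSE fields
        payload = text[len(prefix):]
        if payload == "[DONE]":
            break  # end-of-stream marker
        yield payload


# Simulated stream; real input would come from response.iter_lines().
sample_lines = [
    b"data: Once",
    b"",
    b": keep-alive",
    b"data:  upon a time",
    b"data: [DONE]",
    b"data: ignored",
]
chunks = list(iter_sse_data(sample_lines))
print("".join(chunks))
```

With a live request, the same generator would be driven by `iter_sse_data(response.iter_lines())`.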

Reasoning Content
Models that support reasoning (e.g., DeepSeek-R1) return a reasoning_content field: