Embeddings
Generate vector embeddings for text input.
Endpoint: POST /api/v1/embeddings
Auth: Required
Request Body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string or list | Yes | — | Text(s) to embed |
| model | string | No | Default embedding model | Model ID |
| provider | string | No | — | Provider name |
| additional_params | object | No | {} | Provider-specific parameters |
| byok_api_key | string | No | — | Your own provider API key |
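As a minimal sketch of how the fields above might be assembled on the client, the helper below builds a request body and leaves out any optional field that was not set, so the server-side defaults in the table apply. The function name `build_embedding_payload` is hypothetical and not part of any IndoxHub SDK.

```python
from typing import Any, Optional, Union


def build_embedding_payload(
    text: Union[str, list[str]],
    model: Optional[str] = None,
    provider: Optional[str] = None,
    additional_params: Optional[dict[str, Any]] = None,
    byok_api_key: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body (hypothetical helper, not an official API)."""
    # "text" is the only required field.
    payload: dict[str, Any] = {"text": text}
    optional = {
        "model": model,
        "provider": provider,
        "additional_params": additional_params,
        "byok_api_key": byok_api_key,
    }
    # Send only the fields the caller provided; omitted ones fall back to
    # the defaults listed in the table.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_embedding_payload(
    "IndoxHub makes AI integration simple.",
    model="openai/text-embedding-3-small",
)
print(payload)
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`.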
Supported Providers
Embeddings are supported by: openai, mistral, google, cohere.
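A client could check the provider before sending a request. The sketch below assumes model IDs follow the `provider/model` pattern seen in the examples on this page (for instance `openai/text-embedding-3-small`); that pattern is inferred from the examples, not stated as a rule, and the function names are hypothetical.

```python
# Providers listed on this page as supporting embeddings.
SUPPORTED_EMBEDDING_PROVIDERS = frozenset({"openai", "mistral", "google", "cohere"})


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split an assumed 'provider/model' ID into its two parts."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


def check_embedding_provider(model_id: str) -> str:
    """Return the provider name, or raise if it is not in the supported set."""
    provider, _ = split_model_id(model_id)
    if provider not in SUPPORTED_EMBEDDING_PROVIDERS:
        raise ValueError(f"provider {provider!r} is not listed for embeddings")
    return provider


provider = check_embedding_provider("openai/text-embedding-3-small")
print(provider)
```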
Examples
import requests
response = requests.post(
    "https://api.indoxhub.com/api/v1/embeddings",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "openai/text-embedding-3-small",
        "text": "IndoxHub makes AI integration simple."
    }
)
result = response.json()
print(f"Dimensions: {result['dimensions']}")
print(f"Vector: {result['data'][0][:5]}...")

const response = await fetch("https://api.indoxhub.com/api/v1/embeddings", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "openai/text-embedding-3-small",
    text: "IndoxHub makes AI integration simple."
  })
});
const data = await response.json();
console.log(`Dimensions: ${data.dimensions}`);
Batch Embeddings
Pass a list to embed multiple texts in one request:
import requests
response = requests.post(
    "https://api.indoxhub.com/api/v1/embeddings",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "openai/text-embedding-3-small",
        "text": [
            "First document to embed",
            "Second document to embed",
            "Third document to embed"
        ]
    }
)
# response.json()["data"] is a list of embedding vectors
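For larger corpora it can help to split the input into several batch requests and match each returned vector to its source text. The sketch below rests on two assumptions this page does not confirm: that vectors come back in the same order as the input list, and that a client-side `batch_size` is worth applying (no request size limit is documented here). Both helper names are hypothetical.

```python
from typing import Iterator


def chunked(texts: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def pair_texts_with_vectors(
    texts: list[str], vectors: list[list[float]]
) -> list[tuple[str, list[float]]]:
    """Zip inputs with embeddings, assuming the response preserves input order."""
    if len(texts) != len(vectors):
        raise ValueError(
            f"got {len(vectors)} vectors for {len(texts)} texts"
        )
    return list(zip(texts, vectors))


documents = [
    "First document to embed",
    "Second document to embed",
    "Third document to embed",
]
batches = list(chunked(documents, batch_size=2))
print(batches)
```

Each batch would be sent as the `text` field of one request, and `pair_texts_with_vectors(batch, response.json()["data"])` would then associate the results with their inputs.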
Response
{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-07T12:00:00Z",
  "duration_ms": 120.3,
  "provider": "openai",
  "model": "text-embedding-3-small",
  "success": true,
  "message": "",
  "data": [
    [0.0123, -0.0456, 0.0789, "...1536 dimensions"]
  ],
  "dimensions": 1536,
  "usage": {
    "tokens_prompt": 8,
    "tokens_completion": 0,
    "tokens_total": 8,
    "cost": 0.00001,
    "latency": 120.3
  }
}
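One way to consume a response of this shape is sketched below: check the `success` flag, confirm each vector matches `dimensions`, then compare two embeddings by cosine similarity. Treating `success: false` as an error with `message` as the explanation is an assumption based on the field names, and the sample data is a made-up 3-dimensional stand-in for real embeddings.

```python
import math


def extract_vectors(result: dict) -> list[list[float]]:
    """Return the embedding vectors from a parsed response body."""
    # Assumption: success=False signals a failed request and "message" says why.
    if not result.get("success"):
        raise RuntimeError(f"embedding request failed: {result.get('message', '')}")
    vectors = result["data"]
    for vector in vectors:
        if len(vector) != result["dimensions"]:
            raise ValueError("vector length does not match 'dimensions'")
    return vectors


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Illustrative stand-in for response.json(); real vectors are much longer.
sample = {
    "success": True,
    "message": "",
    "data": [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]],
    "dimensions": 3,
}
vectors = extract_vectors(sample)
similarity = cosine_similarity(vectors[0], vectors[1])
print(round(similarity, 3))
```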
Response Caching
Embedding responses are cached when ENABLE_RESPONSE_CACHE is enabled, reducing latency and cost for repeated inputs.