The Deepshi API is prepaid and usage-based. You add credits to your account, and each request draws down your balance based on the tokens it uses. For per-model rates, see Pricing.
Credits
- Your account holds a single credit balance, shared by all of your API keys.
- Credits are denominated in USD.
- Add credits from your Deepshi dashboard. Top-ups apply immediately.
- When your balance runs out, requests return 402 insufficient_quota until you top up.
- Failed requests (4xx) are not billed.
Because every key shares one balance, you can issue separate keys per app or environment without splitting your credits across them.
How usage is priced
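Pricing is per model, quoted in USD per 1M tokens. Each request is billed as
(prompt_tokens × input_rate + completion_tokens × output_rate) / 1,000,000, where input_rate and output_rate are the model's per-1M-token rates. See Pricing for every model’s rates.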
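As a rough illustration of the billing formula, the sketch below estimates a request's cost from its token counts. Because rates are quoted per 1M tokens, the weighted sum is divided by 1,000,000. The function name `estimate_cost` and the rates used in the example ($2.50 input, $10.00 output per 1M tokens) are hypothetical and are not Deepshi prices. Use the Pricing page for real rates.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_rate: str, output_rate: str) -> Decimal:
    """Estimate the USD cost of one request.

    input_rate and output_rate are USD per 1M tokens, passed as strings
    so Decimal can represent them exactly.
    """
    weighted = (prompt_tokens * Decimal(input_rate)
                + completion_tokens * Decimal(output_rate))
    return weighted / TOKENS_PER_MILLION


# Hypothetical rates, for illustration only.
example = estimate_cost(22, 8, input_rate="2.50", output_rate="10.00")
print(f"estimated cost: ${example}")
```

Decimal is used here instead of float so that very small per-request amounts add up without rounding drift. Treat the result as an estimate: the amount actually deducted is the one reported in the response.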
Knowing exactly what a request cost
Every successful response includes a usage object. Deepshi adds usage.cost.total_cost, the exact USD amount the request deducted from your balance:
"usage": {
  "prompt_tokens": 22,
  "prompt_tokens_details": { "cached_tokens": 0 },
  "completion_tokens": 8,
  "total_tokens": 30,
  "cost": { "total_cost": 0.000135 }
}
When streaming, add "stream_options": {"include_usage": true} to receive a final chunk carrying the same usage (including cost). Standard OpenAI SDKs ignore the extra cost field, so it doesn’t break compatibility.
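One way to read the cost in code is sketched below. It works on already-parsed JSON (plain dicts), so it does not depend on any particular SDK. The helper names `request_cost` and `stream_cost` are hypothetical, and the sketch assumes that streamed chunks without usage either omit the usage field or set it to null.

```python
import json
from typing import Iterable, Optional


def request_cost(response: dict) -> Optional[float]:
    """Return usage.cost.total_cost in USD, or None if it is absent."""
    usage = response.get("usage") or {}
    cost = usage.get("cost") or {}
    return cost.get("total_cost")


def stream_cost(chunks: Iterable[dict]) -> Optional[float]:
    """Return the cost from the streamed chunk that carries usage.

    With include_usage enabled this is expected to be the final chunk;
    earlier chunks are skipped because they carry no cost.
    """
    found = None
    for chunk in chunks:
        cost = request_cost(chunk)
        if cost is not None:
            found = cost
    return found


# A non-streaming response body shaped like the sample above.
SAMPLE_RESPONSE = json.loads("""
{
  "usage": {
    "prompt_tokens": 22,
    "prompt_tokens_details": { "cached_tokens": 0 },
    "completion_tokens": 8,
    "total_tokens": 30,
    "cost": { "total_cost": 0.000135 }
  }
}
""")

print(request_cost(SAMPLE_RESPONSE))
```

Returning None rather than raising keeps the helper safe to call on responses or chunks that carry no usage data.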
Checking your balance
Your current balance, spend, and per-request history are in the dashboard. Usage is metered in near real time; the displayed balance may lag actual usage by a few seconds under heavy load.
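Since the displayed balance can lag slightly, a client that wants an up-to-the-request spend figure could keep its own running total of the total_cost values it receives. The sketch below is one minimal way to do that. The `SpendTracker` class is hypothetical and not part of any Deepshi SDK, and it only sees requests made by the process that owns it, so it does not replace the dashboard balance.

```python
from decimal import Decimal


class SpendTracker:
    """Running total of usage.cost.total_cost for responses seen locally."""

    def __init__(self) -> None:
        self.total = Decimal("0")
        self.requests = 0

    def record(self, response: dict) -> None:
        """Add a parsed response's cost to the total, if it reports one."""
        usage = response.get("usage") or {}
        cost = (usage.get("cost") or {}).get("total_cost")
        if cost is not None:
            # Convert via str to avoid carrying float noise into the sum.
            self.total += Decimal(str(cost))
            self.requests += 1


tracker = SpendTracker()
tracker.record({"usage": {"cost": {"total_cost": 0.000135}}})
tracker.record({"usage": {"cost": {"total_cost": 0.000135}}})
print(f"{tracker.requests} requests, ${tracker.total} spent")
```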
Running out of credits
When your balance is exhausted, requests fail with 402:
{
  "error": {
    "code": "insufficient_quota",
    "message": "You have insufficient credits to complete this request.",
    "param": null,
    "type": "insufficient_quota"
  }
}
Top up your balance to resume. See Errors & status codes for handling this in code.
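A minimal sketch of detecting this error in client code follows. It takes the HTTP status code and the parsed JSON body as inputs, since how you obtain them depends on your HTTP client. The names `OutOfCreditsError`, `is_out_of_credits`, and `check_response` are hypothetical.

```python
class OutOfCreditsError(RuntimeError):
    """Raised when the API reports an exhausted credit balance."""


def is_out_of_credits(status_code: int, body: dict) -> bool:
    """True when the response is a 402 with code insufficient_quota."""
    error = (body or {}).get("error") or {}
    return status_code == 402 and error.get("code") == "insufficient_quota"


def check_response(status_code: int, body: dict) -> dict:
    """Return the body unchanged, or raise if the balance is exhausted."""
    if is_out_of_credits(status_code, body):
        message = body["error"].get("message", "insufficient credits")
        raise OutOfCreditsError(message)
    return body
```

Raising a dedicated exception lets callers fail fast instead of retrying with backoff: the request will keep returning 402 until the balance is topped up, so automatic retries only add load.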