# Pricing & Billing (/docs/pricing)


## Pricing Model [#pricing-model]

Yunxin uses a **credit-based** billing system. You purchase credits and consume them based on API usage.

## Credit-Based Billing [#credit-based-billing]

* **Purchase credits** from the Dashboard to fund your account
* Credits are consumed based on model usage, token count, and other parameters
* Different models have different credit costs — check the **Models** page in the dashboard for current rates
* Credits are **non-transferable** between accounts

## Token Pricing [#token-pricing]

For text models (chat, completions, embeddings), credits are consumed based on:

* **Input tokens**: the prompt you send
* **Output tokens**: the response generated

Output tokens typically cost more than input tokens.

For other modalities (image, audio, video, 3D, and music generation), credits are consumed per unit generated.
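As a rough illustration of the token-based arithmetic, the sketch below estimates the credits for one text request. The `TokenRate` type, the per-1,000-token unit, and the rate values are all assumptions made for this example; actual rates and units are listed on the **Models** page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRate:
    """Hypothetical per-model rates, in credits per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


def estimate_credits(rate: TokenRate, input_tokens: int, output_tokens: int) -> float:
    """Estimate the credits a single text request would consume."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * rate.input_per_1k
    output_cost = (output_tokens / 1000) * rate.output_per_1k
    return input_cost + output_cost


# Placeholder rates for illustration only (output priced above input).
example_rate = TokenRate(input_per_1k=1.0, output_per_1k=3.0)
cost = estimate_credits(example_rate, input_tokens=2000, output_tokens=500)
print(f"{cost:.2f} credits")  # prints "3.50 credits"
```

With these placeholder rates, 2,000 input tokens contribute 2.0 credits and 500 output tokens contribute 1.5 credits.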

## Model Pricing [#model-pricing]

Pricing varies by model and provider. Check the **Models** page in the dashboard for each model's current rates.

Factors that affect pricing:

* **Model size** — larger models cost more per token
* **Provider** — different providers have different pricing
* **Modality** — image, audio, video, 3D, and music generation have per-unit pricing
* **Features** — reasoning/thinking tokens may have separate pricing

## Billing Cycle [#billing-cycle]

* Charges are calculated in real time as you make API requests
* Usage is tracked per API key and per model
* View a detailed usage breakdown in the **Analytics** dashboard

## Subscription Tiers [#subscription-tiers]

Yunxin offers four subscription tiers with increasing benefits:

| Tier           | Access Level                                              | Priority | Rate Limit  |
| -------------- | --------------------------------------------------------- | -------- | ----------- |
| **Free**       | Access to open-source and entry-level models              | Low      | 10 req/min  |
| **Basic**      | Access to standard models from all providers              | Normal   | 50 req/min  |
| **Pro**        | Access to premium models including GPT-4, Claude, Gemini  | High     | 100 req/min |
| **Enterprise** | Full access to all models including experimental releases | Critical | Unlimited   |

### Tier Benefits [#tier-benefits]

| Feature            | Free       | Basic      | Pro         | Enterprise   |
| ------------------ | ---------- | ---------- | ----------- | ------------ |
| **Priority Level** | Low (1)    | Normal (2) | High (3)    | Critical (4) |
| **Rate Limit**     | 10 req/min | 50 req/min | 100 req/min | Unlimited    |
| **Webhooks**       | ❌          | ❌          | ✅           | ✅            |
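To stay under the per-minute limits in the tables above, a client can throttle itself before sending. The following is a minimal sketch that assumes a sliding 60-second window; how the service actually measures the window is not documented here, and `MinuteRateLimiter` and `TIER_LIMITS` are names invented for this example.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional

# Requests per minute from the tier table; None means no limit (Enterprise).
TIER_LIMITS = {"free": 10, "basic": 50, "pro": 100, "enterprise": None}


class MinuteRateLimiter:
    """Client-side limiter: at most `limit` requests in any 60-second window."""

    def __init__(self, limit: Optional[int],
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.clock = clock
        self.sent: Deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next request may be sent (0.0 if none)."""
        if self.limit is None:
            return 0.0
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        if self.limit is not None:
            self.sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, and then call `record()`. The injectable `clock` exists only to make the limiter easy to test.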

### Request Priority [#request-priority]

Your subscription tier determines your **request priority level**:

* **Low (1)**: Free tier requests are processed at the lowest priority
* **Normal (2)**: Basic tier requests receive normal processing priority
* **High (3)**: Pro tier requests are prioritized over Free and Basic
* **Critical (4)**: Enterprise tier requests receive the highest priority

The priority level is included in API responses via the `X-Request-Priority` header and is also available in your user profile at `/api/auth/me`.
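The header can be read from any response's headers. The sketch below is hedged: the exact value format of `X-Request-Priority` is not specified above, so it accepts either the numeric level (`1` to `4`) or the level name, and returns `None` for anything else. `read_priority` is a hypothetical helper, not part of any Yunxin SDK.

```python
from typing import Mapping, Optional

# Level names and numbers as listed in the priority section.
PRIORITY_LEVELS = {"low": 1, "normal": 2, "high": 3, "critical": 4}


def read_priority(headers: Mapping[str, str]) -> Optional[int]:
    """Return the numeric priority from response headers, or None if unknown."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-request-priority")
    if raw is None:
        return None
    raw = raw.strip().lower()
    if raw.isdigit():
        # Assumed format 1: the numeric level.
        level = int(raw)
        return level if level in PRIORITY_LEVELS.values() else None
    # Assumed format 2: the level name.
    return PRIORITY_LEVELS.get(raw)
```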

### Model Access [#model-access]

Model access is controlled by the `min_tier` setting on each model. If your account tier is lower than a model's requirement, the API will return a `403 Forbidden` error with code `model_tier_restricted`.
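The access rule can be mirrored on the client to avoid requests that are bound to fail, and the error can be detected when it does occur. This sketch makes two assumptions that the text above does not confirm: that `min_tier` values are the lowercase tier names, and that the error code appears in the JSON body as either `{"error": {"code": ...}}` or `{"code": ...}`. All function and class names are illustrative.

```python
from typing import Any, Mapping

# Tier ordering from the subscription table, lowest to highest.
TIER_RANK = {"free": 1, "basic": 2, "pro": 3, "enterprise": 4}


class ModelTierRestricted(Exception):
    """The account tier is below the model's `min_tier`."""


def can_access(account_tier: str, min_tier: str) -> bool:
    """True if the account tier meets or exceeds the model's requirement."""
    return TIER_RANK[account_tier.lower()] >= TIER_RANK[min_tier.lower()]


def check_tier_error(status: int, body: Mapping[str, Any]) -> None:
    """Raise ModelTierRestricted for a 403 with code `model_tier_restricted`."""
    if status != 403:
        return
    # Assumed body shapes: {"error": {"code": ...}} or {"code": ...}.
    error = body.get("error")
    code = error.get("code") if isinstance(error, Mapping) else body.get("code")
    if code == "model_tier_restricted":
        raise ModelTierRestricted(
            "upgrade the subscription or choose a model within your tier"
        )
```

For example, `can_access("free", "pro")` is `False`, so a Free account could skip that model rather than spend a request on a 403.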

### Free Tier [#free-tier]

New accounts receive a **starter credit** to explore the API. The Free tier includes:

* Access to Free-tier models
* Full API functionality, except webhooks (see [Tier Benefits](#tier-benefits))
* Dashboard and analytics access

To access higher-tier models, upgrade your subscription in **Dashboard → Settings → Billing**.

## Refund Policy [#refund-policy]

Unused credits may be eligible for a refund within 14 days of purchase. See our [Refund Policy](/docs/legal/refund) for full details.

## Cost Optimization Tips [#cost-optimization-tips]

1. **Use smaller models** for simple tasks — check the Models API for available options
2. **Set `max_tokens`** to limit response length
3. **Use the Batch API** for non-urgent bulk processing
4. **Cache responses** for repeated identical requests
5. **Monitor usage** in the Analytics dashboard to identify optimization opportunities
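Tips 2 and 4 can be combined in a thin wrapper around whatever function sends the request. The sketch below is one possible approach, not an official client: `CachedClient`, `request_key`, and the `send` callable are hypothetical, and the payload fields other than `max_tokens` are placeholders. Caching only makes sense when identical requests are expected to yield interchangeable responses.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def request_key(payload: Dict[str, Any]) -> str:
    """Stable hash of a request payload; key order does not affect the result."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class CachedClient:
    """Wraps a `send` callable and reuses responses for identical payloads."""

    def __init__(self, send: Callable[[Dict[str, Any]], Any]):
        self._send = send
        self._cache: Dict[str, Any] = {}

    def complete(self, payload: Dict[str, Any]) -> Any:
        key = request_key(payload)
        if key not in self._cache:
            # Only cache misses reach the API and consume credits.
            self._cache[key] = self._send(payload)
        return self._cache[key]
```

A payload such as `{"model": "...", "prompt": "...", "max_tokens": 64}` sent twice through `complete` would call `send` only once. The in-memory dictionary is unbounded, so a long-running service would likely want an eviction policy.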