
Cost-efficient omni-modal model handling text, image, audio, and video, with up to 3 hours of audio and 1 hour of video across 90+ languages.
Model ID: qwen3-5-omni-flash
Endpoints: POST /v1/chat/completions, POST /v1/responses, POST /v1/messages, POST /v1/audio/speech

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.
Qwen3.5 Omni Flash serves the OpenAI-compatible Chat Completions API. Point any OpenAI SDK at https://api.empiriolabs.ai/v1 with your EmpirioLabs API key and use the model id qwen3-5-omni-flash. Get an API key from the EmpirioLabs dashboard.
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-omni-flash",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ]
  }'

from openai import OpenAI

client = OpenAI(
    base_url="https://api.empiriolabs.ai/v1",
    api_key="YOUR_EMPIRIOLABS_API_KEY",
)
response = client.chat.completions.create(
    model="qwen3-5-omni-flash",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
)
print(response.choices[0].message.content)

Request parameters supported by the Qwen3.5 Omni Flash API on EmpirioLabs. Defaults apply when a field is omitted.
| Parameter | Type | Default | Range / values | Description |
|---|---|---|---|---|
| temperature | number | 0.7 | 0 to 2 | Sampling temperature. 0 = deterministic, 2 = maximum randomness. |
| top_p | number | 0.9 | 0 to 1 | Nucleus sampling probability mass. Lower = more focused. |
| max_tokens | number | 4096 | 1 to 32768 | Maximum tokens in the response. |
| output_mode | enum | text | text, text_audio | Output format mode. text = text only, text_audio = text plus synthesized speech. |
| voice | string | Tina | - | Voice name for audio output (when output_mode = text_audio). |
| tool_web_search | boolean | false | - | Allow the model to perform web searches when needed. |
| video_fps | number | 2 | 0.1 to 10 | Frames-per-second sampled from input video for analysis. |
| vl_high_resolution_images | boolean | true | - | Use higher resolution for input images. Better detail at higher cost. |
| max_pixels | number | 2621440 | 1 to 99999999 | Maximum pixels per input image. Larger = more detail but slower / more tokens. |
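
The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the helper names `validate_params` and `build_request` are hypothetical, and placing these parameters as top-level fields of the request body is an assumption, since the table does not say where they go.

```python
# Ranges and allowed values copied from the parameter table above.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 32768),
    "video_fps": (0.1, 10),
    "max_pixels": (1, 99999999),
}
ENUM_VALUES = {"output_mode": {"text", "text_audio"}}
BOOLEAN_PARAMS = {"tool_web_search", "vl_high_resolution_images"}
STRING_PARAMS = {"voice"}


def validate_params(params):
    """Return a checked copy of params; raise ValueError on the first violation."""
    checked = {}
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number")
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name in ENUM_VALUES:
            if value not in ENUM_VALUES[name]:
                raise ValueError(f"{name} must be one of {sorted(ENUM_VALUES[name])}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be true or false")
        elif name in STRING_PARAMS:
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string")
        else:
            raise ValueError(f"unknown parameter: {name}")
        checked[name] = value
    return checked


def build_request(prompt, **params):
    """Build a Chat Completions body; omitted fields fall back to server defaults."""
    body = {
        "model": "qwen3-5-omni-flash",
        "messages": [{"role": "user", "content": prompt}],
    }
    # Assumption: extra parameters are sent as top-level JSON fields.
    body.update(validate_params(params))
    return body
```

For example, `build_request("Describe this clip.", video_fps=1, output_mode="text_audio", voice="Tina")` returns a dictionary that can be serialized as the JSON body of a request. With the OpenAI Python SDK, fields that are not part of the standard method signature are normally passed through the `extra_body` argument.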
On EmpirioLabs, Qwen3.5 Omni Flash is billed pay-as-you-go. Input is listed at $0.40 and $3.00 per 1M prompt tokens, output at $2.20 and $11.90 per 1M generated tokens, and web search at $0.015 per request. The live rate card on this page always matches what the API charges.
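
To make the arithmetic concrete, here is a rough cost estimate based on the rates listed above. It is a sketch under assumptions: the function name `estimate_cost` is hypothetical, and the defaults use the lower of the two listed rates for input and output. Which rate applies to a given request is not stated here, so pass the higher rate explicitly if it applies to you, and treat the live rate card as authoritative.

```python
from decimal import Decimal

# Assumption: the lower listed rate applies. Override via the arguments below.
INPUT_USD_PER_M = Decimal("0.40")    # per 1M prompt tokens
OUTPUT_USD_PER_M = Decimal("2.20")   # per 1M generated tokens
WEB_SEARCH_USD = Decimal("0.015")    # per web search request
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, completion_tokens, web_searches=0,
                  input_rate=INPUT_USD_PER_M, output_rate=OUTPUT_USD_PER_M):
    """Return the estimated cost in USD as a Decimal."""
    token_cost = (Decimal(prompt_tokens) * input_rate
                  + Decimal(completion_tokens) * output_rate) / MILLION
    return token_cost + web_searches * WEB_SEARCH_USD
```

Under these assumed rates, a request with 10,000 prompt tokens, 2,000 generated tokens, and one web search comes to $0.004 + $0.0044 + $0.015 = $0.0234.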
Qwen3.5 Omni Flash supports a 256K-token context window with up to 32,768 output tokens per response.
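
These two limits can be combined into a simple budget check before a request is sent. The sketch below rests on two assumptions that the text does not confirm: that "256K" means 262,144 tokens, and that prompt and output tokens share the same context window. The function name `max_tokens_for` is hypothetical.

```python
CONTEXT_WINDOW = 256 * 1024   # assumption: "256K" = 262,144 tokens
MAX_OUTPUT_TOKENS = 32768     # per-response output limit stated above


def max_tokens_for(prompt_tokens, requested=MAX_OUTPUT_TOKENS):
    """Return a max_tokens value that fits both the output cap and the remaining context."""
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining < 1:
        raise ValueError("prompt leaves no room for output in the context window")
    return min(requested, MAX_OUTPUT_TOKENS, remaining)
```

A short prompt is capped by the 32,768-token output limit, while a prompt that nearly fills the window is capped by whatever room remains.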
Existing OpenAI SDKs work with Qwen3.5 Omni Flash. The model serves the OpenAI-compatible Chat Completions API, so you only need to point base_url at https://api.empiriolabs.ai/v1 and set the model id to qwen3-5-omni-flash.
You can also try the model without writing code. The EmpirioLabs playground runs Qwen3.5 Omni Flash in the browser with the same parameters the API exposes, so you can test prompts first.
Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.
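
Once you have a key, keeping it in an environment variable avoids hard-coding it in source files. The sketch below mirrors the headers from the curl example and reads the same `EMPIRIOLABS_API_KEY` variable; the helper name `auth_headers` is hypothetical.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the EMPIRIOLABS_API_KEY environment variable."""
    key = env.get("EMPIRIOLABS_API_KEY")
    if not key:
        raise RuntimeError("EMPIRIOLABS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

The `env` argument defaults to the process environment and accepts any mapping, which makes the helper easy to exercise in tests without touching real credentials.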