Qwen3.5 Omni Flash API: Pricing, Playground & Docs

About Qwen3.5 Omni Flash

Cost-efficient omni-modal model handling text, image, audio, and video, with up to 3 hours of audio and 1 hour of video across 90+ languages.

Also known as Alibaba Cloud Qwen3.5 Omni Flash, Qwen3.5-Omni-Flash, qwen3-5-omni-flash

vision, audio in, audio out, multilingual, function calling, web search

Qwen3.5 Omni Flash specs

Model ID: qwen3-5-omni-flash
Provider: Alibaba Cloud
Category: Text Generation
Released: Mar 30, 2026
Context window: 256K tokens
Max output: 32,768 tokens
Input: Text, Image, Video, Audio
Output: Text, Audio
Structured output: JSON Schema
Region: Singapore
Endpoints: POST /v1/chat/completions, POST /v1/responses, POST /v1/messages, POST /v1/audio/speech, POST /v1beta/models/qwen3-5-omni-flash:generateContent
Alternate model IDs: qwen3.5-omni-flash

Qwen3.5 Omni Flash API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type	Spec	Rate
Input	per 1M prompt tokens	$0.40; $3.00
Output	per 1M generated tokens	$2.20; $11.90
Web search	per request	$0.015
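
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` and the sample token counts are hypothetical. The example uses the lower listed rates ($0.40 input, $2.20 output); this page does not say when the higher listed rates ($3.00 and $11.90) apply, so the rates are passed in as parameters rather than hard-coded.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

def estimate_cost_usd(prompt_tokens, completion_tokens, input_rate, output_rate,
                      web_searches=0, web_search_rate=Decimal("0.015")):
    """Estimate request cost in USD from per-1M-token rates (illustrative helper)."""
    input_cost = Decimal(prompt_tokens) / MILLION * Decimal(input_rate)
    output_cost = Decimal(completion_tokens) / MILLION * Decimal(output_rate)
    search_cost = Decimal(web_searches) * Decimal(web_search_rate)
    return input_cost + output_cost + search_cost

# Hypothetical request: 120,000 prompt tokens, 8,000 generated tokens, 2 web searches.
cost = estimate_cost_usd(120_000, 8_000, Decimal("0.40"), Decimal("2.20"), web_searches=2)
print(cost)
```

Decimal arithmetic is used here so that small per-token amounts do not pick up floating-point rounding noise.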

Compare on the full pricing page

How to call the Qwen3.5 Omni Flash API

Qwen3.5 Omni Flash serves the OpenAI-compatible Chat Completions API. Point any OpenAI SDK at https://api.empiriolabs.ai/v1 with your EmpirioLabs API key and use the model id qwen3-5-omni-flash. Get an API key from the EmpirioLabs dashboard.

cURL

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-omni-flash",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ]
  }'

Python (OpenAI SDK)

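import os

from openai import OpenAI

# Point the OpenAI SDK at the EmpirioLabs endpoint.
client = OpenAI(
    base_url="https://api.empiriolabs.ai/v1",
    # Read the key from the same environment variable the cURL example uses.
    api_key=os.environ["EMPIRIOLABS_API_KEY"],
)

# Send a single-turn chat request and print the model's reply.
response = client.chat.completions.create(
    model="qwen3-5-omni-flash",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
)
print(response.choices[0].message.content)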
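
Because the model also accepts image input, a request can mix text and media. The sketch below builds such a message without sending it. It assumes the endpoint accepts OpenAI-style content parts (`type: "text"` and `type: "image_url"`), which this page does not confirm; the helper name `build_image_message` and the example URL are hypothetical.

```python
import json

def build_image_message(prompt, image_url):
    """Build a user message with one text part and one image part.

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

payload = {
    "model": "qwen3-5-omni-flash",
    "messages": [build_image_message("Describe this image.", "https://example.com/cat.png")],
}
print(json.dumps(payload, indent=2))
```

If the assumption holds, the resulting `messages` list can be passed to `client.chat.completions.create(...)` in place of a text-only message list.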

Full Qwen3.5 Omni Flash API reference

Qwen3.5 Omni Flash API parameters

Request parameters supported by the Qwen3.5 Omni Flash API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter	Type	Default	Range / values	Description
temperature	number	0.7	0 to 2	Sampling temperature. 0 = deterministic, 2 = maximum randomness.
top_p	number	0.9	0 to 1	Nucleus sampling probability mass. Lower = more focused.
max_tokens	number	4096	1 to 32768	Maximum tokens in the response.
output_mode	enum	text	text, text_audio	Output format mode. text = text only, text_audio = text plus synthesized speech.
voice	string	Tina	-	Voice name for audio output (when output_mode = text_audio).
tool_web_search	boolean	false	-	Allow the model to perform web searches when needed.
video_fps	number	2	0.1 to 10	Frames-per-second sampled from input video for analysis.
vl_high_resolution_images	boolean	true	-	Use higher resolution for input images. Better detail at higher cost.
max_pixels	number	2621440	1 to 99999999	Maximum pixels per input image. Larger = more detail but slower / more tokens.
response_format	enum	-	-	Constrain the output to JSON. Use JSON mode for any valid JSON object, or JSON schema to force output that matches a schema you provide.
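
The ranges in the table can be checked client-side before a request is sent. The sketch below is a hypothetical helper (`build_params` is not part of any SDK) that validates a few of the numeric parameters against the documented ranges and returns the fields that were set. It assumes the model-specific fields are sent as top-level keys in the JSON request body, which this page does not state explicitly.

```python
# Documented ranges from the parameter table above: name -> (min, max).
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 32768),
    "video_fps": (0.1, 10),
    "max_pixels": (1, 99999999),
}
OUTPUT_MODES = {"text", "text_audio"}

def build_params(**params):
    """Validate parameters against the documented ranges and return them as a dict."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low} to {high}")
        elif name == "output_mode" and value not in OUTPUT_MODES:
            raise ValueError(f"output_mode must be one of {sorted(OUTPUT_MODES)}")
    return dict(params)

params = build_params(temperature=0.2, max_tokens=1024, output_mode="text_audio", voice="Tina")
print(params)
```

With the OpenAI Python SDK, standard fields such as `temperature` and `max_tokens` are regular keyword arguments of `client.chat.completions.create`, while non-standard fields can be passed through its `extra_body` argument, again assuming the server reads them from the request body.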

Good to know

Audio billing

Audio is billed at a higher token rate than text, image, or video
When audio output is enabled, output text is not charged; only audio tokens are billed

Voice and language

55 voice timbres available
Audio output supports 29 languages, 7 dialects

Per-tool billing (usage.tool_usage)

When this model invokes tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape; exact field names, units, and which tools appear can vary slightly per provider:

"usage": {
  "prompt_tokens": 123,
  "completion_tokens": 456,
  "cost_usd": 0.0042,
  "tool_usage": {"web_search": 3, "code_interpreter": 1}
}

The tool counts are already factored into cost_usd; they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
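
Since tool_usage is omitted when no tools run, client code should treat it as optional. The sketch below reads the counts from a usage object shaped like the example above; the helper name `tool_counts` is hypothetical, and real field names may differ per provider, as noted.

```python
def tool_counts(usage):
    """Return per-tool invocation counts, or an empty dict when no tools were invoked."""
    return dict(usage.get("tool_usage") or {})

# Usage object copied from the example above.
usage = {
    "prompt_tokens": 123,
    "completion_tokens": 456,
    "cost_usd": 0.0042,
    "tool_usage": {"web_search": 3, "code_interpreter": 1},
}

for tool, count in tool_counts(usage).items():
    print(f"{tool}: {count}")

# A response without tool calls has no tool_usage field at all.
print(tool_counts({"prompt_tokens": 10, "completion_tokens": 20, "cost_usd": 0.0001}))
```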

Qwen3.5 Omni Flash API: common questions

How much does the Qwen3.5 Omni Flash API cost?

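On EmpirioLabs, Qwen3.5 Omni Flash is billed pay-as-you-go with no monthly minimum. The live rate card on this page always matches what the API charges; it currently lists input at $0.40 and $3.00 per 1M prompt tokens, output at $2.20 and $11.90 per 1M generated tokens, and web search at $0.015 per request.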

What is the context window of Qwen3.5 Omni Flash?

Qwen3.5 Omni Flash supports a 256K-token context window with up to 32,768 output tokens per response.

Is the Qwen3.5 Omni Flash API OpenAI-compatible?

Yes. Qwen3.5 Omni Flash serves the OpenAI-compatible Chat Completions API, so existing OpenAI SDKs work by pointing base_url at https://api.empiriolabs.ai/v1 and setting the model id to qwen3-5-omni-flash.

Can I try Qwen3.5 Omni Flash in the browser before integrating?

Yes. The EmpirioLabs playground runs Qwen3.5 Omni Flash in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a Qwen3.5 Omni Flash API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.

More Text Generation model APIs

GLM 5.2

Kimi K3

Kimi K2.7 Code

Muse Spark 1.1

Fugu Ultra v1.1

Qwen3.7 Plus
