Qwen3.5 Omni Flash API

Cost-efficient omni-modal model handling text, image, audio, and video, with up to 3 hours of audio and 1 hour of video across 90+ languages.

Alibaba Cloud · Text Generation · 256K context · Singapore · Proprietary Endpoint

About Qwen3.5 Omni Flash

vision · audio in · audio out · multilingual

Qwen3.5 Omni Flash specs

Model ID: qwen3-5-omni-flash
Provider: Alibaba Cloud
Category: Text Generation
Context window: 256K tokens
Max output: 32,768 tokens
Input: text, image, video, audio
Output: text, audio
Region: Singapore
Endpoints:
  • POST /v1/chat/completions
  • POST /v1/responses
  • POST /v1/messages
  • POST /v1/audio/speech

Qwen3.5 Omni Flash API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type | Spec | Rate
Input | per 1M prompt tokens | $0.40, $3.00
Output | per 1M generated tokens | $2.20, $11.90
Web Search | per request | $0.015
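
As a rough illustration of the arithmetic behind the rate card, the sketch below estimates the cost of one request from its token counts. The helper name estimate_cost_usd and its arguments are hypothetical and not part of any EmpirioLabs SDK. Rates are passed in explicitly because the rate card lists two input and two output rates without stating when each applies, so use whichever the live rate card shows for your request.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # rates are quoted per 1M tokens


def estimate_cost_usd(prompt_tokens, generated_tokens, input_rate, output_rate,
                      web_searches=0, web_search_rate="0.015"):
    """Estimate the cost of one request in USD (hypothetical helper).

    Rates are given as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(prompt_tokens) / TOKENS_PER_UNIT * Decimal(input_rate)
    output_cost = Decimal(generated_tokens) / TOKENS_PER_UNIT * Decimal(output_rate)
    search_cost = Decimal(web_searches) * Decimal(web_search_rate)
    return input_cost + output_cost + search_cost


# Example with the lower listed rates: 200,000 prompt tokens,
# 10,000 generated tokens, and one web search request.
cost = estimate_cost_usd(200_000, 10_000, input_rate="0.40",
                         output_rate="2.20", web_searches=1)
print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift when many requests are summed.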

How to call the Qwen3.5 Omni Flash API

Qwen3.5 Omni Flash serves the OpenAI-compatible Chat Completions API. Point any OpenAI SDK at https://api.empiriolabs.ai/v1 with your EmpirioLabs API key and use the model id qwen3-5-omni-flash. Get an API key from the EmpirioLabs dashboard.

cURL
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-omni-flash",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ]
  }'
Python (OpenAI SDK)
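import os

from openai import OpenAI

# Read the key from the environment (same variable as the cURL example)
# rather than hard-coding it in source.
client = OpenAI(
    base_url="https://api.empiriolabs.ai/v1",
    api_key=os.environ["EMPIRIOLABS_API_KEY"],
)

response = client.chat.completions.create(
    model="qwen3-5-omni-flash",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
)
# The reply text is on the first choice.
print(response.choices[0].message.content)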

Qwen3.5 Omni Flash API parameters

Request parameters supported by the Qwen3.5 Omni Flash API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter | Type | Default | Range / values | Description
temperature | number | 0.7 | 0 to 2 | Sampling temperature. 0 = deterministic, 2 = maximum randomness.
top_p | number | 0.9 | 0 to 1 | Nucleus sampling probability mass. Lower = more focused.
max_tokens | number | 4096 | 1 to 32768 | Maximum tokens in the response.
output_mode | enum | text | text, text_audio | Output format mode. text = text only, text_audio = include synthesized speech.
voice | string | Tina | - | Voice name for audio output (when output_mode = text_audio).
tool_web_search | boolean | false | - | Allow the model to perform web searches when needed.
video_fps | number | 2 | 0.1 to 10 | Frames per second sampled from input video for analysis.
vl_high_resolution_images | boolean | true | - | Use higher resolution for input images. Better detail at higher cost.
max_pixels | number | 2621440 | 1 to 99999999 | Maximum pixels per input image. Larger = more detail but slower and more tokens.
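
To catch out-of-range values before a request is sent, the sketch below checks parameters against the ranges listed above. The helper validate_params is hypothetical, and the example assumes these parameters are sent as top-level fields of the JSON request body, which this page does not state explicitly.

```python
# Ranges and allowed values copied from the parameter table above.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 32768),
    "video_fps": (0.1, 10),
    "max_pixels": (1, 99999999),
}
ENUM_VALUES = {"output_mode": ("text", "text_audio")}
BOOLEAN_PARAMS = ("tool_web_search", "vl_high_resolution_images")
STRING_PARAMS = ("voice",)


def validate_params(params):
    """Return a copy of params; raise ValueError on unknown or invalid values."""
    checked = {}
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number")
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name in ENUM_VALUES:
            if value not in ENUM_VALUES[name]:
                raise ValueError(f"{name} must be one of {ENUM_VALUES[name]}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be true or false")
        elif name in STRING_PARAMS:
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string")
        else:
            raise ValueError(f"unknown parameter: {name}")
        checked[name] = value
    return checked


# Assumption: validated parameters are merged into the top level of the body.
body = {
    "model": "qwen3-5-omni-flash",
    "messages": [{"role": "user", "content": "Summarize this clip."}],
    **validate_params({"temperature": 0.2, "max_tokens": 1024, "video_fps": 1}),
}
print(body)
```

Fields that are omitted are left out of the body entirely, so the documented defaults apply on the server side.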

Good to know

Audio billing

  • Audio is billed at a higher token rate than text, image, or video.
  • When audio output is enabled, output text is not charged; only audio tokens are billed.

Voice and language

  • 55 voice timbres available
  • Audio output supports 29 languages and 7 dialects
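
For readers using the OpenAI Python SDK, one plausible way to request spoken output is to pass output_mode and voice through the SDK's extra_body argument, which forwards fields outside the standard schema. The helper audio_request_kwargs below is hypothetical, and whether the endpoint reads these fields from the request body in this way is an assumption. This page does not describe how the synthesized audio is returned, so the sketch stops at building the request.

```python
def audio_request_kwargs(prompt, voice="Tina"):
    """Build keyword arguments for client.chat.completions.create.

    Hypothetical helper. "Tina" is the default voice in the parameter table.
    """
    return {
        "model": "qwen3-5-omni-flash",
        "messages": [{"role": "user", "content": prompt}],
        # extra_body carries non-standard fields in the OpenAI Python SDK.
        # Assumption: the endpoint accepts output_mode and voice this way.
        "extra_body": {"output_mode": "text_audio", "voice": voice},
    }


kwargs = audio_request_kwargs("Read this sentence aloud: the tide is coming in.")
print(kwargs["extra_body"])

# With a configured client, the call would look like:
#   response = client.chat.completions.create(**kwargs)
```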

Qwen3.5 Omni Flash API: common questions

How much does the Qwen3.5 Omni Flash API cost?

On EmpirioLabs, Qwen3.5 Omni Flash is billed pay-as-you-go. Input is listed at $0.40 and $3.00 per 1M prompt tokens, output at $2.20 and $11.90 per 1M generated tokens, and web search at $0.015 per request. The live rate card on this page always matches what the API charges.

What is the context window of Qwen3.5 Omni Flash?

Qwen3.5 Omni Flash supports a 256K-token context window with up to 32,768 output tokens per response.
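
A small sketch of how these two limits might be combined when choosing max_tokens. It rests on two assumptions that this page does not confirm: that "256K" means 256,000 tokens (the more conservative reading), and that prompt and output tokens share the same context window. The helper clamp_max_tokens is hypothetical.

```python
CONTEXT_WINDOW = 256_000  # assumption: "256K" read as 256,000 tokens
MAX_OUTPUT = 32_768       # maximum output tokens per response


def clamp_max_tokens(prompt_tokens, requested):
    """Return the largest max_tokens value that respects both limits.

    Assumes prompt and output tokens count against the same window.
    """
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining < 1:
        raise ValueError("prompt alone fills the context window")
    return max(1, min(requested, MAX_OUTPUT, remaining))


# A long prompt leaves less room than the per-response output cap.
print(clamp_max_tokens(prompt_tokens=250_000, requested=32_768))
```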

Is the Qwen3.5 Omni Flash API OpenAI-compatible?

Yes. Qwen3.5 Omni Flash serves the OpenAI-compatible Chat Completions API, so existing OpenAI SDKs work by pointing base_url at https://api.empiriolabs.ai/v1 and setting the model id to qwen3-5-omni-flash.

Can I try Qwen3.5 Omni Flash in the browser before integrating?

Yes. The EmpirioLabs playground runs Qwen3.5 Omni Flash in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a Qwen3.5 Omni Flash API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing uses pay-as-you-go credits, so you only pay for the requests you make.
