MiMo V2.5 API: Pricing, Playground & Docs

About MiMo V2.5

Multimodal model with native visual and audio understanding on a 1M context, designed to reason and act across modalities in agentic workflows.

Also known as Xiaomi MiMo V2.5, MiMo-V2.5, mimo-v2-5

vision, audio in, function calling, reasoning, web search

MiMo V2.5 specs

Model ID: mimo-v2-5
Provider: Xiaomi
Category: Text Generation
Released: Apr 22, 2026
Context window: 1M tokens
Max output: 128,000 tokens
Input: Text, Image, Video, Audio
Output: Text
Structured output: JSON Mode
Endpoints: POST /v1/chat/completions, POST /v1/responses, POST /v1/messages, POST /v1beta/models/mimo-v2-5:generateContent
Alternate model IDs: mimo-v2.5, mimo/v2.5, xiaomi/mimo-v2.5

MiMo V2.5 API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type	Spec	Rate
Input	per 1M prompt tokens	$0.70
Output	per 1M generated tokens	$1.40
Implicit cache read	per 1M cached input tokens	$0.014
Web search	per request when enabled	$0.015
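
To make the rate card concrete, here is a minimal sketch of a per-request cost estimate. The function name `estimate_cost_usd` and its arguments are hypothetical. It assumes that cached tokens are a subset of the prompt tokens and are billed at the cache-read rate instead of the input rate, and that web search is billed once per search call. Neither billing detail is spelled out on this page, so treat the result as an approximation and rely on the `cost_usd` value the API returns for actual charges.

```python
# Rates copied from the pricing table above (USD).
INPUT_PER_M = 0.70
OUTPUT_PER_M = 1.40
CACHE_READ_PER_M = 0.014
WEB_SEARCH_PER_CALL = 0.015


def estimate_cost_usd(prompt_tokens, completion_tokens, cached_tokens=0, web_search_calls=0):
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of prompt_tokens served from the
    implicit cache and is billed at the cache-read rate, not the input rate.
    Assumption: each web search call is billed at the flat web search rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * INPUT_PER_M / 1_000_000
        + cached_tokens * CACHE_READ_PER_M / 1_000_000
        + completion_tokens * OUTPUT_PER_M / 1_000_000
        + web_search_calls * WEB_SEARCH_PER_CALL
    )


if __name__ == "__main__":
    # 200k-token prompt (150k of it cached), 2k-token answer, one web search.
    cost = estimate_cost_usd(200_000, 2_000, cached_tokens=150_000, web_search_calls=1)
    print(f"${cost:.4f}")
```

Under these assumptions the example request comes to roughly $0.0549, most of which is the uncached input and the single web search.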

Compare on the full pricing page

How to call the MiMo V2.5 API

MiMo V2.5 serves the OpenAI-compatible Chat Completions API. Point any OpenAI SDK at https://api.empiriolabs.ai/v1 with your EmpirioLabs API key and use the model id mimo-v2-5. Get an API key from the EmpirioLabs dashboard.

cURL

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mimo-v2-5",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ]
  }'

Python (OpenAI SDK)

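import os

from openai import OpenAI

# Point the OpenAI SDK at the EmpirioLabs endpoint.
client = OpenAI(
    base_url="https://api.empiriolabs.ai/v1",
    # Read the key from the same environment variable the cURL example uses.
    api_key=os.environ["EMPIRIOLABS_API_KEY"],
)

response = client.chat.completions.create(
    model="mimo-v2-5",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
)
# The generated text is on the first choice.
print(response.choices[0].message.content)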

Full MiMo V2.5 API reference

MiMo V2.5 API parameters

Request parameters supported by the MiMo V2.5 API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter	Type	Default	Range / values	Description
enable_thinking	boolean	true	-	Enable extended thinking mode. Slower but improves reasoning-heavy tasks.
tool_web_search	boolean	false	-	Allow the model to perform web searches when needed.
web_search_force	boolean	false	-	Force the model to always run a web search before answering.
web_search_max_keyword	number	3	1 to 5	Max number of keywords the model can use across web searches.
web_search_limit	number	5	1 to 10	Max number of web searches the model can perform per request.
video_fps	number	2	0.1 to 10	Frames-per-second sampled from input video for analysis.
video_resolution	enum	default	default, max	Resolution at which input video is sampled (e.g. 360p, 480p, 720p).
temperature	number	0.7	0 to 2	Sampling temperature. 0 = deterministic, 2 = maximum randomness.
top_p	number	0.9	0 to 1	Nucleus sampling probability mass. Lower = more focused.
max_tokens	number	4096	1 to 65536	Maximum tokens in the response.
stop	string	-	-	Up to 4 strings where the model will stop generating further tokens.
response_format	enum	-	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
disable_formatting	boolean	false	-	Skip the EmpirioLabs Markdown formatting (citation [[N]](url) rewriting + References block when web search was used). The raw upstream answer with plain [N]...
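
The ranges in the table above can be checked on the client before a request is sent. The sketch below builds a Chat Completions request body and rejects out-of-range values. The helper `build_payload` is hypothetical. It also assumes that the provider-specific fields (for example `tool_web_search`) are sent as top-level keys of the JSON body, which this page does not state explicitly.

```python
# Ranges copied from the parameter table above: name -> (min, max).
NUMERIC_RANGES = {
    "web_search_max_keyword": (1, 5),
    "web_search_limit": (1, 10),
    "video_fps": (0.1, 10),
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
}
BOOLEAN_PARAMS = {"enable_thinking", "tool_web_search", "web_search_force", "disable_formatting"}
ENUM_PARAMS = {"video_resolution": {"default", "max"}}
# Passed through unchecked; the table gives no range for these.
PASSTHROUGH_PARAMS = {"stop", "response_format"}


def build_payload(prompt, **params):
    """Return a request body dict after checking params against the table.

    Assumption: provider-specific fields sit at the top level of the body.
    """
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{name} must be a number")
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low} to {high}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise TypeError(f"{name} must be a boolean")
        elif name in ENUM_PARAMS:
            if value not in ENUM_PARAMS[name]:
                raise ValueError(f"{name} must be one of {sorted(ENUM_PARAMS[name])}")
        elif name not in PASSTHROUGH_PARAMS:
            raise KeyError(f"unknown parameter: {name}")
    return {
        "model": "mimo-v2-5",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }


if __name__ == "__main__":
    payload = build_payload(
        "Summarize today's AI news.",
        tool_web_search=True,
        web_search_limit=3,
        temperature=0.2,
    )
    print(payload)
```

The resulting dict can be posted as the JSON body of a `/v1/chat/completions` request. With the OpenAI Python SDK, non-standard fields are usually passed through the `extra_body` argument of `client.chat.completions.create`; whether EmpirioLabs expects them there is likewise an assumption.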

Good to know

Omnimodal input (text, image, video, audio) with text output. Web search ($0.015/call) is charged only when invoked. Cached input tokens are billed at a steep discount.

Per-tool billing (usage.tool_usage)

When this model invokes tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape — exact field names, units, and which tools appear can vary slightly per provider:

"usage": {
  "prompt_tokens": 123,
  "completion_tokens": 456,
  "cost_usd": 0.0042,
  "tool_usage": {"web_search": 3, "code_interpreter": 1}
}

The tool counts are already factored into cost_usd — they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
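
As a rough illustration of auditing this field, the sketch below reads the `usage` object from a parsed response body and tolerates a missing `tool_usage` key. The helper `summarize_usage` is hypothetical, and the field names are taken from the example above, which may vary slightly per provider.

```python
def summarize_usage(response_body):
    """Extract token totals, cost, and per-tool counts from a parsed response.

    Field names follow the example shape above and are not guaranteed.
    tool_usage is omitted when no tools ran, so it defaults to an empty map.
    """
    usage = response_body.get("usage") or {}
    tool_usage = usage.get("tool_usage") or {}
    return {
        "total_tokens": usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0),
        "cost_usd": usage.get("cost_usd"),
        "tool_calls": dict(tool_usage),
    }


if __name__ == "__main__":
    sample = {
        "usage": {
            "prompt_tokens": 123,
            "completion_tokens": 456,
            "cost_usd": 0.0042,
            "tool_usage": {"web_search": 3, "code_interpreter": 1},
        }
    }
    print(summarize_usage(sample))
    # A response where no tools were invoked has no tool_usage key.
    print(summarize_usage({"usage": {"prompt_tokens": 10, "completion_tokens": 5}}))
```

Because the tool counts are already included in `cost_usd`, the summary reports them separately and does not add anything to the cost.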

MiMo V2.5 API: common questions

How much does the MiMo V2.5 API cost?

On EmpirioLabs, MiMo V2.5 is billed pay as you go: Input $0.70 per 1M prompt tokens; Output $1.40 per 1M generated tokens; Implicit cache read $0.014 per 1M cached input tokens. The live rate card on this page always matches what the API charges.

What is the context window of MiMo V2.5?

MiMo V2.5 supports a 1M-token context window with up to 128,000 output tokens per response.

Is the MiMo V2.5 API OpenAI-compatible?

Yes. MiMo V2.5 serves the OpenAI-compatible Chat Completions API, so existing OpenAI SDKs work by pointing base_url at https://api.empiriolabs.ai/v1 and setting the model id to mimo-v2-5.

Can I try MiMo V2.5 in the browser before integrating?

Yes. The EmpirioLabs playground runs MiMo V2.5 in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a MiMo V2.5 API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.


More Text Generation model APIs

GLM 5.2

Kimi K3

Kimi K2.7 Code

Muse Spark 1.1

Fugu Ultra v1.1

Qwen3.7 Plus
