Pricing

Pay only for what you use. No subscription lock-ins.

Pricing catalog

Pay-as-you-go AI model pricing.

EmpirioLabs pricing varies by model and unit: tokens, images, seconds of audio or video, messages, 3D assets, and search requests. The interactive pricing table loads the current catalog, while these representative rates remain readable without client JavaScript.


GLM 5.2

Z.ai | Ctx 1M | Text Generation
Proprietary Endpoint

Reasoning and coding model with a 1M token context, 128K output, adjustable reasoning effort, native web search, and tool calling.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.40
Output
per 1M generated tokens
$4.40
Web Search
per request
$0.033
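
As a rough illustration of how the per-unit rates above combine into a request cost, the sketch below multiplies token counts by the per-1M rates and adds the per-request web search fee. The function name `estimate_cost` and the sample token counts are hypothetical, and actual billing may meter or round differently.

```python
# Rates copied from the GLM 5.2 table above (USD).
INPUT_PER_1M = 1.40             # per 1M prompt tokens
OUTPUT_PER_1M = 4.40            # per 1M generated tokens
WEB_SEARCH_PER_REQUEST = 0.033  # per web search request


def estimate_cost(prompt_tokens: int, generated_tokens: int, web_searches: int = 0) -> float:
    """Estimate the USD cost of one request (illustrative helper, not an official API)."""
    token_cost = (
        prompt_tokens / 1_000_000 * INPUT_PER_1M
        + generated_tokens / 1_000_000 * OUTPUT_PER_1M
    )
    return token_cost + web_searches * WEB_SEARCH_PER_REQUEST


# Hypothetical request: 200K prompt tokens, 10K generated tokens, 2 web searches.
cost = estimate_cost(200_000, 10_000, web_searches=2)
print(f"${cost:.4f}")  # prints "$0.3900"
```

The same pattern applies to any model in the catalog billed per 1M tokens; only the three constants change.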

Kimi K2.7 Code

Moonshot AI | Ctx 256K | Text Generation
Proprietary Endpoint

Kimi K2.7 Code is Moonshot's trillion-parameter agentic coding model with 256K context, always-on reasoning, and text, image, and video inputs.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.95
Output
per 1M generated tokens
$4.00
Web search
per call when invoked
$0.015

Qwen3.7 Plus

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.40; 256K-1M: $1.20
Output
per 1M generated tokens
<=256K: $1.60; 256K-1M: $4.80
Web Search
per call
$0.03
Image Search
per call
$0.03
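
The tiered rates above can be turned into a small lookup. This is a minimal sketch under two assumptions that the source does not state: that the tier is chosen by prompt length and applied to the whole request, and that "K" means 1,000 tokens. The names `TIERS`, `tier_for`, and `estimate_cost` are hypothetical.

```python
# Tiered rates from the Qwen3.7 Plus (Singapore) table above, USD per 1M tokens.
# Each entry: (max prompt tokens for the tier, input rate, output rate).
TIERS = [
    (256_000, 0.40, 1.60),    # <=256K
    (1_000_000, 1.20, 4.80),  # 256K-1M
]


def tier_for(prompt_tokens: int) -> tuple[float, float]:
    """Return (input_rate, output_rate) for the tier covering prompt_tokens."""
    for max_tokens, input_rate, output_rate in TIERS:
        if prompt_tokens <= max_tokens:
            return input_rate, output_rate
    raise ValueError("prompt exceeds the 1M-token context window")


def estimate_cost(prompt_tokens: int, generated_tokens: int) -> float:
    # ASSUMPTION: the whole request is billed at the rate of the tier that
    # the prompt length falls into; the catalog does not spell this out.
    input_rate, output_rate = tier_for(prompt_tokens)
    return (prompt_tokens * input_rate + generated_tokens * output_rate) / 1_000_000


print(f"{estimate_cost(100_000, 1_000):.4f}")  # short prompt, lower tier
print(f"{estimate_cost(300_000, 1_000):.4f}")  # long prompt, higher tier
```

If billing turns out to be marginal (only the tokens beyond 256K charged at the higher rate), `estimate_cost` would need to split the prompt across tiers instead.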

Qwen3.7 Plus

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 31%

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.276 (list $0.40); 256K-1M: $0.826 (list $1.20)
Output
per 1M generated tokens
<=256K: $1.101 (list $1.60); 256K-1M: $3.301 (list $4.80)
Implicit cache input
per 1M cached prompt tokens
<=256K: $0.056 (list $0.08); 256K-1M: $0.166 (list $0.24)
Web Search
per call
$0.01
Image Search
per call
$0.01
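
The "Save up to 31%" label can be checked against the rates above. The sketch below reads each pair of figures as an undiscounted rate followed by a discounted rate, which is an interpretation rather than something the source states; `saving_percent` and `RATES` are hypothetical names.

```python
def saving_percent(list_price: float, discounted_price: float) -> float:
    """Percentage saved when paying discounted_price instead of list_price."""
    return (1 - discounted_price / list_price) * 100


# (label, undiscounted rate, discounted rate) in USD per 1M tokens, from the table above.
# ASSUMPTION: the first figure of each pair is the undiscounted rate.
RATES = [
    ("input <=256K", 0.40, 0.276),
    ("input 256K-1M", 1.20, 0.826),
    ("output <=256K", 1.60, 1.101),
    ("output 256K-1M", 4.80, 3.301),
    ("cache <=256K", 0.08, 0.056),
    ("cache 256K-1M", 0.24, 0.166),
]

for label, list_price, discounted in RATES:
    print(f"{label}: {saving_percent(list_price, discounted):.1f}%")

best = max(saving_percent(lp, dp) for _, lp, dp in RATES)
print(f"largest saving: {best:.0f}%")  # prints "largest saving: 31%"
```

The individual savings range from 30.0% to about 31.2%, which matches an "up to 31%" label under this reading.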

MiniMax M3

MiniMax | Singapore | Ctx 524K | Text Generation
Proprietary Endpoint
Save up to 25%

MiniMax M3 is a multimodal reasoning model for coding, agents, and long-context analysis with text, image, and video input.

Type
Spec
Rate
Input
per 1M prompt tokens
<=512K: $0.225 (list $0.30); >512K: $1.20
Output
per 1M generated tokens
<=512K: $0.90 (list $1.20); >512K: $4.80
Implicit cache read
per 1M cached input tokens
<=512K: $0.045 (list $0.06); >512K: $0.24
Linkup web search
per successful search when enabled
$0.013

Qwen3.7 Max

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.50
Output
per 1M generated tokens
$7.50
Web search
per call when invoked
$0.02
Web extractor
per call when invoked
$0.02
Code interpreter
per call when invoked
$0.02

Qwen3.7 Max

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 34%

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.65 (list $2.50)
Output
per 1M generated tokens
$4.951 (list $7.50)
Web search
per call when invoked
$0.01
Web extractor
per call when invoked
$0.01
Code interpreter
per call when invoked
$0.01

MiniMax M2.7 Highspeed

MiniMax | Singapore | Ctx 200K | Text Generation
Proprietary Endpoint
Save up to 50%

High-speed M2.7 variant tuned for fast inference, with strong general-purpose performance and agentic capabilities.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.30 (list $0.60)
Output
per 1M generated tokens
$1.20 (list $2.40)
Implicit cache read
per 1M cached input tokens
$0.03 (list $0.06)
Web Search (Linkup)
per call when invoked
$0.013

TTS 1.5 Mini

Inworld | Audio Generation
Proprietary Endpoint
Save up to 30%

Sub-130ms TTFB voice synthesis with 271+ voices across 15 languages, expressive prosody, and real-time SSE streaming for low-latency voice agents.

Type
Spec
Rate
Synthesis
per 1M characters
$17.50 (list $25.00)
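
Character-based synthesis pricing can be estimated directly from the text length. The sketch below takes the lower of the two figures above as the effective rate and counts characters with Python's `len`, both of which are assumptions; the provider may count characters differently, and `synthesis_cost` is a hypothetical helper.

```python
# ASSUMPTION: $17.50 per 1M characters is the effective (discounted) rate.
RATE_PER_1M_CHARS = 17.50


def synthesis_cost(text: str) -> float:
    """Estimate the USD cost of synthesizing `text` (illustrative only)."""
    # ASSUMPTION: every code point, including whitespace, counts as one character.
    return len(text) / 1_000_000 * RATE_PER_1M_CHARS


script = "a" * 2_000  # stand-in for a 2,000-character script
print(f"${synthesis_cost(script):.4f}")  # prints "$0.0350"
```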

TTS 1.5 Max

Inworld | Audio Generation
Proprietary Endpoint
Save up to 15%

Broadcast-quality voice synthesis with rich expressive prosody, 271+ voices across 15 languages, and real-time SSE streaming with per-word timestamps.

Type
Spec
Rate
Synthesis
per 1M characters
$29.75 (list $35.00)

GLM 5.1

Z.ai | China | Ctx 202K | Text Generation
Proprietary Endpoint
Save up to 41%

Long-context Zhipu AI reasoning model with 202K context, 128K output, tool calling, structured output, and cache support.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K: $0.825 (list $1.40); 32K-200K: $1.10 (list $1.40)
Output
per 1M generated tokens
<=32K: $3.301 (list $4.40); 32K-200K: $3.851 (list $4.40)
Implicit cache read
per 1M cached input tokens
<=32K: $0.165 (list $0.26); 32K-200K: $0.22 (list $0.26)
Web Search (Linkup)
per call when invoked
$0.013

Kimi K2.6

Moonshot AI | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 7%

Kimi K2.6 is a Moonshot multimodal reasoning model with 256K context, strong coding, and text, image, and video inputs.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.8939 (list $0.95)
Output
per 1M generated tokens
$3.7131 (list $4.00)
Implicit cache read
per 1M cached input tokens
$0.1788
Web Search (Linkup)
per call when invoked
$0.013

MiniMax M2.7

MiniMax | Singapore | Ctx 200K | Text Generation
Proprietary Endpoint
Save up to 50%

MiniMax M2.7 is a general-purpose reasoning chat model with interleaved thinking, function calling, and prompt caching.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.15 (list $0.30)
Output
per 1M generated tokens
$0.60 (list $1.20)
Implicit cache read
per 1M cached input tokens
$0.03 (list $0.06)
Web Search (Linkup)
per call when invoked
$0.013

Qwen3.5 122B-A10B

Alibaba Cloud | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 122B-A10B is a multimodal reasoning model with 256K context, efficient sparse MoE inference, and text, image, and video input.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.115 (list $0.40); 128K-256K: $0.287 (list $0.40)
Output
per 1M generated tokens
<=128K: $0.917 (list $3.20); 128K-256K: $2.294 (list $3.20)
Web search
per request when enabled
$0.01

Qwen3.5 397B-A17B

Alibaba Cloud | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 397B-A17B is a flagship multimodal reasoning model for language, code, agents, GUI tasks, and image and video understanding.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.172 (list $0.60); 128K-256K: $0.43 (list $0.60)
Output
per 1M generated tokens
<=128K: $1.032 (list $3.60); 128K-256K: $2.58 (list $3.60)
Web search
per request when enabled
$0.01

Qwen3.5 35B-A3B

Alibaba Cloud | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 77%

Qwen3.5 35B-A3B is an efficient native vision-language model with sparse MoE routing, deep thinking, and text, image, and video input.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.057 (list $0.25); 128K-256K: $0.229 (list $0.25)
Output
per 1M generated tokens
<=128K: $0.459 (list $2.00); 128K-256K: $1.835 (list $2.00)
Web search
per request when enabled
$0.01

Qwen3.5 27B

Alibaba Cloud | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 71%

Qwen3.5 27B is a dense multimodal reasoning model with fast responses, 256K context, and text, image, and video understanding.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.086 (list $0.30); 128K-256K: $0.258 (list $0.30)
Output
per 1M generated tokens
<=128K: $0.688 (list $2.40); 128K-256K: $2.064 (list $2.40)
Web search
per request when enabled
$0.01

Qwen3.6 27B

Alibaba Cloud | China | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 31%

Qwen3.6 27B improves agentic coding, STEM reasoning, spatial vision, OCR, and text, image, and video understanding on 256K context.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.412564 (list $0.60)
Output
per 1M generated tokens
$2.475384 (list $3.60)
Web search
per request when enabled
$0.01

Qwen3.6 Flash

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Fast Qwen3.6 vision-language model for agentic coding, math reasoning, spatial understanding, OCR, and text, image, and video input.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.25; 256K-1M: $1.00
Output
per 1M generated tokens
<=256K: $1.50; 256K-1M: $4.00
Web search
per query when enabled
$0.02

Qwen3.6 Flash

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 34%

Fast Qwen3.6 vision-language model for agentic coding, math reasoning, spatial understanding, OCR, and text, image, and video input.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.165 (list $0.25); 256K-1M: $0.66 (list $1.00)
Output
per 1M generated tokens
<=256K: $0.99 (list $1.50); 256K-1M: $3.961 (list $4.00)
Web search
per query when enabled
$0.01

GLM 4.7 Flash

Z.ai | Singapore | Ctx 200K | Text Generation
Proprietary Endpoint

Free lightweight GLM-4.7 text model for coding, reasoning, long-context writing, and general chat.

Type
Spec
Rate
Input
per 1M prompt tokens
Free
Output
per 1M generated tokens
Free
Implicit cache read
per 1M cached input tokens
Free
Web Search
per request when enabled
$0.033

GLM 4.5 Flash

Z.ai | Singapore | Ctx 200K | Text Generation
Proprietary Endpoint

Free lightweight GLM-4.5 text model for reasoning, coding, long-form chat, and general language tasks.

Type
Spec
Rate
Input
per 1M prompt tokens
Free
Output
per 1M generated tokens
Free
Implicit cache read
per 1M cached input tokens
Free
Web Search
per request when enabled
$0.033

GLM 4.6V Flash

Z.ai | Singapore | Ctx 128K | Text Generation
Proprietary Endpoint

Free multimodal GLM-4.6V model for image, video, file, and text understanding with native function calling.

Type
Spec
Rate
Input
per 1M prompt tokens
Free
Output
per 1M generated tokens
Free
Implicit cache read
per 1M cached input tokens
Free
Web Search
per request when enabled
$0.033

Amazon Nova Canvas

Amazon | Image Generation
Proprietary Endpoint

Image generation and editing model creating and modifying images from text or image inputs, with inpainting, virtual try-on, and style controls.

Type
Spec
Rate
Small Standard (≤1024×1024)
per image
$0.12
Small Premium (≤1024×1024)
per image
$0.18
Large Standard (≤2048×2048)
per image
$0.18
Large Premium (≤2048×2048)
per image
$0.24

Amazon Nova Reel 1.1

Amazon | Video Generation
Proprietary Endpoint

Video generation model producing up to 2-minute multi-shot videos from text and optional image prompts with improved quality and consistency.

Type
Spec
Rate
Per Second
per second
$0.14

Deepgram Nova 3

Deepgram | Transcription
Proprietary Endpoint

Speech-to-text transcription using the Nova-3 model with multi-language support and advanced customizable settings for production workloads.

Type
Spec
Rate
Transcription
per minute of audio
$0.014

DeepSeek Prover V2

DeepSeek | Text Generation
Proprietary Endpoint

Open-source LLM specialized in formal theorem proving in Lean 4, built on a recursive theorem-proving pipeline.

Type
Spec
Rate
Per Message
fixed
$0.020

DeepSeek V3.2

DeepSeek | Singapore | Ctx 128K | Text Generation
Proprietary Endpoint

Open-source Mixture-of-Experts LLM tuned for high-efficiency reasoning, coding, and general language tasks across long-form prompts.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.57
Output
per 1M generated tokens
$1.71
Web Search
per call
$0.015

DeepSeek V4 Flash

DeepSeek | Germany (Frankfurt) | Ctx 1M | Text Generation
Proprietary Endpoint

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.14
Output
per 1M generated tokens
$0.28
Web Search (Linkup)
per call when invoked
$0.013

DeepSeek V4 Flash

DeepSeek | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.20
Output
per 1M generated tokens
$0.40
Web search
per request when enabled
$0.02

DeepSeek V4 Flash

DeepSeek | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 2%

Lightweight MoE model with 284B total / 13B active parameters and native 1M context, tuned for low-latency, cost-effective high-concurrency use.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.138 (list $0.14)
Output
per 1M generated tokens
$0.275 (list $0.28)
Implicit cache read
per 1M cached input tokens
$0.028
Web search
per request when enabled
$0.01

DeepSeek V4 Pro

DeepSeek | Germany (Frankfurt) | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 5%

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.65 (list $1.74)
Output
per 1M generated tokens
$3.30 (list $3.48)
Web Search (Linkup)
per call when invoked
$0.013

DeepSeek V4 Pro

DeepSeek | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.40
Output
per 1M generated tokens
$4.80
Web search
per request when enabled
$0.02

DeepSeek V4 Pro

DeepSeek | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 5%

Flagship MoE LLM with 1.6T total / 49B active parameters and native 1M context for advanced math, logical inference, and specialized coding.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.65 (list $1.74)
Output
per 1M generated tokens
$3.301 (list $3.48)
Implicit cache read
per 1M cached input tokens
$0.138
Web search
per request when enabled
$0.01

Exa Answer

Exa | Research & Search
Proprietary Endpoint

Quick LLM-style answer to a natural-language question, grounded in fresh Exa web search results with inline citations and source links.

Type
Spec
Rate
Answer
per request
$0.01

Exa Research

Exa | Research & Search
Proprietary Endpoint

Asynchronous research task that explores the web, gathers sources, synthesizes findings, and returns cited answers for in-depth queries.

Type
Spec
Rate
exa-research-fast (search)
per search
$0.013
exa-research (search)
per search
$0.013
exa-research-pro (search)
per search
$0.013
Page Read (standard)
per page
$0.013
Page Read (pro)
per page
$0.026
Reasoning Tokens
per 1k tokens
$0.013
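
Because a research task mixes several billing units, a worked total may help. The sketch below sums the rates above for one hypothetical task; note that reasoning tokens are priced per 1k here, not per 1M as in most of the catalog. The function name and the sample counts are assumptions.

```python
# Rates copied from the Exa Research table above (USD).
SEARCH = 0.013                   # per search (identical for all three listed variants)
PAGE_READ_STANDARD = 0.013       # per page
PAGE_READ_PRO = 0.026            # per page
REASONING_PER_1K_TOKENS = 0.013  # per 1k reasoning tokens


def research_task_cost(searches: int, pages: int, reasoning_tokens: int,
                       pro_pages: bool = False) -> float:
    """Estimate the USD cost of one research task (illustrative helper)."""
    page_rate = PAGE_READ_PRO if pro_pages else PAGE_READ_STANDARD
    return (
        searches * SEARCH
        + pages * page_rate
        + reasoning_tokens / 1_000 * REASONING_PER_1K_TOKENS
    )


# Hypothetical task: 5 searches, 10 standard page reads, 20K reasoning tokens.
print(f"${research_task_cost(5, 10, 20_000):.3f}")  # prints "$0.455"
```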

Gemini 2.5 Flash TTS

Google | Audio Generation
Proprietary Endpoint

Low-latency text-to-speech with single- and multi-speaker voices and controllable style, accent, and expressive tone for production apps.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.50
Output
per 1M generated tokens
$30.00

Gemini 2.5 Pro TTS

Google | Audio Generation
Proprietary Endpoint

High-quality TTS preview for podcasts, audiobooks, and customer support, with expressive multi-speaker voices across 23+ languages.

Type
Spec
Rate
Input
per 1M prompt tokens
$3.00
Output
per 1M generated tokens
$60.00

Gemini 3.1 Flash TTS

Google | Audio Generation
Proprietary Endpoint

Highly controllable TTS with new Audio Tags for precise style, tone, pace, and delivery across narration, assistants, and voice apps.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.60
Output
per 1M generated tokens
$52.00

Gemma 3 27B

Google | Ctx 128K | Text Generation
Proprietary Endpoint

Open-source vision-language model with 128K context, 140+ languages, improved math/reasoning, structured outputs, and function calling.

Type
Spec
Rate
Per Message
fixed
$0.0040
Web Search (Linkup)
per call when invoked
$0.013

GPTZero

GPTZero | Tools & Agents
Proprietary Endpoint

Deep-learning detector that flags portions of text likely generated by AI versus human, classifying content as entirely human, AI, or mixed.

Type
Spec
Rate
Text Scan
per 1,000 words
$0.39

HappyHorse 1.0

Alibaba Cloud | Singapore | Video Generation
Proprietary Endpoint

Video model offering Text-to-Video, Image-to-Video, Reference-to-Video, and Video Edit modes with high-fidelity, motion-smooth output.

Type
Spec
Rate
All Modes 720P
per second
$0.14
All Modes 1080P
per second
$0.24

Hunyuan Image 3

Tencent | Image Generation
Proprietary Endpoint

Open-source text-to-image model on a multimodal Mixture-of-Experts architecture with photorealistic detail and strong multilingual text rendering.

Type
Spec
Rate
Standard
per image
$0.13

Janus-Pro DeepSeek

DeepSeek | Image Generation
Proprietary Endpoint

Autoregressive framework on the Janus Pro 7B model that unifies multimodal understanding and image generation in one architecture.

Type
Spec
Rate
Image Generation
per image
$0.030
Image Analysis
per uploaded image
$0.030

Kling O3

Kling AI | Video Generation
Proprietary Endpoint

Video model in Standard or Pro modes with Text-to-Video, Image-to-Video, Reference-to-Video, editing, native sound, and multi-scene transitions.

Type
Spec
Rate
Standard T2V/I2V
per second
$0.168
Standard T2V/I2V Sound
per second
$0.224
Standard Video Input
per second
$0.252
Pro T2V/I2V
per second
$0.224
Pro T2V/I2V Sound
per second
$0.280
Pro Video Input
per second
$0.336
4K T2V/I2V/Ref
per second
$0.525

Kling v3 Motion Control

Kling AI | Video Generation
Proprietary Endpoint

Kling 3.0 model that transfers motion from a reference video onto a character from a reference image, with Standard 720p and Pro 1080p tiers.

Type
Spec
Rate
Standard (720p)
per second
$0.14
Pro (1080p)
per second
$0.18

Linkup Standard

Linkup | Ctx 100K | Research & Search
Proprietary Endpoint

AI-powered web search with detailed overviews and answers, faster than Deep Search. Ranks #1 on the OpenAI SimpleQA benchmark.

Type
Spec
Rate
Per Message
fixed
$0.013

Magistral Medium 2509 Thinking

Mistral AI | Ctx 40K | Text Generation
Proprietary Endpoint

Reasoning model tuned for tasks needing longer thought and higher accuracy: legal research, financial forecasting, software, and storytelling.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.60
Output
per 1M generated tokens
$6.50
Web Search (Linkup)
per call when invoked
$0.013

Mistral Medium 3

Mistral AI | Ctx 130K | Text Generation
Proprietary Endpoint

Cost-efficient language model offering strong reasoning and multimodal performance for general production workloads at competitive latency.

Type
Spec
Rate
Per Message
fixed
$0.015
Web Search (Linkup)
per call when invoked
$0.013

Mistral Medium 3.1

Mistral AI | Ctx 131K | Text Generation
Proprietary Endpoint

Enterprise-grade model with strong reasoning, coding, and STEM performance, supporting hybrid, on-prem, and in-VPC deployments.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.52
Output
per 1M generated tokens
$2.60
Web Search (Linkup)
per call when invoked
$0.013

Mistral Small 3.1

Mistral AI | Ctx 128K | Text Generation
Proprietary Endpoint

24B-parameter multimodal model with 128K context for image analysis, programming, math, and multilingual tasks, tuned for efficient local inference.

Type
Spec
Rate
Per Message
fixed
$0.0019
Web Search (Linkup)
per call when invoked
$0.013

Mistral Small 4

Mistral AI | Ctx 256K | Text Generation
Proprietary Endpoint

Hybrid model unifying Instruct, Reasoning (Magistral), and Devstral families: 40% lower completion time and 3x throughput vs Small 3.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.15
Output
per 1M generated tokens
$0.60
Standard Web Search
per call
$0.084
Premium Web Search
per call
$0.140
Code Interpreter
per call
$0.084
Image Generation
per image
$0.280

Nova Lite 1.0

Amazon | Ctx 300K | Text Generation
Proprietary Endpoint

Low-cost multimodal foundation model for text, images, and video on a 300K context (up to ~30 min video), tuned for speed and affordability.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.069
Output
per 1M generated tokens
$0.28
Cached input
per 1M tokens
$0.0386
Web Search (Linkup)
per call when invoked
$0.013

Nova Lite 2

Amazon | Ctx 1M | Text Generation
Proprietary Endpoint

Fast, cost-effective multimodal reasoning model for text, images, documents, and video on a 1M context (long docs and ~90 min clips).

Type
Spec
Rate
Input
per 1M prompt tokens
$0.38
Output
per 1M generated tokens
$3.16
Cached input
per 1M tokens
$0.2128
Web Search (Linkup)
per call when invoked
$0.013

Nova Micro 1.0

Amazon | Ctx 128K | Text Generation
Proprietary Endpoint

Text-only foundation model tuned for ultra-low latency and cost on 128K context. Strong for summarization, translation, and chat, with a 44% cache discount.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.040
Output
per 1M generated tokens
$0.16
Cached input
per 1M tokens
$0.0224
Web Search (Linkup)
per call when invoked
$0.013
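
The 44% cache discount mentioned in the description follows from the input and cached-input rates above, and the same rates give a blended prompt cost when part of the prompt is served from cache. This is a sketch assuming cached tokens are billed at the cached rate and the remainder at the normal input rate; `estimate_cost` is a hypothetical helper.

```python
# Rates copied from the Nova Micro 1.0 table above (USD per 1M tokens).
INPUT_PER_1M = 0.040
CACHED_INPUT_PER_1M = 0.0224
OUTPUT_PER_1M = 0.16

# Discount of the cached rate relative to the normal input rate.
cache_discount = 1 - CACHED_INPUT_PER_1M / INPUT_PER_1M
print(f"cache discount: {cache_discount:.0%}")  # prints "cache discount: 44%"


def estimate_cost(prompt_tokens: int, cached_tokens: int, generated_tokens: int) -> float:
    """Estimate USD cost when `cached_tokens` of the prompt hit the cache."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    # ASSUMPTION: cached and uncached prompt tokens are billed separately.
    return (
        uncached_tokens * INPUT_PER_1M
        + cached_tokens * CACHED_INPUT_PER_1M
        + generated_tokens * OUTPUT_PER_1M
    ) / 1_000_000


# Hypothetical: 1M prompt tokens, half of them cached, no output.
print(f"${estimate_cost(1_000_000, 500_000, 0):.4f}")  # prints "$0.0312"
```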

Nova Premier 1.0

Amazon | Ctx 1M | Text Generation
Proprietary Endpoint

Most capable model in the family. Multimodal text/image/video on a 1M context with chain-of-thought reasoning across tools and data sources.

Type
Spec
Rate
Input
per 1M prompt tokens
$3.00
Output
per 1M generated tokens
$15.00
Cached input
per 1M tokens
$1.68
Web Search (Linkup)
per call when invoked
$0.013

Nova Pro 1.0

Amazon | Ctx 300K | Text Generation
Proprietary Endpoint

Multimodal foundation model balancing accuracy, speed, and cost for text, images, and video on 300K context (up to ~30 min video).

Type
Spec
Rate
Input
per 1M prompt tokens
$2.40
Output
per 1M generated tokens
$9.60
Latency Optimized Input
per 1M prompt tokens
$3.00
Latency Optimized Output
per 1M generated tokens
$12.00
Web Search (Linkup)
per call when invoked
$0.013

OpenAI Whisper 1

OpenAI | Transcription
Proprietary Endpoint

Whisper-1 speech-to-text transcription trained on multilingual supervised audio, with a 25 MB upload limit per file.

Type
Spec
Rate
Per Minute of Audio
per minute
$0.030

Perplexity Advanced Deep Research

Perplexity | Research & Search
Proprietary Endpoint

Institutional-grade research powered by Claude Opus 4.6 reasoning, with maximum depth, enhanced tool access, and extensive source coverage.

Type
Spec
Rate
Input
per 1M prompt tokens
$12.00
Output
per 1M generated tokens
$60.00
Web Search Call
per call
$0.012
URL Fetch Call
per call
$0.0012

Perplexity Deep Research

Perplexity | Ctx 128K | Research & Search
Proprietary Endpoint

Research model for multi-step retrieval, synthesis, and reasoning, autonomously searching, reading, and evaluating sources across complex topics.

Type
Spec
Rate
Input
per 1M prompt tokens
$4.80
Output
per 1M generated tokens
$19.00
Citation Tokens
per 1M tokens
$4.80
Reasoning Tokens
per 1M tokens
$7.20
Search Queries
per query
$0.012

Perplexity Sonar

Perplexity | Ctx 127K | Research & Search
Proprietary Endpoint

Real-time web-connected search with accurate citations and customizable sources for up-to-date AI search integration in production apps.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.40
Output
per 1M generated tokens
$2.40
Base Fee (Low Context)
per request
$0.012
Base Fee (Medium Context)
per request
$0.019
Base Fee (High Context)
per request
$0.029
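
Search-grounded models here combine token charges with a flat per-request base fee. The sketch below adds the two for Perplexity Sonar, assuming exactly one base fee per request chosen by a low, medium, or high context setting; how that setting is selected is not described in the source, and `request_cost` is a hypothetical helper.

```python
# Rates copied from the Perplexity Sonar table above (USD).
INPUT_PER_1M = 2.40
OUTPUT_PER_1M = 2.40
BASE_FEE = {"low": 0.012, "medium": 0.019, "high": 0.029}  # per request


def request_cost(prompt_tokens: int, generated_tokens: int, context: str = "low") -> float:
    """Estimate the USD cost of one request: token charges plus the base fee."""
    if context not in BASE_FEE:
        raise ValueError(f"unknown context level: {context!r}")
    token_cost = (
        prompt_tokens * INPUT_PER_1M + generated_tokens * OUTPUT_PER_1M
    ) / 1_000_000
    return token_cost + BASE_FEE[context]


# Hypothetical request: 1,000 prompt tokens, 500 generated tokens, medium context.
print(f"${request_cost(1_000, 500, 'medium'):.4f}")  # prints "$0.0226"
```

For short requests like this one the base fee dominates the total, so the context setting matters more than the token count.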

Perplexity Sonar Pro

Perplexity | Ctx 200K | Research & Search
Proprietary Endpoint

Search-grounded model with double the citations and a larger context window, tuned for complex queries needing in-depth, nuanced answers.

Type
Spec
Rate
Input
per 1M prompt tokens
$7.20
Output
per 1M generated tokens
$36.00
Base Fee (Low Context)
per request
$0.014
Base Fee (Medium Context)
per request
$0.024
Base Fee (High Context)
per request
$0.034

Perplexity Sonar Reasoning Pro

Perplexity | Ctx 128K | Research & Search
Proprietary Endpoint

Reasoning model on the uncensored open-source R1-1776 with web search, outperforming leading search engines and LLMs on the SimpleQA benchmark.

Type
Spec
Rate
Input
per 1M prompt tokens
$4.80
Output
per 1M generated tokens
$19.00
Base Fee (Low Context)
per request
$0.014
Base Fee (Medium Context)
per request
$0.024
Base Fee (High Context)
per request
$0.034

Pixverse v5

PixVerse | Video Generation
Proprietary Endpoint

Cinematic video generation in Text-to-Video, Image-to-Video, and Transition modes with high detail, fluid motion, and lifelike animations.

Type
Spec
Rate
360p/540p 5s
per video
$0.45
360p/540p 8s
per video
$0.90
720p 5s
per video
$0.60
720p 8s
per video
$1.20
1080p 5s
per video
$1.20

Pixverse v5.6

PixVerse | Video Generation
Proprietary Endpoint

Generates videos up to 1080p from text or 1-2 frame image prompts, in multiple aspect ratios and 5-10s durations, with optional synchronized audio.

Type
Spec
Rate
360p/540p 5s no audio
per video
$0.40
360p/540p 5s audio
per video
$0.80
360p/540p 8s no audio
per video
$0.80
360p/540p 8s audio
per video
$1.60
360p/540p 10s no audio
per video
$0.88
360p/540p 10s audio
per video
$1.76
720p 5s no audio
per video
$0.65
720p 5s audio
per video
$1.30
720p 8s no audio
per video
$1.30
720p 8s audio
per video
$2.60
720p 10s no audio
per video
$1.43
720p 10s audio
per video
$2.86
1080p 5s no audio
per video
$0.75
1080p 5s audio
per video
$1.50
1080p 8s no audio
per video
$1.50
1080p 8s audio
per video
$3.00

Qwen Image 2.0

Alibaba Cloud | Singapore | Image Generation
Proprietary Endpoint
Save up to 8%

Unified image generation and editing model with class-leading complex Chinese/English text rendering, realistic textures, and multi-image fusion.

Type
Spec
Rate
Standard
per image
$0.0322 (list $0.035)
Pro
per image
$0.069 (list $0.075)

Qwen3.5 Flash

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 10%

Vision-language model with hybrid linear-attention plus sparse MoE, 1M context, and fast multimodal text/image/video inference.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.090 (list $0.10)
Output
per 1M generated tokens
$0.368 (list $0.40)
Web Search
per call
$0.015
Image Search
per call
$0.012

Qwen3.5 Flash

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 68%

Vision-language model with hybrid linear-attention plus sparse MoE, 1M context, and fast multimodal text/image/video inference.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.029 (list $0.090); 128K-256K: $0.115; 256K-1M: $0.172
Output
per 1M generated tokens
<=128K: $0.287 (list $0.368); 128K-256K: $1.147; 256K-1M: $1.72
Web search
per query when enabled
$0.01

Qwen3.5 Omni Flash

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint

Cost-efficient omni-modal model handling text, image, audio, and video, with up to 3 hours of audio and 1 hour of video across 90+ languages.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.40 per 1M prompt tokens; $3.00 per 1M prompt tokens
Output
per 1M generated tokens
$2.20 per 1M generated tokens; $11.90 per 1M generated tokens
Web Search
per request
$0.015

Qwen3.5 Omni Plus

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint

Flagship omni-modal model for text, image, audio, and video. 3h audio, 1h video, 90+ input and 30+ output languages, 55 voice timbres.

Type
Spec
Rate
Input
per 1M prompt tokens
$1.40 per 1M prompt tokens; $11.00 per 1M prompt tokens
Output
per 1M generated tokens
$8.30 per 1M generated tokens; $44.00 per 1M generated tokens
Web Search
per request
$0.015

Qwen3.5 Plus

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 10%

Multimodal model with hybrid architecture for efficient deep thinking and visual understanding across text, image, and video on a 1M context.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.36 (list $0.40); 256K-1M: $1.08 (list $1.20)
Output
per 1M generated tokens
<=256K: $2.21 (list $2.40); 256K-1M: $6.62 (list $7.20)
Web Search
per call
$0.015
Image Search
per call
$0.012

Qwen3.5 Plus

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 69%

Multimodal model with hybrid architecture for efficient deep thinking and visual understanding across text, image, and video on a 1M context.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.115 (list $0.36); 128K-256K: $0.287 (list $0.36); 256K-1M: $0.573 (list $1.08)
Output
per 1M generated tokens
<=128K: $0.688 (list $2.21); 128K-256K: $1.72 (list $2.21); 256K-1M: $3.44 (list $6.62)
Web search
per query when enabled
$0.01

Qwen3.6 Max Preview

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint

Largest preview variant in the 3.6 series (text-only): improved coding agent execution, stronger front-end skills, and broader long-tail knowledge.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $1.31; 128K-256K: $1.97
Output
per 1M generated tokens
<=128K: $7.88; 128K-256K: $11.82
Web Search
per call
$0.020

Qwen3.6 Plus

Alibaba Cloud | Singapore | Ctx 1M | Text Generation
Proprietary Endpoint

Vision-language model with major upgrades over 3.5: agentic and front-end coding, multimodal recognition, OCR, and object localization.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.50; 256K-1M: $2.00
Output
per 1M generated tokens
<=256K: $3.00; 256K-1M: $6.00
Web Search
per call
$0.026
Image Search
per call
$0.0208

Qwen3.6 Plus

Alibaba Cloud | China | Ctx 1M | Text Generation
Proprietary Endpoint
Save up to 45%

Vision-language model with major upgrades over 3.5: agentic and front-end coding, multimodal recognition, OCR, and object localization.

Type
Spec
Rate
Input
per 1M prompt tokens
<=256K: $0.276 (list $0.50); 256K-1M: $1.101 (list $2.00)
Output
per 1M generated tokens
<=256K: $1.651 (list $3.00); 256K-1M: $6.602
Web search
per query when enabled
$0.01

Qwen3 Max

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 10%

256K-context flagship with major improvements in reasoning, instruction following, and multilingual support, plus higher coding/math accuracy.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K: $1.08 (list $1.20); 32K-128K: $2.16 (list $2.40); 128K-256K: $2.70 (list $3.00)
Output
per 1M generated tokens
<=32K: $5.52 (list $6.00); 32K-128K: $11.04 (list $12.00); 128K-256K: $13.80 (list $15.00)
Web Search
per request
$0.015

Qwen3 Max Preview

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 20%

Preview release with major gains over the 2.5 series in Chinese-English understanding, complex instructions, multilingual ability, and tool use.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K: $1.08 (list $1.20); 32K-128K: $2.16 (list $2.40); 128K-256K: $2.70 (list $3.00)
Output
per 1M generated tokens
<=32K: $4.80 (list $6.00); 32K-128K: $9.60 (list $12.00); 128K-256K: $12.00 (list $15.00)

Qwen3 Max Thinking

Alibaba Cloud | Singapore | Ctx 256K | Text Generation
Proprietary Endpoint
Save up to 10%

Reasoning model with adaptive tool use (search, memory, code interpreter) and test-time scaling for higher accuracy on complex tasks.

Type
Spec
Rate
Input
per 1M prompt tokens
<=32K: $1.08 (list $1.20); 32K-128K: $2.16 (list $2.40); 128K-256K: $2.70 (list $3.00)
Output
per 1M generated tokens
<=32K: $5.52 (list $6.00); 32K-128K: $11.04 (list $12.00); 128K-256K: $13.80 (list $15.00)
Web Search
per request
$0.015

Qwen3 Rerank

Alibaba Cloud | Singapore | Ctx 4000 | Rerankers
Proprietary Endpoint

Semantic document reranker. Sorts up to 500 candidates per query by relevance, supports 100+ languages, and accepts a custom sorting instruction.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.10

Seed 2.0 Code

ByteDance | Malaysia | Ctx 256K | Text Generation
Proprietary Endpoint

Coding-tuned 256K-context model with strong front-end results and multilingual programming support for AI coding tools and agents.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.40; 128K-256K: $0.80
Output
per 1M generated tokens
<=128K: $2.40; 128K-256K: $4.80

Seed 2.0 Lite

ByteDance | Malaysia | Ctx 256K | Text Generation
Proprietary Endpoint

Balanced general-purpose model for high-frequency enterprise workloads: information processing, content, search, and data analysis.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.31; 128K-256K: $0.62
Output
per 1M generated tokens
<=128K: $2.50; 128K-256K: $5.00

Seed 2.0 Mini

ByteDance | Malaysia | Ctx 256K | Text Generation
Proprietary Endpoint

Latency-focused multimodal model with 256K context, four reasoning effort modes, and image/video understanding for high-concurrency use.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.12; 128K-256K: $0.24
Output
per 1M generated tokens
<=128K: $0.50; 128K-256K: $1.00

Seed 2.0 Pro

ByteDance | Malaysia | Ctx 256K | Text Generation
Proprietary Endpoint

Flagship general model with 256K context for complex reasoning, multimodal understanding, structured generation, and tool-augmented execution.

Type
Spec
Rate
Input
per 1M prompt tokens
<=128K: $0.63; 128K-256K: $1.26
Output
per 1M generated tokens
<=128K: $3.79; 128K-256K: $7.58

Seedance 2.0 Fast

ByteDance | Malaysia | Video Generation
Proprietary Endpoint

Speed-optimized 2.0 video variant for cinematic clips with native audio sync, camera control, and stable motion at lower cost per render.

Type
Spec
Rate
T2V/I2V 480P
per second
$0.122
T2V/I2V 720P
per second
$0.260
Video Input 480P
per second
$0.284
Video Input 720P
per second
$0.610

Seedance 2.0 Pro

ByteDance | Malaysia | Video Generation
Proprietary Endpoint

Multimodal video model for cinematic output from text, image, audio, or video inputs, with stable motion and consistent characters.

Type
Spec
Rate
T2V/I2V 480P
per second
$0.139
T2V/I2V 720P
per second
$0.300
T2V/I2V 1080P
per second
$0.749
Video Input 480P
per second
$0.342
Video Input 720P
per second
$0.736
Video Input 1080P
per second
$1.841
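Per-second video pricing can be turned into a clip estimate with a single multiplication. The sketch below uses the Seedance 2.0 Pro T2V/I2V rates from the table above and assumes that the billed duration equals the generated clip length, with no rounding of partial seconds. The page does not confirm that, and `estimate_clip_cost` is a hypothetical helper name.

```python
# Seedance 2.0 Pro text/image-to-video rates from the table above, USD per second.
T2V_I2V_RATES = {"480P": 0.139, "720P": 0.300, "1080P": 0.749}


def estimate_clip_cost(resolution: str, seconds: float) -> float:
    """Estimate the cost of one generated clip in USD (hypothetical helper).

    ASSUMPTION: billed seconds equal the clip duration, without rounding.
    """
    if resolution not in T2V_I2V_RATES:
        raise ValueError(f"unknown resolution: {resolution}")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return round(T2V_I2V_RATES[resolution] * seconds, 4)


if __name__ == "__main__":
    for resolution in T2V_I2V_RATES:
        print(resolution, estimate_clip_cost(resolution, 10))
```

The same shape would work for the Video Input rows by swapping in their rates, if those rows are billed per second in the same way.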

Seedream 5.0 Lite

ByteDance · Malaysia · Image Generation
Proprietary Endpoint

Unified multimodal image model that reasons through prompts before rendering, producing consistent, high-resolution edits and brand visuals.

Type
Spec
Rate
Standard
per image
$0.0350

Stable Audio 2.0

Stability AI · Audio Generation
Proprietary Endpoint

Generates audio up to 3 minutes long from text prompts, supporting text-to-audio and audio-to-audio with adjustable duration, steps, and CFG scale.

Type
Spec
Rate
Base Cost
per generation
$0.58
Per Step Cost
per step
$0.00

Stable Audio 2.5

Stability AI · Audio Generation
Proprietary Endpoint

Generates audio up to 3 minutes long from text, with text-to-audio, audio-to-audio, and audio inpainting for music production, sound design, and remixing.

Type
Spec
Rate
Generation
per generation
$0.68

Tavily Research

Tavily · Research & Search
Proprietary Endpoint

Multi-search research assistant that explores a topic, analyzes sources, and produces a detailed research report with citations.

Type
Spec
Rate
Mini
average per task
~$1.19
Pro
average per task
~$2.75

Text Embedding v4

Alibaba Cloud · Singapore · Ctx 8192 · Embeddings
Proprietary Endpoint

Multilingual text embedding with selectable output dimensions (64–2048). Up to 8,192 tokens per input.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.07

Tongyi Embedding Vision Flash

Alibaba Cloud · Singapore · Ctx 1024 · Embeddings
Proprietary Endpoint

Speed-optimised multimodal embedding with the same shape as Vision Plus and image/video tokens at one third of the price.

Type
Spec
Rate
Text input
per 1M tokens
$0.09
Image / video input
per 1M tokens
$0.03

Tongyi Embedding Vision Plus

Alibaba Cloud · Singapore · Ctx 1024 · Embeddings
Proprietary Endpoint

Multimodal embedding producing independent vectors for text, image, and video inputs.

Type
Spec
Rate
Text input
per 1M tokens
$0.09
Image / video input
per 1M tokens
$0.09

Wan 2.6

Alibaba Cloud · Singapore · Video Generation
Proprietary Endpoint
Save up to 10%

Multimodal video generation model for cinematic, multi-shot stories with native audio-visual sync (lip-sync, dialogue, music, SFX).

Type
Spec
Rate
Standard 720P
per second
$0.09 (list price $0.10)
Standard 1080P
per second
$0.138 (list price $0.15)
Flash 720P (audio)
per second
$0.045 (list price $0.050)
Flash 720P (no audio)
per second
$0.0225 (list price $0.0250)
Flash 1080P (audio)
per second
$0.069 (list price $0.0750)
Flash 1080P (no audio)
per second
$0.0345 (list price $0.03750)
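The "Save up to 10%" badge can be checked against the Wan 2.6 rows with simple arithmetic. The sketch below assumes each row pairs a list price with a lower discounted rate. That reading is inferred from the badge rather than stated on the page, and `discount_percent` is a hypothetical helper.

```python
# (list price, discounted price) in USD per second, from the Wan 2.6 table above.
# ASSUMPTION: the higher figure in each row is the list price.
WAN_26_RATES = {
    "Standard 720P": (0.10, 0.09),
    "Standard 1080P": (0.15, 0.138),
    "Flash 720P (audio)": (0.050, 0.045),
    "Flash 720P (no audio)": (0.0250, 0.0225),
    "Flash 1080P (audio)": (0.0750, 0.069),
    "Flash 1080P (no audio)": (0.03750, 0.0345),
}


def discount_percent(list_price: float, discounted: float) -> float:
    """Return the saving relative to the list price, in percent."""
    return round((list_price - discounted) / list_price * 100, 1)


if __name__ == "__main__":
    for name, (list_price, discounted) in WAN_26_RATES.items():
        print(f"{name}: {discount_percent(list_price, discounted)}% off")
```

Under that reading, the 720P rows come out at 10% off and the 1080P rows at 8% off, which is consistent with the badge.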

Wan 2.7

Alibaba Cloud · Singapore · Video Generation
Proprietary Endpoint

Multimodal video model supporting T2V, I2V, video editing, and reference-to-video, with high-fidelity output from text, image, or video inputs.

Type
Spec
Rate
All Modes 720P
per second
$0.10
All Modes 1080P
per second
$0.150

Wan2.7 Image

Alibaba Cloud · Singapore · Image Generation
Proprietary Endpoint

Image generation and editing companion model: text-to-image, bounding-box edits, and cohesive image sets, with up to 4K output on Pro.

Type
Spec
Rate
Standard
per image
$0.030
Pro
per image
$0.075

Manus

Manus · Tools & Agents
Proprietary Endpoint

Autonomous AI agent that turns a high-level prompt into subtasks, calls tools and APIs, and delivers end-to-end results without manual orchestration.

Type
Spec
Rate
Adaptive - Manus 1.6 Lite
per task
$1.44 - $2.63
Adaptive - Manus 1.6
per task
$2.89 - $5.25
Adaptive - Manus 1.6 Max
per task
$5.25 - $9.19

Grok Imagine Video 1.5

xAI · Video Generation
Proprietary Endpoint

Image-to-video model that animates a source image with prompt-guided motion, up to 15 seconds at 480p or 720p across seven aspect ratios.

Type
Spec
Rate
Image input
per image
$0.05
480p
per second
$0.096
720p
per second
$0.168

MiMo V2.5 Pro

Xiaomi · Ctx 1M · Text Generation
Proprietary Endpoint

Top-tier model for agentic workflows, complex software engineering, and long-horizon tasks, sustaining work across 1000+ tool calls on 1M context.

Type
Spec
Rate
Input
per 1M prompt tokens
$2.175
Output
per 1M generated tokens
$4.35
Implicit cache read
per 1M cached input tokens
$0.018
Web Search
per call
$0.015
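To show how the four MiMo V2.5 Pro line items above might combine on one request, the sketch below assumes that cached prompt tokens are billed at the implicit cache read rate instead of the input rate, and that each web search call is billed once. Neither rule is spelled out on this page, and `estimate_mimo_pro_cost` is a hypothetical helper.

```python
# MiMo V2.5 Pro rates from the table above.
INPUT_RATE = 2.175        # USD per 1M prompt tokens
OUTPUT_RATE = 4.35        # USD per 1M generated tokens
CACHE_READ_RATE = 0.018   # USD per 1M cached input tokens
WEB_SEARCH_RATE = 0.015   # USD per call


def estimate_mimo_pro_cost(
    prompt_tokens: int,
    cached_tokens: int,
    output_tokens: int,
    web_searches: int = 0,
) -> float:
    """Estimate one request's cost in USD (hypothetical helper).

    ASSUMPTION: cached tokens replace, rather than add to, regular input billing.
    """
    if min(prompt_tokens, cached_tokens, output_tokens, web_searches) < 0:
        raise ValueError("all quantities must be non-negative")
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    token_cost = (
        uncached * INPUT_RATE
        + cached_tokens * CACHE_READ_RATE
        + output_tokens * OUTPUT_RATE
    ) / 1_000_000
    return round(token_cost + web_searches * WEB_SEARCH_RATE, 6)


if __name__ == "__main__":
    # Same request with and without a cache hit on 400K of the 500K prompt tokens.
    print(estimate_mimo_pro_cost(500_000, 400_000, 10_000, web_searches=2))
    print(estimate_mimo_pro_cost(500_000, 0, 10_000, web_searches=2))
```

If the assumption holds, a large cache hit dominates the estimate, since the cache read rate is a small fraction of the input rate.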

MiMo V2.5

Xiaomi · Ctx 1M · Text Generation
Proprietary Endpoint

Multimodal model with native visual and audio understanding on a 1M context, designed to reason and act across modalities in agentic workflows.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.70
Output
per 1M generated tokens
$1.40
Implicit cache read
per 1M cached input tokens
$0.014
Web Search
per call
$0.015

MiMo V2 Flash

Xiaomi · Ctx 256K · Text Generation
Proprietary Endpoint

Lightweight, high-speed reasoning model with hybrid attention and multi-token prediction for low-cost inference and strong benchmark scores.

Type
Spec
Rate
Input
per 1M prompt tokens
$0.50
Output
per 1M generated tokens
$1.50
Implicit cache read
per 1M cached input tokens
$0.05
Web Search
per call
$0.015

95 of 119 models
