Explore our production ready AI Models

Browse the full catalog of models across text, image, audio, video, 3D, and more.

Model catalog

AI models on one OpenAI-compatible API.

Browse text, image, video, audio, 3D, search, and agent endpoints with pay-as-you-go pricing. The interactive catalog loads current availability from EmpirioLabs, and these model docs are crawlable without client JavaScript.

Open model docs

New & Featured

Save up to 31%

Qwen3.7 Plus

Alibaba Cloud
Proprietary EndpointNew

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Singapore1M context
China1M context
Save up to 25%

MiniMax M3

MiniMax
Proprietary EndpointNew

MiniMax M3 is a multimodal reasoning model for coding, agents, and long-context analysis with text, image, and video input.

Singapore524K context

Grok Imagine Video 1.5

xAI
Proprietary EndpointNew

Image-to-video model that animates a source image with prompt-guided motion, up to 15 seconds at 480p or 720p across seven aspect ratios.

Save up to 34%

Qwen3.7 Max

Alibaba Cloud
Proprietary EndpointNew

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Singapore1M context
China1M context
Save up to 7%

Kimi K2.6

Moonshot AI
Proprietary EndpointNew

Kimi K2.6 is a Moonshot multimodal reasoning model with 256K context, strong coding, and text, image, and video inputs.

China256K context
Save up to 41%

GLM-5.1

Z.ai
Proprietary EndpointNew

Long-context Zhipu AI reasoning model with 202K context, 128K output, tool calling, structured output, and cache support.

China202K context

Text Generation52

Save up to 31%

Qwen3.7 Plus

Alibaba Cloud
Proprietary EndpointNew

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

Singapore1M context
China1M context
Save up to 25%

MiniMax M3

MiniMax
Proprietary EndpointNew

MiniMax M3 is a multimodal reasoning model for coding, agents, and long-context analysis with text, image, and video input.

Singapore524K context
Save up to 34%

Qwen3.7 Max

Alibaba Cloud
Proprietary EndpointNew

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, tools, and 1M-token context.

Singapore1M context
China1M context
Save up to 50%

MiniMax M2.7 Highspeed

MiniMax
Proprietary Endpoint

High-speed M2.7 variant tuned for fast inference with strong general-purpose performance with strong agentic capabilities.

Singapore200K context
Save up to 41%

GLM-5.1

Z.ai
Proprietary EndpointNew

Long-context Zhipu AI reasoning model with 202K context, 128K output, tool calling, structured output, and cache support.

China202K context
Save up to 7%

Kimi K2.6

Moonshot AI
Proprietary EndpointNew

Kimi K2.6 is a Moonshot multimodal reasoning model with 256K context, strong coding, and text, image, and video inputs.

China256K context

Image Generation7

Save up to 39%

FLUX.2 Klein 4B

Black Forest Labs
Native InferenceNew

Apache-licensed 4B FLUX.2 Klein image generation and editing model with text-to-image, reference-image editing, and creative workflow support.

Amazon Nova Canvas

Amazon
Proprietary Endpoint

Image generation and editing model creating and modifying images from text or image inputs, with inpainting, virtual try-on, and style controls.

Hunyuan Image 3

Tencent
Proprietary Endpoint

Open-source text-to-image model on a multimodal Mixture-of-Experts architecture with photorealistic detail and strong multilingual text rendering.

Janus-Pro DeepSeek

DeepSeek
Proprietary Endpoint

Autoregressive framework on the Janus Pro 7B model that unifies multimodal understanding and image generation in one architecture.

Save up to 8%

Qwen Image 2.0

Alibaba Cloud
Proprietary Endpoint

Unified image generation and editing model with class-leading complex Chinese/English text rendering, realistic textures, and multi-image fusion.

Singapore

Seedream 5.0 Lite

ByteDance
Proprietary EndpointNew

Unified multimodal image model that reasons through prompts before rendering, producing high-resolution and consistent edits and brand visuals.

Malaysia

Video Generation14

Amazon Nova Reel 1.1

Amazon
Proprietary Endpoint

Video generation model producing up to 2-minute multi-shot videos from text and optional image prompts with improved quality and consistency.

HappyHorse 1.0

Alibaba Cloud
Proprietary EndpointNew

Video model offering Text-to-Video, Image-to-Video, Reference-to-Video, and Video Edit modes with high-fidelity, motion-smooth output.

Singapore
Save up to 19%

Hunyuan Video 1.5

Tencent
Native Inference

8.3B-parameter video model with native 720p output (upscalable to 1080p), strong motion coherence, and bilingual prompt understanding up to 10s.

Kling O3

Kling AI
Proprietary Endpoint

Video model in Standard or Pro modes with Text-to-Video, Image-to-Video, Reference-to-Video, editing, native sound, and multi-scene transitions.

Kling v3 Motion Control

Kling AI
Proprietary Endpoint

Kling 3.0 model that transfers motion from a reference video onto a character from a reference image, with Standard 720p and Pro 1080p tiers.

MOSS Video and Audio

OpenMOSS
Native Inference

Open-source 32B MoE foundation model that generates synchronized video and audio in one inference step with precise dual-tower lip-sync.

Audio Generation10

Save up to 17%

ACE-Step 1.5 XL

ACE-Step
Native InferenceNew

Open-source music generation model for text-to-song and lyric-guided audio, with fast 8-step XL Turbo inference for controllable song iteration.

Save up to 30%

TTS 1.5 Mini

Inworld
Proprietary EndpointNew

Sub-130ms TTFB voice synthesis with 271+ voices across 15 languages, expressive prosody, and real-time SSE streaming for low-latency voice agents.

Save up to 15%

TTS 1.5 Max

Inworld
Proprietary EndpointNew

Broadcast-quality voice synthesis with rich expressive prosody, 271+ voices across 15 languages, and real-time SSE streaming with per-word timestamps.

Gemini 2.5 Flash TTS

Google
Proprietary Endpoint

Low-latency text-to-speech with single- and multi-speaker voices and controllable style, accent, and expressive tone for production apps.

Gemini 2.5 Pro TTS

Google
Proprietary Endpoint

High-quality TTS preview for podcasts, audiobooks, and customer support, with expressive multi-speaker voices across 23+ languages.

Gemini 3.1 Flash TTS

Google
Proprietary EndpointNew

Highly controllable TTS with new Audio Tags for precise style, tone, pace, and delivery across narration, assistants, and voice apps.

Transcription3

Deepgram Nova 3

Deepgram
Proprietary Endpoint

Speech-to-text transcription using the Nova-3 model with multi-language support and advanced customizable settings for production workloads.

OpenAI Whisper 1

OpenAI
Proprietary Endpoint

Whisper-1 speech-to-text transcription trained on multilingual supervised audio, with a 25 MB upload limit per file.

Save up to 17%

Whisper Large v3 Turbo

OpenAI
Native InferenceNew

Controlled Whisper Large v3 Turbo transcription with multilingual ASR, translation, VAD, timestamps, subtitles, hotwords, and decoder controls.

Research & Search14

Exa Answer

Exa
Proprietary Endpoint

Quick LLM-style answer to a natural-language question, grounded in fresh Exa web search results with inline citations and source links.

Exa Research

Exa
Proprietary Endpoint

Asynchronous research task that explores the web, gathers sources, synthesizes findings, and returns cited answers for in-depth queries.

Linkup Standard

Linkup
Proprietary Endpoint

AI-powered web search with detailed overviews and answers, faster than Deep Search. Ranks #1 on OpenAI SimpleQA benchmark.

100K context

Perplexity Advanced Deep Research

Perplexity
Proprietary Endpoint

Institutional-grade research powered by Claude Opus 4.6 reasoning, with maximum depth, enhanced tool access, and extensive source coverage.

3D Generation1

Save up to 90%

TRELLIS.2 4B

Microsoft
Native InferenceNew

TRELLIS.2 image-to-3D model that turns a reference image into a textured GLB asset with resolution, seed, mesh, texture, and export controls.

Embeddings3

Text Embedding v4

Alibaba Cloud
Proprietary EndpointNew

Multilingual text embedding with selectable output dimensions (64–2048). Up to 8,192 tokens per input.

Singapore8192 context

Tongyi Embedding Vision Flash

Alibaba Cloud
Proprietary EndpointNew

Speed-optimised multimodal embedding — same shape as Vision-Plus, 3× cheaper image/video tokens.

Singapore1024 context

Tongyi Embedding Vision Plus

Alibaba Cloud
Proprietary EndpointNew

Multimodal embedding producing independent vectors for text, image, and video inputs.

Singapore1024 context

Rerankers1

Qwen3 Rerank

Alibaba Cloud
Proprietary EndpointNew

Semantic document reranker. Sorts up to 500 candidates per query by relevance, supports 100+ languages, and accepts a custom sorting instruction.

Singapore4000 context

Tools & Agents2

GPTZero

GPTZero
Proprietary Endpoint

Deep-learning detector that flags portions of text likely generated by AI versus human, classifying content as entirely human, AI, or mixed.

Manus

Manus
Proprietary Endpoint

Autonomous AI agent that turns a high-level prompt into subtasks, calls tools and APIs, and delivers end-to-end results without manual orchestration.

No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.
No items found.

Ready to use better endpoints?

Explore our models, or contact us about business inquiries, custom deployments, or anything else.