Qwen3 Rerank API

Semantic document reranker. Sorts up to 500 candidates per query by relevance, supports 100+ languages, and accepts a custom sorting instruction.

Alibaba Cloud · Rerankers · 4000 context · Singapore · Proprietary Endpoint · New

About Qwen3 Rerank

semantic ranking, multilingual, RAG, custom instructions

Qwen3 Rerank specs

Model ID: qwen3-rerank
Provider: Alibaba Cloud
Category: Rerankers
Context window: 4000 tokens
Input: text
Output: ranking
Region: Singapore
Endpoints: POST /v1/reranks

Qwen3 Rerank API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

| Type | Spec | Rate |
| --- | --- | --- |
| Input | per 1M prompt tokens | $0.10 |

How to call the Qwen3 Rerank API

Qwen3 Rerank is served through POST /v1/reranks with the model id qwen3-rerank and your EmpirioLabs API key. Get an API key from the EmpirioLabs dashboard.

cURL
curl https://api.empiriolabs.ai/v1/reranks \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-rerank",
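    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Berlin is the capital of Germany."]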
  }'
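
The same call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names `build_rerank_request` and `rerank` are hypothetical, and it assumes `query`, `documents`, and `top_n` are sent as top-level JSON fields next to `model`, as the parameter list on this page suggests. The response shape is not documented here, so the parsed JSON is returned unchanged.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/reranks"


def build_rerank_request(query, documents, api_key, top_n=None):
    """Build (but do not send) a POST request for the rerank endpoint."""
    # Assumption: parameters are top-level JSON fields alongside "model".
    payload = {"model": "qwen3-rerank", "query": query, "documents": documents}
    if top_n is not None:
        payload["top_n"] = top_n
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(query, documents, api_key, top_n=None):
    """Send the request and return the parsed JSON body as-is."""
    request = build_rerank_request(query, documents, api_key, top_n)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

To try it, read the key from the `EMPIRIOLABS_API_KEY` environment variable and call `rerank("What is the capital of France?", ["Paris is the capital of France.", "Berlin is the capital of Germany."], api_key, top_n=2)`. Splitting request construction from sending keeps the payload easy to inspect without a network call.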

Qwen3 Rerank API parameters

Request parameters supported by the Qwen3 Rerank API on EmpirioLabs. Defaults apply when a field is omitted.

| Parameter | Type | Default | Range / values | Description |
| --- | --- | --- | --- | --- |
| query | string | - | - | Query text to rank documents against. Max 4,000 tokens. |
| documents | array | - | - | Candidate documents to sort (strings). Max 500 items, each up to 4,000 tokens. |
| top_n | number | 10 | 1 to 500 | Number of top-ranked documents to return. Defaults to all. |
| instruct | string | Given a web search query, retrieve relevant passages that answer the query. | - | Custom English instruction. Use "Retrieve semantically similar text." for similarity sorting. |
| return_documents | boolean | false | - | When true, return the original document text alongside each result. |
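
As an illustration of how these parameters fit together, the sketch below assembles a request body and checks the documented ranges on the client side before anything is sent. The helper name `build_rerank_payload` is hypothetical, and the sketch assumes all parameters are top-level JSON fields. Optional fields are left out of the payload when not supplied, so the server-side defaults apply.

```python
MAX_DOCUMENTS = 500  # documented maximum number of candidate documents
MAX_TOP_N = 500      # documented upper bound for top_n


def build_rerank_payload(query, documents, top_n=None, instruct=None,
                         return_documents=None):
    """Assemble a rerank request body, validating the documented ranges."""
    if not isinstance(query, str) or not query:
        raise ValueError("query must be a non-empty string")
    documents = list(documents)
    if not 1 <= len(documents) <= MAX_DOCUMENTS:
        raise ValueError(f"documents must contain 1 to {MAX_DOCUMENTS} items")
    if not all(isinstance(doc, str) for doc in documents):
        raise ValueError("every document must be a string")

    payload = {"model": "qwen3-rerank", "query": query, "documents": documents}
    if top_n is not None:
        if not isinstance(top_n, int) or not 1 <= top_n <= MAX_TOP_N:
            raise ValueError(f"top_n must be an integer from 1 to {MAX_TOP_N}")
        payload["top_n"] = top_n
    if instruct is not None:
        payload["instruct"] = instruct
    if return_documents is not None:
        payload["return_documents"] = bool(return_documents)
    return payload
```

Passing `top_n` explicitly is the safer choice when the number of returned results matters, since it does not rely on the default.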

Good to know

Per-request limits

  • Up to 500 candidate documents per request
  • Max 4,000 tokens per query/document
  • Max 120,000 tokens per request (formula: query_tokens × n_docs + sum_of_doc_tokens)
  • Tokens billed are query+documents combined; only successful reranks are charged
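
The per-request formula can be checked before sending a request. The sketch below is an illustration only: the function names are hypothetical, and it assumes you already have token counts for the query and each document, since this page does not specify a tokenizer. It applies the formula above and greedily splits a candidate list into batches that each stay within the limits.

```python
MAX_DOCS = 500
MAX_TOKENS_PER_TEXT = 4_000
MAX_TOKENS_PER_REQUEST = 120_000


def request_tokens(query_tokens, doc_token_counts):
    """query_tokens x n_docs + sum_of_doc_tokens, as in the limit formula."""
    return query_tokens * len(doc_token_counts) + sum(doc_token_counts)


def fits_limits(query_tokens, doc_token_counts):
    """True when a single request respects every documented limit."""
    return (
        len(doc_token_counts) <= MAX_DOCS
        and query_tokens <= MAX_TOKENS_PER_TEXT
        and all(n <= MAX_TOKENS_PER_TEXT for n in doc_token_counts)
        and request_tokens(query_tokens, doc_token_counts) <= MAX_TOKENS_PER_REQUEST
    )


def split_into_batches(query_tokens, doc_token_counts):
    """Greedily group document indices into batches that fit the limits."""
    if query_tokens > MAX_TOKENS_PER_TEXT:
        raise ValueError("query exceeds the per-text token limit")
    batches, current, current_total = [], [], 0
    for index, doc_tokens in enumerate(doc_token_counts):
        if doc_tokens > MAX_TOKENS_PER_TEXT:
            raise ValueError(f"document {index} exceeds the per-text token limit")
        cost = query_tokens + doc_tokens  # each document adds one copy of the query
        if current and (len(current) == MAX_DOCS
                        or current_total + cost > MAX_TOKENS_PER_REQUEST):
            batches.append(current)
            current, current_total = [], 0
        current.append(index)
        current_total += cost
    if current:
        batches.append(current)
    return batches


# Worked example: a 20-token query against 500 documents of 300 tokens each.
print(request_tokens(20, [300] * 500))                        # 160000, over the limit
print([len(b) for b in split_into_batches(20, [300] * 500)])  # [375, 125]
```

In the worked example, 20 × 500 + 150,000 = 160,000 tokens exceeds the 120,000-token limit even though only 500 documents are sent, so the list is split in two. Whether scores from separate requests can be merged into one ranking is not stated on this page.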

Languages

  • 100+ major languages including Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, Russian

Sorting modes (instruct parameter)

  • *Default — Q&A retrieval*: `Given a web search query, retrieve relevant passages that answer the query.`
  • *Semantic similarity*: `Retrieve semantically similar text.`
  • Or any custom English instruction (see [model task prompts](https://github.com/QwenLM/Qwen3-Embedding/blob/main/evaluation/task_prompts.json))
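
Switching sorting modes only changes the `instruct` field. As a small sketch (the constant names and the `with_instruct` helper are hypothetical, and the payload layout assumes top-level JSON fields), the two documented instructions can be kept as constants and applied to an otherwise identical request body:

```python
QA_RETRIEVAL = (
    "Given a web search query, retrieve relevant passages that answer the query."
)
SEMANTIC_SIMILARITY = "Retrieve semantically similar text."


def with_instruct(payload, instruct):
    """Return a copy of a request payload with the instruct field set."""
    updated = dict(payload)
    updated["instruct"] = instruct
    return updated


base = {
    "model": "qwen3-rerank",
    "query": "lightweight hiking boots",
    "documents": ["Trail shoes under 400 g", "Waterproof winter parka"],
}
similarity_request = with_instruct(base, SEMANTIC_SIMILARITY)
```

Because the default instruction already targets Q&A retrieval, `base` can be sent unchanged for that mode.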

Qwen3 Rerank API: common questions

How much does the Qwen3 Rerank API cost?

On EmpirioLabs, Qwen3 Rerank is billed pay-as-you-go at $0.10 per 1M input (prompt) tokens. The live rate card on this page always matches what the API charges.
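
For a quick estimate, the rate converts to a cost per request as in the sketch below. The function name is hypothetical, and the billed token count is an input you supply. How billed tokens are counted for a given request is not spelled out here beyond "query+documents combined".

```python
RATE_USD_PER_MILLION_TOKENS = 0.10  # input rate listed on this page


def estimate_cost_usd(billed_tokens):
    """Convert a billed token count into an approximate USD cost."""
    return billed_tokens / 1_000_000 * RATE_USD_PER_MILLION_TOKENS


# A request billed at 120,000 tokens costs roughly $0.012.
print(round(estimate_cost_usd(120_000), 6))
```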

What is the context window of Qwen3 Rerank?

Qwen3 Rerank supports a 4,000-token context window: the query and each document can be up to 4,000 tokens long.

Which endpoint does Qwen3 Rerank use?

Qwen3 Rerank is served through POST /v1/reranks on api.empiriolabs.ai with standard bearer-token authentication.

Can I try Qwen3 Rerank in the browser before integrating?

Yes. The EmpirioLabs playground runs Qwen3 Rerank in the browser with the same parameters the API exposes, so you can test queries, candidate documents, and instructions before writing code.

How do I get a Qwen3 Rerank API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing uses pay-as-you-go credits, so you only pay for the requests you make.
