Gemma 4 26B-A4B API: Pricing, Playground & Docs

About Gemma 4 26B-A4B

Gemma 4 26B-A4B is an open multimodal model from Google with a 256K context window, text, image, and video input, tool use, and structured output.

It supports text, image, and video input, streaming, function tools, structured JSON output, seed control, and thinking mode, which is on by default. Use reasoning_effort or thinking_budget for bounded thinking, or set enable_thinking=false for direct responses. Automatic cache reads are billed at the cached-input rate when the model service reports them. Explicit cache controls are not supported.
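The thinking controls described above can be sketched as a small request builder. This is a minimal illustration, not the official client: build_thinking_request is a hypothetical helper name, and it assumes that reasoning_effort, thinking_budget, and enable_thinking are sent as top-level fields of the JSON request body, as the parameter list on this page suggests.

```python
from typing import Any, Optional


def build_thinking_request(
    prompt: str,
    *,
    enable_thinking: bool = True,
    reasoning_effort: Optional[str] = None,
    thinking_budget: Optional[int] = None,
) -> dict[str, Any]:
    """Build a chat-completions request body with the thinking controls set.

    Hypothetical helper: field placement at the top level of the body is an
    assumption based on the parameter names listed on this page.
    """
    body: dict[str, Any] = {
        "model": "gemma-4-26b-a4b",
        "messages": [{"role": "user", "content": prompt}],
    }
    if not enable_thinking:
        # Direct response: switch the reasoning channel off and skip budgets.
        body["enable_thinking"] = False
        return body
    if reasoning_effort is not None:
        body["reasoning_effort"] = reasoning_effort
    if thinking_budget is not None:
        body["thinking_budget"] = thinking_budget
    return body
```

The resulting dictionary can be posted as JSON to /v1/chat/completions. With the OpenAI Python SDK, fields that are not part of the SDK's own method signature can usually be passed through its extra_body argument.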

Also known as Google Gemma 4 26B-A4B, Gemma-4-26B-A4B

Tags: reasoning, vision, video, function calling, structured output, cache, multimodal, json mode, logprobs

Gemma 4 26B-A4B specs

Model ID: gemma-4-26b-a4b
Provider: Google
Category: Text generation
Released: Mar 31, 2026
Context window: 256K tokens
Max output: 32,768 tokens
Input: Text, Image, Video
Output: Text
Endpoints: POST /v1/chat/completions, POST /v1/responses, POST /v1/messages, POST /v1/completions

Gemma 4 26B-A4B API pricing

Save up to 83%

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type	Specification	Rate
Input	per 1M prompt tokens	$0.05 (was $0.15)
Output	per 1M generated tokens	$0.29 (was $0.50)
Implicit cache	per 1M cached input tokens	$0.025 (was $0.15)
Web Search (Linkup)	per call when invoked	$0.013
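To make the rate card concrete, the following sketch estimates the cost of a single request from the discounted rates listed above. estimate_cost is a hypothetical helper, and it rests on one labeled assumption: cached tokens are a subset of the prompt tokens and are billed at the implicit cache rate instead of the input rate, not in addition to it.

```python
# Discounted rates in USD, copied from the rate card on this page.
INPUT_PER_1M = 0.05
OUTPUT_PER_1M = 0.29
CACHED_INPUT_PER_1M = 0.025
WEB_SEARCH_PER_CALL = 0.013


def estimate_cost(
    prompt_tokens: int,
    output_tokens: int,
    cached_tokens: int = 0,
    web_search_calls: int = 0,
) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of prompt_tokens served from cache,
    and those tokens are billed at the cache rate instead of the input rate.
    """
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    token_cost = (
        uncached_tokens * INPUT_PER_1M
        + cached_tokens * CACHED_INPUT_PER_1M
        + output_tokens * OUTPUT_PER_1M
    ) / 1_000_000
    return token_cost + web_search_calls * WEB_SEARCH_PER_CALL
```

For example, a request with 10,000 prompt tokens (4,000 of them cached) and 1,000 output tokens works out to (6,000 × 0.05 + 4,000 × 0.025 + 1,000 × 0.29) / 1,000,000, or about $0.00069.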

Compare on the full pricing page

How to call the Gemma 4 26B-A4B API

Gemma 4 26B-A4B serves the OpenAI-compatible Chat Completions API. Point any OpenAI SDK at https://api.empiriolabs.ai/v1 with your EmpirioLabs API key and use the model id gemma-4-26b-a4b. Get an API key from the EmpirioLabs dashboard.

cURL

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-26b-a4b",
    "messages": [
      {"role": "user", "content": "Write a haiku about the ocean."}
    ]
  }'

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.empiriolabs.ai/v1",
    api_key="YOUR_EMPIRIOLABS_API_KEY",
)

response = client.chat.completions.create(
    model="gemma-4-26b-a4b",
    messages=[{"role": "user", "content": "Write a haiku about the ocean."}],
)
print(response.choices[0].message.content)
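
Because the model accepts image input, a request can also carry an image alongside the text prompt. The sketch below builds such a message offline. It assumes the endpoint accepts the OpenAI-style content-part format (a "text" part plus an "image_url" part holding a base64 data URL), which this page does not confirm; image_message is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message carrying text plus one inline image.

    Assumption: the API accepts OpenAI-style content parts with a data URL.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }
```

The returned dictionary would take the place of the single user message in the messages list of the request shown earlier.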

Full Gemma 4 26B-A4B API reference

Gemma 4 26B-A4B API parameters

Request parameters supported by the Gemma 4 26B-A4B API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter	Type	Default	Range / values	Description
temperature	number	1	0 to 2	Sampling temperature. Lower values are more deterministic.
top_p	number	0.95	0 to 1	Nucleus sampling probability mass.
max_tokens	integer	4096	1 to 32768	Maximum output tokens.
stop	string	-	-	One or more stop strings.
reasoning_effort	enum	medium	none, low, medium, high, max	Reasoning effort. none disables thinking; low, medium, high, and max set bounded thinking budgets.
enable_thinking	boolean	true	-	Enable the model reasoning channel before final output.
thinking_budget	integer	4096	128 to 32768	Maximum thinking tokens before the final answer. If max_tokens is lower, the service reserves room for the answer.
top_k	integer	20	1 to 200	Limit sampling to the top K candidate tokens when supported.
min_p	number	0	0 to 1	Minimum probability threshold for token sampling.
presence_penalty	number	0	-2 to 2	Penalty for tokens that already appeared in the generated text.
frequency_penalty	number	0	-2 to 2	Penalty based on how often a token has already appeared.
repetition_penalty	number	1	0.1 to 2	Penalty used by SGLang to reduce repeated text.
seed	integer	-	0 to 2147483647	Optional random seed for reproducible sampling.
logprobs	boolean	false	-	Return token log probabilities when supported.
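
The ranges in the table above lend themselves to a client-side check before a request is sent. The following is a minimal sketch, assuming the listed bounds are inclusive (the table does not say); validate_params is a hypothetical helper and is not part of any SDK.

```python
# (min, max) ranges copied from the parameter table above.
# Assumption: both bounds are inclusive.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 32768),
    "thinking_budget": (128, 32768),
    "top_k": (1, 200),
    "min_p": (0, 1),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
    "repetition_penalty": (0.1, 2),
    "seed": (0, 2147483647),
}
REASONING_EFFORTS = {"none", "low", "medium", "high", "max"}


def validate_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not (low <= value <= high):
                problems.append(f"{name}={value} is outside {low} to {high}")
        elif name == "reasoning_effort" and value not in REASONING_EFFORTS:
            problems.append(f"reasoning_effort={value!r} is not a listed value")
    return problems
```

Parameters that are not in the table, including the additional ones mentioned in the docs, pass through unchecked, so the server remains the final authority on what is accepted.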

8 more parameters in the docs

Useful information

Gemma 4 26B-A4B API: common questions

How much does the Gemma 4 26B-A4B API cost?

On EmpirioLabs, Gemma 4 26B-A4B is billed pay as you go: input $0.05 (was $0.15) per 1M prompt tokens; output $0.29 (was $0.50) per 1M generated tokens; implicit cache $0.025 (was $0.15) per 1M cached input tokens. The live rate card on this page always matches what the API charges.

What is the context window of Gemma 4 26B-A4B?

Gemma 4 26B-A4B supports a 256K-token context window with up to 32,768 output tokens per response.

Is the Gemma 4 26B-A4B API OpenAI-compatible?

Yes. Gemma 4 26B-A4B serves the OpenAI-compatible Chat Completions API, so existing OpenAI SDKs work by pointing base_url at https://api.empiriolabs.ai/v1 and setting the model id to gemma-4-26b-a4b.

Can I try Gemma 4 26B-A4B in the browser before integrating?

Yes. The EmpirioLabs playground runs Gemma 4 26B-A4B in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a Gemma 4 26B-A4B API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.
