TTS 1.5 Max API: Pricing, Playground & Docs

About TTS 1.5 Max

Broadcast-quality voice synthesis with rich expressive prosody, 271+ voices across 15 languages, and real-time SSE streaming with per-word timestamps.

Also known as TTS Max, Inworld TTS 1.5 Max, TTS-1.5-Max, tts-1-5-max

multi speaker, real time, streaming, word timestamps, character timestamps, multilingual, expressive prosody, broadcast quality

TTS 1.5 Max specs

Model ID: tts-1-5-max
Provider: Inworld
Category: Audio Generation
Released: Jan 21, 2026
Input: Text
Output: Audio
Endpoints: POST /v1/audio/speech, POST /v1/audio/speech:stream, GET /v1/voices
Alternate model IDs: inworld-tts-1.5-max, tts-1.5-max

TTS 1.5 Max API pricing (save up to 15%)

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type	Spec	List rate	Discounted rate
Synthesis	per 1M characters	$35.00	$29.75
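For budgeting, the per-character rate can be turned into a quick estimate. The sketch below reads $35.00 as the list rate and $29.75 as the discounted rate, which is an interpretation of the rate card, and it assumes billing counts characters the same way Python's len() does. The function name estimate_cost is illustrative and not part of the API.

```python
from decimal import Decimal

# Rates copied from the rate card, in USD per 1M characters.
LIST_RATE_PER_MILLION = Decimal("35.00")
DISCOUNTED_RATE_PER_MILLION = Decimal("29.75")


def estimate_cost(char_count: int,
                  rate_per_million: Decimal = LIST_RATE_PER_MILLION) -> Decimal:
    """Estimate synthesis cost in USD, rounded to four decimal places."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    cost = Decimal(char_count) * rate_per_million / Decimal(1_000_000)
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    text = "Welcome to EmpirioLabs. Your build just finished."
    print(estimate_cost(len(text)))
    print(estimate_cost(250_000, DISCOUNTED_RATE_PER_MILLION))
```

Decimal is used instead of float so that small per-request amounts do not pick up binary rounding noise when summed.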

How to call the TTS 1.5 Max API

TTS 1.5 Max serves speech through POST /v1/audio/speech and returns playable audio. Send the text to speak in the input field, along with the model ID tts-1-5-max. Authenticate with an API key from the EmpirioLabs dashboard.

cURL

curl https://api.empiriolabs.ai/v1/audio/speech \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1-5-max",
    "input": "Welcome to EmpirioLabs. Your build just finished.",
    "output_format": "MP3"
  }' \
  --output speech.mp3

Python

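import requests

response = requests.post(
    "https://api.empiriolabs.ai/v1/audio/speech",
    headers={"Authorization": "Bearer YOUR_EMPIRIOLABS_API_KEY"},
    # output_format defaults to WAV, so request MP3 to match the file name.
    json={
        "model": "tts-1-5-max",
        "input": "Welcome to EmpirioLabs.",
        "output_format": "MP3",
    },
    timeout=60,
)
response.raise_for_status()  # fail loudly instead of saving an error body

# Write the returned audio bytes to disk.
with open("speech.mp3", "wb") as f:
    f.write(response.content)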

TTS 1.5 Max API parameters

Request parameters supported by the TTS 1.5 Max API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter	Type	Default	Range / values	Description
input	string	-	max 2000	Text to synthesize. Max 2,000 characters per request; chunk longer copy at sentence boundaries on the client.
voice	enum	Sarah	Sarah, Olivia, Elizabeth, Ashley, Wendy, Julia, Priya, Pixie,...	Voice preset. 20 hand-picked voices covering English + Spanish + Portuguese + Hindi + various accents. For the full 271-voice catalog (including cloned voices), use...
voice_id	string	-	-	Free-form voice ID. Overrides voice when set. Use this to address voices outside the curated 20-preset list. Inworld TTS 1.5 ships 271+ named voices across 15...
language	enum	en-US	en-US, en-GB, es-ES, es-MX, fr-FR, de-DE, it-IT, pt-BR, pt-PT...	BCP-47 language code. Inworld TTS 1.5 covers 15 languages.
output_format	enum	WAV	MP3, WAV, OGG, FLAC, PCM, ALAW, MULAW	Audio container/codec. WAV = LINEAR16 inside RIFF (ubiquitous). MP3 / OGG = compressed. PCM = headerless raw, useful for chunked real-time playback. FLAC = lossless.
sample_rate	enum	24000	8000, 16000, 22050, 24000, 32000, 44100, 48000	Output sample rate in Hz. 24000 is Inworld's default and what their voice models train at; raise to 48000 for broadcast quality.
speed	number	1	0.5 to 1.5	Speaking rate multiplier. 0.5 = half speed, 1.5 = 50% faster.
temperature	number	1	0.1 to 2	Voice expressiveness / variability. Lower = more consistent / "flat"; higher = more expressive but more variation between renders.
bit_rate	number	128000	32000 to 320000	Bitrate in bps for MP3 / OGG_OPUS. Ignored for other encodings.
apply_text_normalization	enum	ON	ON, OFF	When ON, Inworld expands numbers / abbreviations / dates into spoken form ("USD 5" → "five US dollars").
timestamp_type	enum	NONE	NONE, WORD, CHARACTER	If non-NONE, the response includes per-word or per-character timestamps in timestamp_info. Useful for caption / highlight UIs.
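The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: build_speech_payload is a hypothetical helper, it covers only a subset of the parameters, and it assumes the API accepts them as top-level JSON fields next to model and input, as in the examples above.

```python
from typing import Any, Dict, Optional

MODEL_ID = "tts-1-5-max"
MAX_INPUT_CHARS = 2000
OUTPUT_FORMATS = {"MP3", "WAV", "OGG", "FLAC", "PCM", "ALAW", "MULAW"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100, 48000}
TIMESTAMP_TYPES = {"NONE", "WORD", "CHARACTER"}


def build_speech_payload(
    text: str,
    *,
    voice: Optional[str] = None,
    voice_id: Optional[str] = None,
    output_format: str = "WAV",
    sample_rate: int = 24000,
    speed: float = 1.0,
    temperature: float = 1.0,
    timestamp_type: str = "NONE",
) -> Dict[str, Any]:
    """Validate values against the documented ranges and build the JSON body."""
    if not text or len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input must be 1 to {MAX_INPUT_CHARS} characters")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if not 0.5 <= speed <= 1.5:
        raise ValueError("speed must be between 0.5 and 1.5")
    if not 0.1 <= temperature <= 2:
        raise ValueError("temperature must be between 0.1 and 2")
    if timestamp_type not in TIMESTAMP_TYPES:
        raise ValueError(f"unsupported timestamp_type: {timestamp_type}")

    payload: Dict[str, Any] = {
        "model": MODEL_ID,
        "input": text,
        "output_format": output_format,
        "sample_rate": sample_rate,
        "speed": speed,
        "temperature": temperature,
        "timestamp_type": timestamp_type,
    }
    # voice_id overrides voice when set, so only one of the two is sent.
    if voice_id:
        payload["voice_id"] = voice_id
    elif voice:
        payload["voice"] = voice
    return payload


if __name__ == "__main__":
    print(build_speech_payload("Your build just finished.", voice="Olivia",
                               output_format="MP3", timestamp_type="WORD"))
```

Failing early on the client avoids spending a network round trip on a request the service would likely reject.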

Good to know

Limits

Max input: 2,000 characters per request (chunk longer text at sentence boundaries)
WebSocket: 20 concurrent connections, 5 contexts/connection
Per-WS message: 1,000 characters
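Chunking at sentence boundaries, as the input limit suggests, could look like the sketch below. It assumes sentences end in ".", "!" or "?" followed by whitespace, which is a simplification that will mis-split abbreviations such as "e.g.", and it hard-splits any single sentence longer than the limit. chunk_text is an illustrative name, not part of the API.

```python
import re
from typing import List

MAX_CHARS = 2000  # per-request input limit listed above


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence over the limit cannot be kept whole: hard-split it.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("One. Two. Three.", max_chars=10):
        print(repr(piece))
```

Each chunk can then be sent as its own request and the resulting audio segments concatenated in order on the client.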

Latency

p90 TTFB: under 250 ms (Inworld benchmark)
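The TTFB figure is the provider's benchmark, so it is worth measuring from your own network. Below is a rough sketch of one way to do that: measure_ttfb times the gap between opening a stream and receiving the first non-empty chunk, and p90 takes a nearest-rank 90th percentile over several runs. Both helper names are hypothetical, and open_stream stands in for whatever code sends the request and yields response bytes.

```python
import time
from typing import Callable, Iterable, List, Optional


def measure_ttfb(
    open_stream: Callable[[], Iterable[bytes]],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[float]:
    """Seconds from just before the request is opened to the first audio byte."""
    start = clock()
    for chunk in open_stream():
        if chunk:
            return clock() - start
    return None  # the stream ended without delivering any data


def p90(samples: List[float]) -> float:
    """Nearest-rank 90th percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = -(-9 * len(ordered) // 10)  # ceil(0.9 * n) using integer math
    return ordered[rank - 1]


if __name__ == "__main__":
    # Stand-in stream so the sketch runs offline; replace it with a function
    # that sends the real request and yields the response body in chunks.
    def fake_stream() -> Iterable[bytes]:
        time.sleep(0.05)
        yield b"\x00" * 1024

    runs = [measure_ttfb(fake_stream) for _ in range(5)]
    print(f"p90 TTFB over {len(runs)} runs: {p90(runs):.3f} s")
```

The clock starts before open_stream is called so that connection setup and request upload count toward the measurement.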

Voices

271+ named presets across 15 languages
20 hand-picked presets exposed in the dropdown; pass any other voice ID via voice_id

TTS 1.5 Max API: common questions

How much does the TTS 1.5 Max API cost?

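On EmpirioLabs, TTS 1.5 Max is billed pay as you go. The rate card lists synthesis at $35.00 per 1M characters, shown alongside a discounted rate of $29.75. The live rate card on this page always matches what the API charges.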

Which endpoint does TTS 1.5 Max use?

TTS 1.5 Max is served through POST /v1/audio/speech on api.empiriolabs.ai with standard bearer-token authentication.

Can I try TTS 1.5 Max in the browser before integrating?

Yes. The EmpirioLabs playground runs TTS 1.5 Max in the browser with the same parameters the API exposes, so you can test input text and voice settings before writing code.

How do I get a TTS 1.5 Max API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.
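Once a key exists, it is usually safer to read it from the environment than to paste it into source code. A small sketch follows, assuming the key is stored in the EMPIRIOLABS_API_KEY variable used by the cURL example; auth_headers is an illustrative helper name.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(
    env_var: str = "EMPIRIOLABS_API_KEY",
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build request headers from an API key stored in the environment."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    try:
        print(sorted(auth_headers()))
    except RuntimeError as err:
        print(err)
```

The returned dictionary can be passed as the headers argument of the HTTP client in use.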
