
Generates audio up to 3 minutes from text prompts, supporting text-to-audio and audio-to-audio with adjustable duration, steps, and CFG scale.
Also known as Stable Audio, Stable-Audio-2.0
Model ID: stable-audio-2-0
Endpoint: POST /v1/audio/generations
Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.
Stable Audio 2.0 runs through POST /v1/audio/generations. The request returns a job_id right away; poll GET /v1/jobs/{job_id} until the job completes and read the output URLs from the result. Get an API key from the EmpirioLabs dashboard.

Submit a generation request:

    curl https://api.empiriolabs.ai/v1/audio/generations \
      -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "stable-audio-2-0",
        "prompt": "Describe what you want Stable Audio 2.0 to generate."
      }'

Poll the job, replacing JOB_ID with the returned job_id:

    curl https://api.empiriolabs.ai/v1/jobs/JOB_ID \
      -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"

The same flow in Python:

    import time

    import requests

    response = requests.post(
        "https://api.empiriolabs.ai/v1/audio/generations",
        headers={"Authorization": "Bearer YOUR_EMPIRIOLABS_API_KEY"},
        json={
            "model": "stable-audio-2-0",
            "prompt": "Describe what you want Stable Audio 2.0 to generate.",
        },
    )
    job = response.json()

    # Generation runs as an async job. Poll until it completes.
    while True:
        status = requests.get(
            f"https://api.empiriolabs.ai/v1/jobs/{job['job_id']}",
            headers={"Authorization": "Bearer YOUR_EMPIRIOLABS_API_KEY"},
        ).json()
        if status.get("status") in ("completed", "failed"):
            print(status)
            break
        time.sleep(5)

Request parameters supported by the Stable Audio 2.0 API on EmpirioLabs. Defaults apply when a field is omitted.

| Parameter | Type | Default | Range / values | Description |
|---|---|---|---|---|
| prompt | string | - | - | What to generate. Be specific about genre, instruments, mood, and tempo. |
| mode | enum | text-to-audio | text-to-audio, audio-to-audio | text-to-audio: generate from prompt only. audio-to-audio: condition on a reference clip. |
| output_format | enum | mp3 | mp3, wav | Output audio file format. |
| duration | number | 190 | 1 to 190 | Clip length in seconds. Stable Audio 2.0 generates up to 190 seconds (3 minutes 10 seconds). |
| steps | number | 50 | 30 to 100 | Diffusion steps. More = higher fidelity, slower (and adds per-step credits). |
| cfg_scale | number | 7 | 1 to 25 | Classifier-free guidance. Higher = follows prompt more strictly. |
| strength | number | 1 | 0 to 1 | Audio-to-audio only. 0 = ignore reference, 1 = stay close to reference. |
| random_seed | boolean | true | - | If true, use a random seed each call. |
| seed | number | - | - | Reproducibility seed. Only used when random_seed=false. |
| audio_url | string | - | - | Reference audio URL for audio-to-audio mode. |
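
The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the EmpirioLabs API: `build_payload` and `_check_range` are hypothetical helper names, and the sketch assumes that the listed parameters are sent as top-level JSON fields next to `model` and `prompt`, and that `audio_url` is required in audio-to-audio mode. Neither assumption is confirmed by this page.

```python
MODEL_ID = "stable-audio-2-0"


def _check_range(name, value, low, high):
    """Raise ValueError if value falls outside the inclusive range [low, high]."""
    if not (low <= value <= high):
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_payload(prompt, *, mode="text-to-audio", duration=190, steps=50,
                  cfg_scale=7, strength=None, seed=None, audio_url=None,
                  output_format="mp3"):
    """Return a JSON-ready request body using the defaults and ranges from the table."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if mode not in ("text-to-audio", "audio-to-audio"):
        raise ValueError(f"unsupported mode: {mode}")
    if output_format not in ("mp3", "wav"):
        raise ValueError(f"unsupported output_format: {output_format}")
    _check_range("duration", duration, 1, 190)
    _check_range("steps", steps, 30, 100)
    _check_range("cfg_scale", cfg_scale, 1, 25)

    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "mode": mode,
        "output_format": output_format,
        "duration": duration,
        "steps": steps,
        "cfg_scale": cfg_scale,
    }

    if mode == "audio-to-audio":
        # Assumption: a reference clip is mandatory in this mode.
        if not audio_url:
            raise ValueError("audio_url is required for audio-to-audio mode")
        payload["audio_url"] = audio_url
        if strength is not None:
            _check_range("strength", strength, 0, 1)
            payload["strength"] = strength

    if seed is not None:
        # Per the table, seed is only used when random_seed is false.
        payload["random_seed"] = False
        payload["seed"] = seed

    return payload
```

The returned dict can be passed as the `json=` argument of `requests.post` in place of the inline body shown earlier, for example `build_payload("warm lo-fi piano, 80 bpm", duration=30, seed=42)`.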
Generates up to 3 minutes of audio from text or via audio-to-audio transformation.
On EmpirioLabs, Stable Audio 2.0 is billed pay as you go: Base Cost $0.58 per generation; Per Step Cost $0.00 per step. The live rate card on this page always matches what the API charges.
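For budgeting, the two quoted rates can be combined into a rough estimate. This is a sketch under one assumption that the page does not state explicitly: that the charge for a generation is the base cost plus the per-step cost multiplied by the number of steps. `estimate_cost` is a hypothetical helper, and the rates are hard-coded from the figures above, so they may drift from the live rate card.

```python
from decimal import Decimal

# Rates quoted on this page; check the live rate card before relying on them.
BASE_COST = Decimal("0.58")
PER_STEP_COST = Decimal("0.00")


def estimate_cost(steps=50, generations=1):
    """Estimate the charge in USD, assuming cost = base + per_step * steps."""
    if steps < 0 or generations < 0:
        raise ValueError("steps and generations must be non-negative")
    return (BASE_COST + PER_STEP_COST * steps) * generations
```

Using `Decimal` rather than floats keeps the cents exact when many generations are summed.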
Stable Audio 2.0 is served through POST /v1/audio/generations on api.empiriolabs.ai with standard bearer-token authentication.
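Because generation is asynchronous, a client usually wants polling that gives up after a deadline instead of looping forever. The sketch below shows one way to do that. `wait_for_job` and `fetch_status_http` are hypothetical names, the terminal statuses `completed` and `failed` are taken from the Python example on this page, and the 600-second timeout is an arbitrary choice. It also assumes the API key is available in the `EMPIRIOLABS_API_KEY` environment variable, as in the curl example.

```python
import os
import time

API_BASE = "https://api.empiriolabs.ai"
TERMINAL_STATUSES = ("completed", "failed")


def fetch_status_http(job_id):
    """Fetch the job record over HTTP using bearer-token authentication."""
    import requests  # third-party; imported here so the poller itself has no dependency

    resp = requests.get(
        f"{API_BASE}/v1/jobs/{job_id}",
        headers={"Authorization": f"Bearer {os.environ['EMPIRIOLABS_API_KEY']}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


def wait_for_job(fetch_status, job_id, *, timeout=600.0, interval=5.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until a terminal status or until timeout seconds pass."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

A call such as `wait_for_job(fetch_status_http, job["job_id"])` returns the final job record. Passing the fetch function as an argument keeps the loop easy to test without network access.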
Yes. The EmpirioLabs playground runs Stable Audio 2.0 in the browser with the same parameters the API exposes, so you can test prompts before writing code.
Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.