MOSS Video and Audio API: Pricing, Playground & Docs

About MOSS Video and Audio

Open-source 32B MoE foundation model that generates synchronized video and audio in one inference step with precise dual-tower lip-sync.

Also known as OpenMOSS MOSS Video and Audio, MOSS-Video-and-Audio

audio sync, lipsync

MOSS Video and Audio specs

Model ID: moss-video-and-audio
Provider: OpenMOSS
Category: Video Generation
Released: Jan 29, 2026
Input: Text, Image
Output: Video, Audio
Endpoints: POST /v1/videos/generations
Alternate model IDs: moss-video-audio, openmoss/video-audio

MOSS Video and Audio API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type	Spec	Rate
360p Video	per video	$0.17
720p Video	per video	$2.82
T2V Fast	additional fee	$0.065
T2V Quality	additional fee	$0.13
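
As a rough illustration of how these rates might combine, the sketch below estimates the price of a single video. It assumes the T2V fee is added on top of the per-video rate and applies only to text-to-video jobs; the page does not spell this out, so treat `estimate_cost` as a hypothetical helper rather than the billing logic itself.

```python
from decimal import Decimal

# Rates copied from the pricing table on this page (USD).
BASE_RATE = {"360p": Decimal("0.17"), "720p": Decimal("2.82")}
T2V_FEE = {"fast": Decimal("0.065"), "quality": Decimal("0.13")}


def estimate_cost(resolution="720p", mode="t2v", t2v_quality="quality"):
    """Estimate the price of one video in USD.

    Assumption: the T2V fee is added to the per-video rate and is
    charged only when mode is "t2v".
    """
    cost = BASE_RATE[resolution]
    if mode == "t2v":
        cost += T2V_FEE[t2v_quality]
    return cost


print(estimate_cost("360p", "t2v", "fast"))  # 0.235
print(estimate_cost("720p", "i2v"))          # 2.82
```

Decimal is used so the sub-cent T2V Fast fee adds up without floating-point rounding noise.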

How to call the MOSS Video and Audio API

MOSS Video and Audio runs through POST /v1/videos/generations. The request returns a job_id right away; poll GET /v1/jobs/{job_id} until the job completes and read the output URLs from the result. Get an API key from the EmpirioLabs dashboard.

cURL: submit the job

curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moss-video-and-audio",
    "prompt": "Describe what you want MOSS Video and Audio to generate."
  }'

cURL: poll for the result

curl https://api.empiriolabs.ai/v1/jobs/JOB_ID \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"

Python

import time

import requests

API_BASE = "https://api.empiriolabs.ai/v1"
HEADERS = {"Authorization": "Bearer YOUR_EMPIRIOLABS_API_KEY"}

# Submit the generation job; the response carries a job_id.
response = requests.post(
    f"{API_BASE}/videos/generations",
    headers=HEADERS,
    json={
        "model": "moss-video-and-audio",
        "prompt": "Describe what you want MOSS Video and Audio to generate.",
    },
)
response.raise_for_status()  # Stop early on an HTTP error such as a bad API key.
job = response.json()

# Generation runs as an async job. Poll until it completes or fails.
while True:
    status = requests.get(f"{API_BASE}/jobs/{job['job_id']}", headers=HEADERS).json()
    if status.get("status") in ("completed", "failed"):
        print(status)
        break
    time.sleep(5)

MOSS Video and Audio API parameters

Request parameters supported by the MOSS Video and Audio API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter	Type	Default	Range / values	Description
prompt	string	-	-	Scene description. When an image is attached, it acts as the image-to-video prompt.
mode	enum	t2v	t2v, i2v	t2v: pure text-to-video. i2v: animate the attached image.
resolution	enum	720p	360p, 720p	720p uses a separate higher-VRAM endpoint.
aspect_ratio	enum	landscape	landscape, portrait	MOSS only supports landscape (16:9) and portrait (9:16).
duration	number	8	2 to 8	Clip length in seconds. The upstream model is hard-capped at 8s.
t2v_quality	enum	quality	fast, quality	Text-to-video only. fast trades fidelity for ~2× speed.
num_inference_steps	number	25	10 to 50	Diffusion steps. More = higher fidelity, slower.
cfg_scale	number	5	1 to 10	Classifier-free guidance. Higher = follows prompt more strictly.
sigma_shift	number	5	1 to 10	Schedule shift. Only valid when resolution=360p.
image	string	-	-	Reference image URL for i2v mode.
negative_prompt	string	-	-	What to avoid.
seed	number	-	-	Reproducibility seed.
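
The ranges in the table can be checked on the client before a job is submitted. The following is a minimal sketch: `build_payload` is a hypothetical helper, not part of any EmpirioLabs SDK, and the rule that i2v requires an `image` is an assumption drawn from the parameter descriptions. The server may validate differently.

```python
# Allowed values and ranges transcribed from the parameter table above.
ENUMS = {
    "mode": ("t2v", "i2v"),
    "resolution": ("360p", "720p"),
    "aspect_ratio": ("landscape", "portrait"),
    "t2v_quality": ("fast", "quality"),
}
RANGES = {
    "duration": (2, 8),
    "num_inference_steps": (10, 50),
    "cfg_scale": (1, 10),
    "sigma_shift": (1, 10),
}


def build_payload(prompt, **params):
    """Return a request body, raising ValueError on out-of-range values."""
    for name, value in params.items():
        if name in ENUMS and value not in ENUMS[name]:
            raise ValueError(f"{name} must be one of {ENUMS[name]}")
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
    # Omitted fields fall back to the table defaults (720p, t2v).
    if "sigma_shift" in params and params.get("resolution", "720p") != "360p":
        raise ValueError("sigma_shift is only valid when resolution=360p")
    # Assumption: i2v needs the reference image to be present.
    if params.get("mode", "t2v") == "i2v" and "image" not in params:
        raise ValueError("i2v mode needs an image URL")
    return {"model": "moss-video-and-audio", "prompt": prompt, **params}
```

For example, `build_payload("A violinist on a rooftop", resolution="360p", sigma_shift=3)` returns a body ready to pass as `json=` to the request, while `build_payload("x", duration=12)` raises `ValueError`.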

Good to know

32B-parameter MoE with synchronized lip-sync video + audio in a single inference.

Constraints

Generation can take 20+ minutes
Image-to-Video typically yields superior results to text-to-video
Only 1 image supported (used as the first frame)
Video inputs NOT supported
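
Because a generation can run for 20 minutes or more, a polling loop usually benefits from an upper bound. The sketch below is one way to add it, assuming the `completed` and `failed` status values used in the Python example on this page; `wait_for_job`, the 45-minute timeout, and the 15-second interval are illustrative choices, not documented recommendations.

```python
import time


def wait_for_job(fetch_status, timeout_s=45 * 60, interval_s=15, sleep=time.sleep):
    """Poll until the job reports "completed" or "failed", or time runs out.

    fetch_status is any zero-argument callable that returns the job JSON
    as a dict, for example a wrapper around GET /v1/jobs/{job_id}.
    """
    waited = 0
    while True:
        status = fetch_status()
        if status.get("status") in ("completed", "failed"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job still running after {waited} seconds")
        sleep(interval_s)
        waited += interval_s
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, so it can wrap a `requests.get(...).json()` call in production and a stub in tests.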

Image formats

jpg, jpeg, png, webp, heic, heif, bmp, tiff, tif, gif
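
A quick client-side check against this list can catch an unsupported file before a long job is queued. This is only a sketch: `has_supported_extension` is a hypothetical helper that looks at the URL's file extension, and the API may well identify formats from the file contents instead.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the image formats list above.
SUPPORTED = {"jpg", "jpeg", "png", "webp", "heic", "heif", "bmp", "tiff", "tif", "gif"}


def has_supported_extension(image_url):
    """Return True if the URL path ends in one of the listed extensions."""
    suffix = PurePosixPath(urlparse(image_url).path).suffix
    return suffix.lower().lstrip(".") in SUPPORTED


print(has_supported_extension("https://example.com/a/frame.PNG?w=1"))  # True
print(has_supported_extension("https://example.com/clip.mp4"))         # False
```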

MOSS Video and Audio API: common questions

How much does the MOSS Video and Audio API cost?

On EmpirioLabs, MOSS Video and Audio is billed pay-as-you-go: 360p Video is $0.17 per video and 720p Video is $2.82 per video, with additional fees of $0.065 for T2V Fast and $0.13 for T2V Quality. The live rate card on this page always matches what the API charges.

Which endpoint does MOSS Video and Audio use?

MOSS Video and Audio is served through POST /v1/videos/generations on api.empiriolabs.ai with standard bearer-token authentication.

Can I try MOSS Video and Audio in the browser before integrating?

Yes. The EmpirioLabs playground runs MOSS Video and Audio in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a MOSS Video and Audio API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.
