MOSS Video and Audio API

Open-source 32B MoE foundation model that generates synchronized video and audio in one inference step with precise dual-tower lip-sync.

About MOSS Video and Audio

audio sync, lipsync

MOSS Video and Audio specs

Model ID: moss-video-and-audio
Provider: OpenMOSS
Category: Video Generation
Input: text, image
Output: video, audio
Endpoints: POST /v1/videos/generations

MOSS Video and Audio API pricing

Live pay-as-you-go rates from the EmpirioLabs catalog. You are billed only for what you use, with no monthly minimum.

Type | Spec | Rate
360p Video | per video | $0.17
720p Video | per video | $2.82
T2V Fast | additional fee | $0.065
T2V Quality | additional fee | $0.13
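
To make the rate card concrete, here is a minimal sketch of a per-video cost estimate. The rates are copied from the table above. How the "additional fee" rows combine with the base rate is not stated on this page, so the function assumes the T2V fee is added to the per-video rate only in text-to-video mode; `estimate_cost` and its argument names are hypothetical helpers, not part of the API.

```python
from decimal import Decimal

# Rates in USD, copied from the pricing table on this page.
BASE_RATE = {"360p": Decimal("0.17"), "720p": Decimal("2.82")}
T2V_FEE = {"fast": Decimal("0.065"), "quality": Decimal("0.13")}


def estimate_cost(resolution: str, mode: str = "t2v",
                  t2v_quality: str = "quality") -> Decimal:
    """Estimate the charge for one video.

    ASSUMPTION: the T2V fee is added on top of the base rate, and only
    when mode is "t2v". The page does not spell out how fees combine.
    """
    cost = BASE_RATE[resolution]
    if mode == "t2v":
        cost += T2V_FEE[t2v_quality]
    return cost
```

Under that assumption, `estimate_cost("360p", "t2v", "fast")` returns `Decimal("0.235")`. Treat the result as a rough budget figure and check it against actual billing.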
How to call the MOSS Video and Audio API

MOSS Video and Audio runs through POST /v1/videos/generations. The request returns a job_id right away; poll GET /v1/jobs/{job_id} until the job completes and read the output URLs from the result. Get an API key from the EmpirioLabs dashboard.

cURL: submit the job
curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moss-video-and-audio",
    "prompt": "Describe what you want MOSS Video and Audio to generate."
  }'
cURL: poll for the result
curl https://api.empiriolabs.ai/v1/jobs/JOB_ID \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
Python
import os
import time

import requests

API_BASE = "https://api.empiriolabs.ai/v1"
# Read the key from the environment instead of hardcoding it.
HEADERS = {"Authorization": f"Bearer {os.environ['EMPIRIOLABS_API_KEY']}"}

# Submit the generation request; the response carries a job_id.
response = requests.post(
    f"{API_BASE}/videos/generations",
    headers=HEADERS,
    json={
        "model": "moss-video-and-audio",
        "prompt": "Describe what you want MOSS Video and Audio to generate.",
    },
    timeout=30,
)
response.raise_for_status()
job = response.json()

# Generation runs as an async job. Poll until it completes or fails.
while True:
    poll = requests.get(f"{API_BASE}/jobs/{job['job_id']}", headers=HEADERS, timeout=30)
    poll.raise_for_status()
    status = poll.json()
    if status.get("status") in ("completed", "failed"):
        print(status)
        break
    time.sleep(5)
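
Because generation can take 20 minutes or more, an unbounded polling loop may be undesirable in production. The following is a minimal sketch of a bounded wait. `wait_for_job`, `fetch_status`, and the default timeout and interval are hypothetical choices for illustration; only the terminal status values `completed` and `failed` come from the example above.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[], dict],
                 timeout_s: float = 3600.0,
                 interval_s: float = 15.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll fetch_status() until the job is terminal or the deadline passes.

    fetch_status is any zero-argument callable that returns the parsed
    JSON body of GET /v1/jobs/{job_id}. sleep and clock are injectable
    so the loop can be tested without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        # Terminal states as used in the polling example on this page.
        if status.get("status") in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job not finished after {timeout_s} seconds")
        sleep(interval_s)
```

To use it with the API, pass a callable that performs the authenticated GET request and returns `.json()`. A longer interval than 5 seconds keeps request volume low for jobs that run for many minutes.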
MOSS Video and Audio API parameters

Request parameters supported by the MOSS Video and Audio API on EmpirioLabs. Defaults apply when a field is omitted.

Parameter | Type | Default | Range / values | Description
prompt | string | - | - | Scene description. With image attached, becomes an image-to-video prompt.
mode | enum | t2v | t2v, i2v | t2v: pure text-to-video. i2v: animate the attached image.
resolution | enum | 720p | 360p, 720p | 720p uses a separate higher-VRAM endpoint.
aspect_ratio | enum | landscape | landscape, portrait | MOSS only supports landscape (16:9) and portrait (9:16).
duration | number | 8 | 2 to 8 | Clip length in seconds. The upstream model is hard-capped at 8s.
t2v_quality | enum | quality | fast, quality | Text-to-video only. fast trades fidelity for ~2× speed.
num_inference_steps | number | 25 | 10 to 50 | Diffusion steps. More = higher fidelity, slower.
cfg_scale | number | 5 | 1 to 10 | Classifier-free guidance. Higher = follows prompt more strictly.
sigma_shift | number | 5 | 1 to 10 | Schedule shift. Only valid when resolution=360p.
image | string | - | - | Reference image URL for i2v mode.
negative_prompt | string | - | - | What to avoid.
seed | number | - | - | Reproducibility seed.
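
Since each job is billed and can run for a long time, it may help to check a payload against the documented ranges before submitting it. The sketch below is a client-side check only. `validate_params` is a hypothetical helper, the ranges are assumed to be inclusive, and the cross-field rules (including the requirement that i2v mode carries an image) are this sketch's reading of the table rather than confirmed server behavior.

```python
# Allowed values and ranges, copied from the parameter table on this page.
ENUMS = {
    "mode": {"t2v", "i2v"},
    "resolution": {"360p", "720p"},
    "aspect_ratio": {"landscape", "portrait"},
    "t2v_quality": {"fast", "quality"},
}
RANGES = {  # ASSUMPTION: bounds are inclusive.
    "duration": (2, 8),
    "num_inference_steps": (10, 50),
    "cfg_scale": (1, 10),
    "sigma_shift": (1, 10),
}


def validate_params(params: dict) -> list[str]:
    """Return a list of problems found in a request payload (empty if none)."""
    errors = []
    for name, allowed in ENUMS.items():
        if name in params and params[name] not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    # Fall back to the documented defaults for the cross-field checks.
    mode = params.get("mode", "t2v")
    resolution = params.get("resolution", "720p")
    if "sigma_shift" in params and resolution != "360p":
        errors.append("sigma_shift is only valid when resolution=360p")
    if "t2v_quality" in params and mode != "t2v":
        errors.append("t2v_quality applies to text-to-video only")
    # ASSUMPTION: i2v mode needs an image URL in the same request.
    if mode == "i2v" and not params.get("image"):
        errors.append("i2v mode needs an image URL")
    return errors
```

For example, `validate_params({"prompt": "a harbor at dawn", "sigma_shift": 3})` reports one problem, because `resolution` defaults to 720p and `sigma_shift` is documented as valid only at 360p.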

Good to know

A 32B-parameter MoE model that generates lip-synced video and audio together in a single inference step.

Constraints

  • Generation can take 20+ minutes
  • Image-to-Video typically yields superior results to text-to-video
  • Only 1 image supported (used as the first frame)
  • Video inputs are not supported

Image formats

  • jpg, jpeg, png, webp, heic, heif, bmp, tiff, tif, gif
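
The format list above can be turned into a quick pre-flight check on the `image` URL. This is a heuristic sketch: `has_supported_extension` is a hypothetical helper that looks only at the file extension in the URL path, so it says nothing about the actual file contents, and a URL without an extension will be rejected even if it serves a valid image.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the "Image formats" list on this page.
SUPPORTED_EXTENSIONS = {
    "jpg", "jpeg", "png", "webp", "heic", "heif", "bmp", "tiff", "tif", "gif",
}


def has_supported_extension(image_url: str) -> bool:
    """Return True if the URL path ends in a listed image extension."""
    # urlparse separates the path from any query string or fragment.
    path = urlparse(image_url).path
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

The comparison is case-insensitive and ignores query strings, so a signed URL such as `https://example.com/frame.PNG?sig=abc` passes while `https://example.com/clip.mp4` does not.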

MOSS Video and Audio API: common questions

How much does the MOSS Video and Audio API cost?

On EmpirioLabs, MOSS Video and Audio is billed pay as you go: 360p Video $0.17 per video; 720p Video $2.82 per video; T2V Fast $0.065 additional fee; T2V Quality $0.13 additional fee. The live rate card on this page always matches what the API charges.

Which endpoint does MOSS Video and Audio use?

MOSS Video and Audio is served through POST /v1/videos/generations on api.empiriolabs.ai with standard bearer-token authentication.

Can I try MOSS Video and Audio in the browser before integrating?

Yes. The EmpirioLabs playground runs MOSS Video and Audio in the browser with the same parameters the API exposes, so you can test prompts before writing code.

How do I get a MOSS Video and Audio API key?

Create an EmpirioLabs account, then generate a key under API Keys in the dashboard. Billing is pay-as-you-go credits, so you only pay for the requests you make.
