HappyHorse 1.1 API: Text, Image and Reference to Video

Jun 22, 2026

EmpirioLabs AI

HappyHorse 1.1 is a video generation model from Alibaba, now live on EmpirioLabs. It is one model with three modes, picked automatically from what you send:

Text to video: send a prompt only.
Image to video: attach one image to animate it.
Reference to video: attach up to 9 reference images and refer to them in the prompt as character1, character2, and so on.

Every mode renders at 720p or 1080p for 3 to 15 seconds, with synchronized native audio. HappyHorse 1.1 improves visual quality, motion smoothness, character consistency across clips, and text rendering over the previous version.

Pricing

HappyHorse 1.1 is pay-as-you-go, billed per second of generated video, with a lower rate at 720p and a higher rate at 1080p. There is no subscription. See the live per-second rates on the HappyHorse 1.1 model page and the pricing page.
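
As a rough illustration of per-second billing, the sketch below multiplies clip length by a rate chosen by resolution. The rates in `EXAMPLE_RATES` are placeholders invented for this example, not EmpirioLabs prices, and the function name is hypothetical; substitute the live figures from the pricing page.

```python
from decimal import Decimal

# Placeholder rates in USD per second of generated video.
# These are NOT real prices; replace them with the live rates.
EXAMPLE_RATES = {"720p": Decimal("0.10"), "1080p": Decimal("0.20")}


def estimate_cost(duration_s: int, resolution: str, rates=EXAMPLE_RATES) -> Decimal:
    """Return duration multiplied by the per-second rate for the resolution."""
    if resolution not in rates:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return rates[resolution] * duration_s
```

With real rates plugged in, `estimate_cost(5, "720p")` gives the expected charge for a five-second 720p clip.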

Quickstart

Video generation is asynchronous. Submit a request to get a job ID, then poll the job until the video is ready.

curl -X POST https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "happyhorse-1-1",
    "prompt": "A red fox trotting across a snowy field at sunrise, cinematic",
    "resolution": "720p",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'

The response includes a job ID and a poll URL. Poll that URL until the status is completed and a video URL is returned:

curl https://api.empiriolabs.ai/v1/jobs/JOB_ID \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
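
The polling step can be wrapped in a small loop. The following is a minimal Python sketch using only the standard library. The response field names `status` and `video_url`, the `failed` status value, and the helper names are assumptions made for illustration; check the API reference for the actual response shape.

```python
import json
import time
import urllib.request

API_BASE = "https://api.empiriolabs.ai/v1"


def fetch_job(job_id: str, api_key: str) -> dict:
    """GET /v1/jobs/{job_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_video(fetch, interval_s: float = 5.0, max_polls: int = 120) -> str:
    """Call fetch() until the job completes, then return the video URL.

    Assumed response fields: "status" and "video_url".
    Assumed terminal failure status: "failed".
    """
    for _ in range(max_polls):
        job = fetch()
        status = job.get("status")
        if status == "completed":
            return job["video_url"]
        if status == "failed":
            raise RuntimeError(f"job failed: {job}")
        time.sleep(interval_s)
    raise TimeoutError("job did not complete within the polling budget")
```

A call might look like `wait_for_video(lambda: fetch_job("JOB_ID", api_key))`, with `api_key` read from the `EMPIRIOLABS_API_KEY` environment variable. Passing the fetch step in as a callable keeps the loop easy to exercise without network access.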

For image to video, add an image. For reference to video, pass several images and name them in the prompt:

{
  "model": "happyhorse-1-1",
  "prompt": "character1 walks through character2 at golden hour, cinematic",
  "image": ["https://example.com/portrait.jpg", "https://example.com/scene.jpg"],
  "resolution": "1080p",
  "duration": 5
}
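
Because reference images are named in the prompt, it can help to check that every characterN in the prompt has a matching image before submitting. The sketch below builds the request body shown above and performs that check. It assumes characterN refers to the Nth entry of the `image` array, counting from 1, which the example implies but does not state; the function name is hypothetical.

```python
import re

MAX_REFERENCE_IMAGES = 9


def build_reference_payload(prompt: str, image_urls: list[str],
                            resolution: str = "1080p", duration: int = 5) -> dict:
    """Build a reference-to-video body and verify the characterN names.

    Assumption: characterN maps to the Nth image in the list (1-based).
    """
    # A single image would be auto-detected as image to video, so require two.
    if not 2 <= len(image_urls) <= MAX_REFERENCE_IMAGES:
        raise ValueError("reference to video expects 2 to 9 images")
    used = {int(n) for n in re.findall(r"\bcharacter(\d+)\b", prompt)}
    missing = sorted(n for n in used if not 1 <= n <= len(image_urls))
    if missing:
        raise ValueError(f"prompt names images that were not sent: {missing}")
    return {
        "model": "happyhorse-1-1",
        "prompt": prompt,
        "image": list(image_urls),
        "resolution": resolution,
        "duration": duration,
    }
```

Serializing the returned dictionary with `json.dumps` yields a body in the same shape as the JSON example above.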

Good to know

One model, modes auto-detected. Send text for text to video, one image for image to video, or several images for reference to video. You can also set the mode parameter explicitly.
Reference to video supports up to 9 images. Name them in the prompt as character1, character2, and so on for the strongest consistency.
Image to video follows the source image for aspect ratio. Text and reference modes accept 16:9, 9:16, 1:1, 4:3, and 3:4.
Native audio is always generated. No separate audio step is needed.
Image input only. HappyHorse 1.1 does not take video input.
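
The limits above can be checked on the client before a request is sent. The following sketch mirrors the auto-detection rule and the documented ranges. The mode labels it returns are descriptive only and are not claimed to be the values the mode parameter accepts, and whether the API rejects or ignores an aspect_ratio in image to video is not stated, so the check only flags it.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720p", "1080p"}


def infer_mode(image_count: int) -> str:
    """Mirror the auto-detection rule; labels are illustrative, not API values."""
    if not 0 <= image_count <= 9:
        raise ValueError("image count must be between 0 and 9")
    if image_count == 0:
        return "text-to-video"
    if image_count == 1:
        return "image-to-video"
    return "reference-to-video"


def check_request(payload: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    images = payload.get("image") or []
    if isinstance(images, str):  # tolerate a single URL given as a string
        images = [images]
    if len(images) > 9:
        problems.append("at most 9 reference images are supported")
    if "resolution" in payload and payload["resolution"] not in RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if "duration" in payload and not 3 <= payload["duration"] <= 15:
        problems.append("duration must be between 3 and 15 seconds")
    if "aspect_ratio" in payload:
        if len(images) == 1:
            problems.append("image to video follows the source image; "
                            "aspect_ratio is not expected")
        elif payload["aspect_ratio"] not in ASPECT_RATIOS:
            problems.append("unsupported aspect_ratio")
    return problems
```

Running `check_request` on the Quickstart body returns an empty list, since 720p, 5 seconds, and 16:9 all fall inside the documented ranges.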

Try it now in the playground or read the full API reference.
