HappyHorse 1.1 is a video generation model from Alibaba, now live on EmpirioLabs. It is one model with three modes, picked automatically from what you send:
- Text to video: send a prompt only.
- Image to video: attach one image to animate it.
- Reference to video: attach up to 9 reference images and refer to them in the prompt as character1, character2, and so on.
Every mode renders clips of 3 to 15 seconds at 720p or 1080p, with synchronized native audio. Compared with the previous version, HappyHorse 1.1 improves visual quality, motion smoothness, character consistency across clips, and text rendering.
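As a rough client-side illustration of the auto-detection described above, the sketch below guesses the mode from the number of attached images. The function name and the returned labels are hypothetical and are not the API's own identifiers; the actual detection happens on the server.

```python
def infer_mode(image_urls):
    """Guess which mode a request would trigger from its image count.

    The returned labels are illustrative only, not values the API defines.
    """
    count = len(image_urls)
    if count == 0:
        return "text-to-video"        # prompt only
    if count == 1:
        return "image-to-video"       # one image to animate
    if count <= 9:
        return "reference-to-video"   # several reference images
    raise ValueError("reference to video accepts at most 9 images")
```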
Pricing
HappyHorse 1.1 is pay-as-you-go, billed per second of generated video, with a lower rate at 720p and a higher rate at 1080p. There is no subscription. See the live per-second rates on the HappyHorse 1.1 model page and the pricing page.
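Because billing is per second of generated video, a clip's cost can be estimated as duration times the per-second rate for its resolution. The sketch below assumes exactly that formula; the rates in it are placeholders invented for illustration, not real prices, so substitute the live rates from the pricing page.

```python
def estimate_cost(duration_seconds, resolution, rates_per_second):
    """Estimate the cost of one clip as duration x per-second rate."""
    if resolution not in rates_per_second:
        raise ValueError(f"no rate known for resolution {resolution!r}")
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return duration_seconds * rates_per_second[resolution]


# PLACEHOLDER rates for illustration only. These are NOT real prices.
EXAMPLE_RATES = {"720p": 0.10, "1080p": 0.20}

print(estimate_cost(5, "720p", EXAMPLE_RATES))
print(estimate_cost(5, "1080p", EXAMPLE_RATES))
```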
Quickstart
Video generation is asynchronous. Submit a request to get a job ID, then poll until the video is ready.
curl -X POST https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "happyhorse-1-1",
    "prompt": "A red fox trotting across a snowy field at sunrise, cinematic",
    "resolution": "720p",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'
The response includes a job ID and a poll URL. Poll that URL until the status is completed and a video URL is returned:
curl https://api.empiriolabs.ai/v1/jobs/JOB_ID \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
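The polling step can be wrapped in a small loop. The following is a minimal Python sketch using only the standard library. It assumes the job response is JSON with a `status` field that becomes `completed`, as described above. The `failed` and `cancelled` status values, the polling interval, and the timeout are assumptions, so check the API reference for the actual values.

```python
import json
import time
import urllib.request

API_BASE = "https://api.empiriolabs.ai/v1"


def fetch_job(job_id, api_key):
    """GET /v1/jobs/{job_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_video(job_id, api_key, fetch=fetch_job,
                   interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll a job until its status is "completed", then return the job body."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id, api_key)
        status = job.get("status")
        if status == "completed":
            return job
        # Assumed terminal failure states; the real names may differ.
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout} s")
        sleep(interval)
```

Calling `wait_for_video(job_id, os.environ["EMPIRIOLABS_API_KEY"])` would block until the job finishes. The `fetch` and `sleep` parameters exist so the loop can be exercised without network access.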
For image to video, add an image. For reference to video, pass several images and name them in the prompt:
{
  "model": "happyhorse-1-1",
  "prompt": "character1 walks through character2 at golden hour, cinematic",
  "image": ["https://example.com/portrait.jpg", "https://example.com/scene.jpg"],
  "resolution": "1080p",
  "duration": 5
}
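A request body like the one above can be assembled and sanity-checked before sending. The sketch below is a hypothetical helper, not part of any SDK. It assumes that character1 refers to the first image in the array, character2 to the second, and so on; the text above does not state that ordering explicitly.

```python
import re

MAX_REFERENCE_IMAGES = 9


def build_reference_payload(prompt, image_urls, resolution="1080p", duration=5):
    """Build a reference-to-video request body and check characterN names."""
    if not 1 <= len(image_urls) <= MAX_REFERENCE_IMAGES:
        raise ValueError("pass between 1 and 9 reference images")
    # Collect every N used as characterN in the prompt.
    referenced = {int(n) for n in re.findall(r"character(\d+)", prompt)}
    # Assumption: characterN maps to the N-th image (1-based) in the array.
    out_of_range = sorted(n for n in referenced
                          if not 1 <= n <= len(image_urls))
    if out_of_range:
        raise ValueError(f"prompt names images that were not sent: {out_of_range}")
    return {
        "model": "happyhorse-1-1",
        "prompt": prompt,
        "image": list(image_urls),
        "resolution": resolution,
        "duration": duration,
    }
```

The returned dictionary can be serialized with `json.dumps` and posted to the generations endpoint shown in the Quickstart.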
Good to know
- One model, modes auto-detected. Send text for text to video, one image for image to video, or several images for reference to video. You can also set the mode parameter explicitly.
- Reference to video supports up to 9 images. Name them in the prompt as character1, character2, and so on for the strongest consistency.
- Image to video takes its aspect ratio from the source image. Text to video and reference to video accept 16:9, 9:16, 1:1, 4:3, and 3:4.
- Native audio is always generated. No separate audio step is needed.
- Image input only. HappyHorse 1.1 does not take video input.
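The limits listed in this post can be checked on the client before a request is submitted. The sketch below is a hypothetical pre-flight check, not the server's own validation. It only inspects fields that are present, since default values are not documented here, and it skips the aspect ratio check when exactly one image is attached because image to video follows the source image.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def validate_request(payload):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []

    if "resolution" in payload and payload["resolution"] not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")

    if "duration" in payload:
        duration = payload["duration"]
        if not isinstance(duration, (int, float)) or not 3 <= duration <= 15:
            problems.append("duration must be between 3 and 15 seconds")

    # Normalize the image field to a list so a single URL string also works.
    images = payload.get("image", [])
    if isinstance(images, str):
        images = [images]

    # With one image, the aspect ratio comes from the image itself.
    if "aspect_ratio" in payload and len(images) != 1:
        if payload["aspect_ratio"] not in ALLOWED_ASPECT_RATIOS:
            problems.append("aspect_ratio must be one of 16:9, 9:16, 1:1, 4:3, 3:4")

    return problems
```

An empty list means the request respects the documented limits; it does not guarantee that the API will accept it.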
Try it now in the playground or read the full API reference.



