Fugu Ultra vs GLM 5.2: Coding Compared (2026)

Side by side coding test: Fugu Ultra and GLM 5.2 each rendering a plasma effect from a single HTML file, on EmpirioLabs.

Jun 24, 2026

EmpirioLabs AI

We gave two frontier models the exact same five coding prompts and recorded what each one built. No edits, no retries, no cherry-picking. Fugu Ultra from Sakana AI and GLM 5.2 from Z.ai each wrote a self-playing Asteroids, a self-playing Pong, a plasma field, a wormhole tunnel, and a hyperspace starfield, each as a single self-contained HTML file with no libraries. Both models run on EmpirioLabs behind one OpenAI-compatible API, so every run used the same request body with only the model name swapped.

Watch all five tests

How we ran it

Each prompt went to each model as a single user message, one shot, and we rendered exactly what came back with no edits. Reasoning ran at its maximum for both models: Fugu Ultra's thinking is always on, and GLM 5.2 was set to its highest reasoning effort. We set no temperature override and no system prompt, and capped output at 32,000 tokens. Every prompt asked for a single self-contained HTML file with all CSS and JavaScript inline, no external libraries, no CDN links, and no imports.
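If you want to confirm that a returned file really meets the "single self-contained HTML file" rule, a rough check is easy to script. The sketch below is our illustration, not the harness used for this test: the class and function names are hypothetical, and the import check is a simple heuristic that only looks for `import` at the start of a line inside inline scripts.

```python
import re
from html.parser import HTMLParser


class ExternalRefFinder(HTMLParser):
    """Collects tags that load code or styles from outside the file."""

    def __init__(self):
        super().__init__()
        self.external = []        # (tag, url) pairs found so far
        self.script_chunks = []   # text of inline <script> blocks
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self._in_script = True
            if attrs.get("src"):
                self.external.append(("script", attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower()
            if "stylesheet" in rel:
                self.external.append(("link", attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.script_chunks.append(data)


def find_external_dependencies(html_text):
    """Return a list of (tag, detail) problems; an empty list means none found."""
    finder = ExternalRefFinder()
    finder.feed(html_text)
    finder.close()
    problems = list(finder.external)
    script = "\n".join(finder.script_chunks)
    # Heuristic: a static ES module import starts a line with `import`.
    if re.search(r"^\s*import\b", script, re.MULTILINE):
        problems.append(("script", "import statement"))
    return problems
```

An empty list from `find_external_dependencies(html_text)` suggests the file is self-contained. It does not prove the code works, so you still need to open the file in a browser.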

The results

Both models returned working code on all five prompts on the first try. Here is the size of each answer, measured in lines of the final HTML file.

Test	Fugu Ultra	GLM 5.2
Self-playing Asteroids	948 lines	656 lines
Self-playing Pong	486 lines	412 lines
Plasma field	298 lines	131 lines
Wormhole tunnel	255 lines	199 lines
Hyperspace starfield	241 lines	166 lines
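To put the table in relative terms, the short sketch below computes how many lines Fugu Ultra wrote for each line GLM 5.2 wrote. The numbers are copied from the table above; the variable and function names are ours, chosen for illustration.

```python
# Line counts from the results table: (Fugu Ultra, GLM 5.2)
line_counts = {
    "Self-playing Asteroids": (948, 656),
    "Self-playing Pong": (486, 412),
    "Plasma field": (298, 131),
    "Wormhole tunnel": (255, 199),
    "Hyperspace starfield": (241, 166),
}


def fugu_to_glm_ratio(fugu_lines, glm_lines):
    """Lines written by Fugu Ultra per line written by GLM 5.2."""
    return fugu_lines / glm_lines


# Per-test ratio, rounded to two decimals.
ratios = {
    test: round(fugu_to_glm_ratio(fugu, glm), 2)
    for test, (fugu, glm) in line_counts.items()
}

# Totals across all five tests.
total_fugu = sum(fugu for fugu, _ in line_counts.values())
total_glm = sum(glm for _, glm in line_counts.values())
overall_ratio = round(total_fugu / total_glm, 2)

for test, ratio in ratios.items():
    print(f"{test}: {ratio}x")
print(f"Overall: {total_fugu} vs {total_glm} lines ({overall_ratio}x)")
```

Line count measures verbosity, not quality, so treat these ratios as a description of style rather than a score.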

What we noticed

The two models work very differently under the hood, and the test shows it. Fugu Ultra is a multi-agent orchestration model: it runs several internal reasoning passes before it answers, so it spent far longer per task and produced much more reasoning along the way. It also wrote more lines of code on every prompt, from about 18% more on Pong to more than twice as many on the plasma field. GLM 5.2 is a fast single-pass model with a 1M-token context window, and it returned tighter files in a fraction of the time. Neither approach is the winner here. They are built for different jobs, and the right pick depends on whether you want maximum depth per request or speed and volume.

We are deliberately not naming a winner. Watch the clip, see how each render looks and behaves, and judge for your own use case.

Run the same test yourself

Both models serve the OpenAI-compatible Chat Completions API, so switching between them is a one-line change. Point base_url at https://api.empiriolabs.ai/v1 and set the model id to fugu-ultra or glm-5-2.

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fugu-ultra",
    "messages": [{"role": "user", "content": "Build a self-playing Asteroids game as a single HTML file, no libraries."}]
  }'

Change "model": "fugu-ultra" to "model": "glm-5-2" and run it again. That is the whole point of EmpirioLabs: every frontier model behind one API, so you can compare them on your own prompts without rewiring anything. You can also run both side by side in the playground.
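The same swap can be scripted. Here is a minimal Python sketch that uses only the standard library and sends one prompt to both model ids. The endpoint, model ids, and response shape follow the Chat Completions convention described above. The helper names are ours, and we assume the endpoint accepts the standard `max_tokens` field for the 32,000-token cap. The reasoning-effort setting is left out because its parameter name is not given here.

```python
import json
import urllib.request

BASE_URL = "https://api.empiriolabs.ai/v1"
MODELS = ["fugu-ultra", "glm-5-2"]


def build_payload(model, prompt, max_tokens=32000):
    """One user message, no system prompt, no temperature override."""
    return {
        "model": model,
        "max_tokens": max_tokens,  # assumption: standard field name is accepted
        "messages": [{"role": "user", "content": prompt}],
    }


def build_request(model, prompt, api_key):
    """Build the POST request without sending it."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_content(response_json):
    """Pull the model's answer out of a Chat Completions response."""
    return response_json["choices"][0]["message"]["content"]


def run_both(prompt, api_key):
    """Send the same prompt to each model and return {model id: answer}."""
    results = {}
    for model in MODELS:
        with urllib.request.urlopen(build_request(model, prompt, api_key)) as resp:
            results[model] = extract_content(json.load(resp))
    return results
```

Call `run_both(prompt, api_key)` with the key from your `EMPIRIOLABS_API_KEY` environment variable, then write each answer to its own `.html` file and open it in a browser. Only the `model` field differs between the two requests.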

Frequently asked questions

Which models were tested?

Fugu Ultra from Sakana AI and GLM 5.2 from Z.ai, both available on EmpirioLabs through one OpenAI-compatible API.

What were the five coding tasks?

A self-playing Asteroids game, a self-playing Pong game, a demoscene plasma effect, an infinite wormhole tunnel, and a hyperspace starfield warp. Each had to be a single self-contained HTML file with no external libraries.

Was anything edited or retried?

No. Each model got one shot per prompt and we rendered exactly what it returned. We kept the result whether it looked great or not.

Why does Fugu Ultra take longer?

Fugu Ultra is a multi-agent orchestration model with always-on reasoning. It runs multiple internal passes before answering, which trades speed for depth. GLM 5.2 answers in a single pass.

How do I switch between the two models?

Change one string. Both serve the OpenAI Chat Completions API at https://api.empiriolabs.ai/v1, so you set the model id to fugu-ultra or glm-5-2 and everything else stays the same.

Try it

Open the playground | Fugu Ultra model page | GLM 5.2 model page | Pricing
