Helmut · Tausend Euro

How it started

It wasn't even a rap project.

It started as a Schlager idea: "a German song about today's news." A tooling-research chat turned into a genre pivot, a casting round, and finally a whole pipeline.

Spark · Research

"A German Schlager about today's news"

The first question was pure tooling research: which AI tools for text → music → video? Answer stack: news & lyrics right in the chat, Suno for the song, Veo/Kling/Seedance for video, FFmpeg for the cut.

"I'm thinking about creating a german schlager song about today's news with ai tools, potentially including a video."

Subject · The News

The collapsed €1000 energy rebate

Picked from the day's headlines: the Bundesrat blocks the promised tax-free €1000 rebate. A perfect Schlager arc — anticipation → disappointment → we dance anyway. "Everyone has an electricity bill."

⚡ The Pivot · Genre Switch

Schlager → late-90s Deutschrap

The decisive creative leap. First Schlager-pop, then the voice trimmed to a "gentle-giant baritone" — then the full genre pivot to Fanta-4-/Fettes-Brot-/Beginner boom-bap, 90 BPM, rapped verses with a sung hook. Lyrics and Suno prompt rewritten from scratch.

"and now as a german rap style of late 90s early 2000s"

Casting · The Character

A Yeti named Helmut

Four mascots evaluated. The Yeti wins for two reasons: the cold metaphor (a Yeti freezing because of his electricity bill = peak Schlager self-irony) and the AI trick — white fur on any background = high contrast = models keep him consistent. The name "Helmut": a warm German everyman-uncle name that carries both framings — Schlager Heimat vibe and Deutschrap blue-collar.

Trick · Continuity

The edelweiss medallion

The genre pivot swapped the wardrobe entirely: Trachtenjanker → oversized burgundy hoodie + DJ headphones. To keep it the same Helmut, a tiny edelweiss medallion travels along as a signature — "still Helmut, just remixed."

Architecture · Pipeline

Driven from Claude Code

Final research step: how do you drive this from Claude Code? Result — an MCP gateway for the models, scenes.json as single source of truth, scripts for generate / lip-sync / compose. Exactly that structure is in the repo today.
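How scenes.json could act as the single source of truth, as a minimal sketch: the field names (id, start, end, prompt) and the two example entries are assumptions for illustration, not the schema or content in the repo. The idea is that every script reads the same file, so a timing check only has to live in one place.

```python
import json

# Hypothetical schema and placeholder entries; the real scenes.json may differ.
EXAMPLE_SCENES_JSON = """
[
  {"id": "scene_a01", "start": 0.0, "end": 12.5, "prompt": "placeholder prompt 1"},
  {"id": "scene_a02", "start": 12.5, "end": 26.0, "prompt": "placeholder prompt 2"}
]
"""

def load_scenes(text):
    """Parse scenes and check that they tile the timeline without gaps or overlaps."""
    scenes = json.loads(text)
    for s in scenes:
        if s["end"] <= s["start"]:
            raise ValueError(f"{s['id']} has a non-positive duration")
    for prev, cur in zip(scenes, scenes[1:]):
        if abs(prev["end"] - cur["start"]) > 1e-6:
            raise ValueError(f"gap or overlap between {prev['id']} and {cur['id']}")
    return scenes

scenes = load_scenes(EXAMPLE_SCENES_JSON)
# Per-scene clip length in seconds, as a generate script might request it.
durations = {s["id"]: s["end"] - s["start"] for s in scenes}
```

The generate, lip-sync and compose scripts would then all call something like load_scenes instead of carrying their own timing tables.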

CAST

🧊

Helmut the Yeti

Cold metaphor + consistency hack. Instantly adoptable.

🧙

Garden gnome "Gernot"

Maximally German — but proportions tricky for AI video.

🐻‍❄️

Polar bear "Eberhard"

Same cold logic, but the Knut trope is used up.

🌭

Caretaker dachshund

Very German — dogs drift harder in AI than blob creatures.

Creative DNA

What fed what

Topical, musical, linguistic — the influences behind every line.

📰 TOPICAL · HEADLINES FROM MAY 12, 2026

HEADLINE · Bundesrat blocks the tax-free €1000 energy rebate; coalition committee meets that evening

→

central narrative arc · "die Tausend bleiben Eis" ("the thousand stay ice")

HEADLINE · Merz's reform agenda stuck in federal gridlock

→

"Friedrich Merz hat 'nen Plan, doch der Plan hat 'nen Plan" ("Friedrich Merz has a plan, but the plan has a plan")

HEADLINE · SPD / CSU / CDU committee squabbling

→

"reden, reden, reden — doch am Ende kein Beschluss" ("talk, talk, talk, but in the end no decision")

REJECTED · Cybercrime / AI driver · doctors' congress protests

weaker Schlager arc — no "dance anyway"

🎧 MUSICAL FAMILY TREE

SCHLAGER · DRAFT 1 · REJECTED

Helene Fischer · pop polish
Mickie Krause · beer-tent energy
Andreas Gabalier · folk-rock baritone
Hansi Hinterseer · alpine warmth

⚡ GENRE PIVOT ⚡

DEUTSCHRAP · DEVELOPED

★ Fanta 4 — "MfG" · bureaucracy-satire template
★ Fettes Brot — "Jein" · narrative + melodic hook
Beginner · Hamburg boom-bap, Rhodes, upright bass
Freundeskreis — "ANNA" · jazzy bridge
Eins Zwo · dense internal rhyme
Samy Deluxe · sharper sound option
Smudo / Thomas D · König Boris · vocal character

Suno filters real artist names inconsistently → the prompt translates them into sonic descriptors: "jazzy boom-bap, 90 BPM dusty drums, mellow Rhodes, warm upright bass, vinyl crackle, scratches."
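The name-to-descriptor translation described above can be sketched as a simple substitution pass. This is an illustration only: the mapping reuses the wording from the family tree, while the function name, the draft prompt and the idea of doing it with a regex are assumptions, not how the project's prompt was actually produced.

```python
import re

# Illustrative mapping: descriptors are taken from the family tree above,
# but pairing them up as a lookup table is an assumption.
DESCRIPTORS = {
    "Beginner": "Hamburg boom-bap, Rhodes, upright bass",
    "Freundeskreis": "jazzy",
    "Eins Zwo": "dense internal rhyme",
}

def strip_artist_names(prompt):
    """Replace each known artist name with its sonic descriptors."""
    for name, sound in DESCRIPTORS.items():
        prompt = re.sub(re.escape(name), sound, prompt, flags=re.IGNORECASE)
    return prompt

# Hypothetical draft prompt that still contains artist names.
draft = "90 BPM, Beginner beat, Freundeskreis bridge, Eins Zwo verses"
clean = strip_artist_names(draft)
print(clean)
```

Because the filter is described as inconsistent, removing every name up front is the safer default than testing which ones happen to pass.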

🗣️ LANGUAGE FORENSICS

„Pustekuchen"

Peak boomer brush-off ("no chance") — fits the bureaucracy mockery.

„Tinnitus"

Ringing, then gone — abstract Eins-Zwo/Dendemann wordplay.

„Digga / Bruder"

Hamburg/Berlin rap-era slang — authenticity markers.

Slang × Beamtendeutsch

"Pustekuchen" + "Föderalismus" (bureaucratese) in one verse = a Fanta-4/Fettes-Brot signature.

The Pipeline

From news ticker to music video

Every step driven from Claude Code — Atlas Cloud as the primary model gateway, fal.ai for lip-sync.

📝

Concept & Lyrics

Story, rhyme, timing mapped to 90 BPM.

creative/lyrics.lrc

🎵

Suno Track

Deutschrap beat, 3:53, hand-curated.

tausend_euro.mp3

🧊

Helmut Reference

8 variants + 3 angles locked.

nano-banana 2

🎞️

21 Scenes

Ref-to-video, 1 take each.

Seedance 2.0 · Atlas

👄

Lip-Sync (trialed)

Experimentally tested only — not in the final cut.

Hedra · fal.ai

🎚️

FFmpeg Cut

On the beat, VHS grade, 1080p.

preview.mp4
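The final FFmpeg step could look roughly like the sketch below, which only builds the inputs and does not run anything. The clips/ directory, the clips.txt list file and the helper names are assumptions; the VHS grade and the beat-accurate trimming are left out, so this covers just joining 21 scene clips, laying the song underneath and scaling to 1080p.

```python
import shlex

def concat_list(clip_paths):
    """Text for FFmpeg's concat demuxer: one "file '<path>'" line per clip."""
    return "".join(f"file '{p}'\n" for p in clip_paths)

def ffmpeg_cut_command(list_file, audio, output):
    """argv that joins the listed clips, uses the song as audio, and scales to 1080p."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_file,  # input 0: joined clips
        "-i", audio,                                     # input 1: the song
        "-map", "0:v:0", "-map", "1:a:0",               # video from clips, audio from song
        "-vf", "scale=1920:1080",
        "-c:v", "libx264", "-c:a", "aac",
        "-shortest", output,
    ]

# Hypothetical clip locations; only the scene_a01..a21 naming comes from the text.
clips = [f"clips/scene_a{n:02d}.mp4" for n in range(1, 22)]
listing = concat_list(clips)  # would be written to clips.txt before running FFmpeg
cmd = ffmpeg_cut_command("clips.txt", "tausend_euro.mp3", "preview.mp4")
print(shlex.join(cmd))
```

Paths containing single quotes would need escaping in the list file; the sketch assumes plain file names.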

The Lab

Most of it was experiment

The concept doc was written before we touched the API. Almost every assumption was wrong on first contact. A field log:

LOG 01 · MODEL BAKE-OFF: VEO → KLING → SEEDANCE

Three models tried. Veo 3.1 drifted (the face went humanoid) and actually cost $0.20/s instead of the assumed $0.03, so it was abandoned. Kling o3 Pro held the character (16 clips, 3–4 takes per scene) but was pricey. Seedance 2.0 became the production model: the only one that renders legible German text on CRT screens, ID cards and bills.

Kling o3 Pro · v1 batch

Good consistency, $12.16 for 16 clips, 3–4 takes per scene — dropped in favor of Seedance.

Seedance 2.0 · production

Best German text rendering, one take per scene, 21 scenes final.

LOG 02 · CONCEPT vs. REALITY

✗ ASSUMPTION

Atlas catalog lives at /v1/models

✓ REALITY

That's only the OpenAI-compatible text route (105 models). The real catalog (313 models: Veo, Kling, Seedance…) is at /api/v1/models.

✗ ASSUMPTION

3:00 minutes, timings from the concept

✓ REALITY

Suno delivered 3:53. All scenes re-timed against the real LRC boundaries — v1 (scene_01–09) → v2 (scene_a01–a21).

✗ ASSUMPTION

One reference is enough for consistency

✓ REALITY

With just 1 ref the face drifts humanoid/baboon-ish. Fix: 1 anchor + 3 angles (face/profile/back) + compact bible + hard negatives.

✗ ASSUMPTION

Catalog prices are accurate

✓ REALITY

Seedance bills token-metered ≈ 2.17× the catalog rate. Veo $0.20/s instead of $0.03. Budget raised after the lesson.
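Re-timing scenes against the real LRC boundaries, as in the second pair above, can be sketched like this. It assumes the standard [mm:ss.xx] LRC timestamp format and that each scene starts on a lyric timestamp; the sample lines are placeholders, not the contents of creative/lyrics.lrc, and the function names are made up for the example.

```python
import re

LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")

def parse_lrc(text):
    """Return (seconds, lyric) pairs from standard [mm:ss.xx] LRC lines."""
    entries = []
    for line in text.splitlines():
        m = LRC_LINE.match(line.strip())
        if m:  # metadata tags such as [ar:...] do not match and are skipped
            seconds = int(m.group(1)) * 60 + float(m.group(2))
            entries.append((seconds, m.group(3).strip()))
    return entries

def scene_windows(boundaries, track_end):
    """Turn boundary timestamps into (start, end) windows covering the track."""
    edges = sorted(boundaries) + [track_end]
    return list(zip(edges, edges[1:]))

# Placeholder lyrics and timestamps for illustration only.
sample = "[00:00.00] intro\n[00:12.50] verse line\n[03:40.00] outro"
starts = [t for t, _ in parse_lrc(sample)]
windows = scene_windows(starts, track_end=3 * 60 + 53)  # Suno delivered 3:53
```

Deriving the windows from the delivered track rather than from the concept doc is what makes a 3:00 plan survive a 3:53 song.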

LOG 03 · LIP-SYNC ATTEMPTS

No Atlas model auto-syncs to a provided audio track, so the trials went external: MuseTalk vs. sync v3 vs. Hedra Character-3 (fal.ai), plus custom audio analysis (MFCC template, voice and timbre charts) to verify alignment. Outcome: none of it went into production. The final cut uses Seedance's native rap-cadence mouth motion; true lip-sync stayed experimental.

voice chart — hook2 · vocal-onset analysis for cut points

timbre chart — hook2 · timbre analysis (Helmut template)

lipsync overlay — scene_a14 · lip-sync alignment overlay

Real sync outputs from the trials: one shows the classic MuseTalk problem (artifacts around the mouth and fur on stylized faces), while the v3-pass close-ups hold up as usable. Sound on for the sync impression:

MuseTalk · scene_a08 — artifacts, dropped

v3 sync · Helmut close-up — usable

v3 sync · Hook-2 close-up — usable

LOG 04 · HELGA — THE SECOND VOICE

The hook has a female harmony behind Helmut's lead. So the screen wouldn't show Helmut miming someone else's voice, Helga was added — a second Yeti character (red beanie, corduroy bomber, the same edelweiss medallion as a continuity anchor). But the moment two characters sing in one scene, you get two simultaneous lip-sync targets plus voice attribution — too error-prone with single-character sync already unsolved.

Helga · the second voice

Own character for the hook harmony. nano-banana, 6 variants + outfit tests.

Two characters, one scene

Double lip-sync + "who sings which line" — unstable without very short cuts & detailed planning.

Decision: Helga is removed for now; a Helmut-only final cut is good enough. Way forward: more thorough planning and shorter cuts, or simply different shots where Helmut is off screen during the second-voice lines (environment / B-roll), which removes the two-character problem entirely.

⚠ WAR STORY · GHOST SPEND

How a 900-second timeout burned ~$30

Seedance 2.0 occasionally takes 15–25 minutes for a 13–15s clip. The original poller gave up after 900s (15 min), and the retry loop then submitted a new generation (~$3 each), up to 4×. Server-side, the original prediction kept rendering and would have finished. Atlas has no cancel endpoint. Effect: ~$15 burned per scene that would have worked anyway. Scenes A13 and A14 were hit, hence the ~$30.

→ Fix: timeout raised to 1800s (30 min). Waiting is the only cancel option.
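A poller that follows this fix could look like the sketch below. It is not the project's script: the status strings, the get_status callback and the injectable sleep/clock parameters are assumptions made so the example runs without any API. The point it illustrates is that a timeout only reports back and never submits a second generation.

```python
import time

def wait_for_prediction(get_status, timeout_s=1800, interval_s=15,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll one already-submitted prediction; never resubmit on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()  # hypothetical: returns a status string
        if status in ("completed", "failed"):
            return status
        if clock() >= deadline:
            # The job may still be rendering server-side. Report the timeout
            # and keep the prediction id instead of paying for a new one.
            return "timed_out"
        sleep(interval_s)

# Usage with a fake status source, so nothing is actually called or slept.
fake = iter(["processing", "processing", "completed"])
result = wait_for_prediction(lambda: next(fake), sleep=lambda s: None)
```

On "timed_out" the caller can simply poll the same prediction again later, which matches "waiting is the only cancel option".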

The Tools

The Models — and our verdict

Which AI model did what, what it cost, and why it won or got cut. External figures as of May 2026; "verdict" = what actually happened in this project.

🎬 VIDEO

Veo 3.1 · DeepMind

Best prompt adherence, but Helmut's face drifted humanoid and it was the priciest option. Tested, dropped.

~$0.40/s standard · 8-s clips · still "Preview"

Kling 3.0 Pro · Kuaishou

Held the character well — the v1 path (16 clips, $12.16). Too costly at volume → replaced.

~$0.095/generation (via Atlas) · strong consistency

Seedance 2.0 · ByteDance

★ PRODUCTION

All 21 final scenes. The only model rendering legible German text on bills & CRTs.

~$0.14/s · bills ≈ 2.17× catalog · 15–25-min renders → ghost-spend

🖼️ IMAGE

Nano Banana 2 · DeepMind · Gemini 3.1 Flash Image

★ REFERENCES

All Helmut/Helga refs + cover, thumbnail, OG (CLI). Most legible in-image text.

~$0.067/image · refuses real-person photos → blocked the ESC composite

Imagen 4 Ultra · DeepMind

Text-to-image only, no reference input → couldn't lock Helmut, produced a generic yeti.

$0.06/image · max 2K · wrong tool for consistency

Seedream v4.5 · ByteDance

Multi-ref edit (up to 10): for the ESC special it took the real photo + Helmut where Gemini refused.

~$0.03–0.036/image · strong identity preservation

Grok Imagine · xAI

Also multi-ref with the real photo — gave the closest ESC costume silhouette of any editor.

$0.02/image std · $0.07 pro

👄 LIP-SYNC

Hedra Character-3 · Hedra

Benchmark for stylized faces — the reason fal.ai stays installed (Atlas doesn't host it). Trialed, never shipped.

credit sub ~$10/mo+ · per-second via fal.ai

MuseTalk 1.5 · Tencent · OSS

Artifacts around the mouth and fur on Helmut's face: the "Tiefpunkt" (low point) in the making-of. Rejected.

free (self-host) · 256×256 mouth region

sync.so v3 · sync. labs

The "sync v3" trials — the a14 close-ups were "usable", the best of the lip-sync attempts. Not final.

~$0.04–0.13/s by model · sync-3 = 4K

🎵 AUDIO

Suno v5.5 · Suno, Inc.

★ THE SONG

Made the track. Drove the Schlager→Deutschrap pivot + the 3 versions. No official API → manual.

v5.5 · 26 Mar 2026 · consumer subscription

ElevenLabs v3 · ElevenLabs

★ NARRATOR

The making-of narrator ("George", locked voice_id) with per-line emotion tags. Free tier blocks cloning → premade voice.

v3 GA 14 Mar 2026 · Starter $5/mo unlocks cloning

🛰️ GATEWAYS

Atlas Cloud · primary gateway

~95% of all generations. Gotchas: the real catalog is /api/v1/models; no cancel endpoint → ghost-spend; catalog prices are lower bounds.

300+ models · ~30–54% cheaper than fal.ai (vendor-stated)

fal.ai · secondary

Kept only for Hedra lip-sync — everything else routes through Atlas (cheaper).

1000+ models · per-output or GPU-second

Sources & detail: MODELS.md in the repo. Empirical figures (≈2.17×, $12.16, …) from cost_log.csv.
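For budgeting, the empirical figures above reduce to two small calculations. The sketch below is hedged arithmetic rather than the project's cost tooling: the function name and the example catalog rate are placeholders, while $12.16 for 16 clips and the ≈2.17× factor are the figures quoted in this write-up.

```python
def expected_bill(clip_seconds, catalog_rate_per_s, billing_factor=2.17):
    """Estimated charge for one clip when real billing exceeds the catalog rate."""
    return clip_seconds * catalog_rate_per_s * billing_factor

# Kling v1 batch: $12.16 for 16 clips, so the average cost per clip.
kling_per_clip = round(12.16 / 16, 2)

# Placeholder catalog rate of $0.10/s for a 10-second clip; substitute the
# current catalog figure, and treat it as a lower bound.
estimate = round(expected_bill(10, 0.10), 2)

print(kling_per_clip, estimate)
```

Planning with the billed figure instead of the catalog figure is the practical consequence of "catalog prices are lower bounds".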
