[BUILD] dynamic workflows


When you try to hand a large task to Claude Code, what breaks first?

Jarred Sumner ported Bun from Zig to Rust last week.

960,000 lines of code, ported in six days, with 99.8% of the test suite still passing on the other side.

He didn't type most of it.

He wrote a spec, handed Claude the task, and let it run hundreds of agents in parallel, with two more agents assigned to argue against every file before it got committed.

That prompt pattern is what Anthropic shipped publicly today, packaged as Dynamic Workflows.

The technique itself has been doable for months, as long as you were willing to script the orchestration by hand.

As of today it's a toggle in the effort menu.

DevTools of the week

1. Vibedock: A macOS menu bar app that lets you toggle Claude Code MCP servers on and off per project without touching a config file. Every active MCP eats context whether you use it or not, so switching off the ones a task doesn't need trims your token bill on every message.

2. Integuru: Turns a website into a usable API by reverse-engineering the HTTP calls behind it, so you can integrate with platforms that never shipped a public one. You give it a URL, describe what you need in plain English, and it generates the requests and handles the auth, cookies and 2FA included, with no browser automation in the loop.

3. Archi-Flow: A cloud architecture diagramming tool that runs a live traffic simulation through whatever you draw, showing latency, error rates, and which nodes tip into overload as load climbs. It turns the usual static box-and-arrow diagram into something you can stress-test, so you see where a design buckles before you build it.

The Spec Behind The Bun Rewrite

Before any of the agents ran, Jarred wrote a 600-line document.

It mapped every Zig type and idiom in the codebase to its Rust equivalent, with the rules for how each one should translate.

That document was the spec, and Claude couldn't port what it didn't understand.

The workflow ran exactly what the spec laid out, just across the whole codebase at once.
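To make the idea of a translation spec concrete, here is a minimal sketch of how mapping rules like these could be held as data and rendered into text for an agent prompt. The `TranslationRule` class, the `render_spec` helper, and the three entries are all hypothetical: they are generic Zig-to-Rust correspondences chosen for illustration, not lines taken from Jarred's document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationRule:
    zig: str   # construct as it appears in the Zig source
    rust: str  # Rust construct it should become
    note: str  # rule the agent should follow when translating


# Illustrative entries only (assumption): generic Zig/Rust correspondences,
# not quoted from the actual Bun spec.
RULES = [
    TranslationRule("?T", "Option<T>", "Optionals map directly."),
    TranslationRule("E!T", "Result<T, E>", "Error unions become Result."),
    TranslationRule("[]const u8", "&[u8]", "Borrowed byte slices stay borrowed."),
]


def render_spec(rules):
    """Render the rules as plain-text lines that a prompt can include."""
    return "\n".join(f"{r.zig} -> {r.rust}: {r.note}" for r in rules)


spec_text = render_spec(RULES)
```

Keeping the rules as data rather than loose prose means every agent receives the same wording, which is the property the spec exists to provide.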

Most people will see "Dynamic Workflows" in the effort menu, type a vague task, and wonder why the output came back a mess.

The workflow is only ever as sharp as the spec feeding it.

The pattern, step by step:

What Changed Under The Hood

Every Claude Code session until today ran on one agent and one context window.

Each tool call flows its result back into that same context, and the agent uses everything it's accumulated to decide the next move.

That works fine for a single file and starts to wobble the moment you're touching a whole feature.

The context window is the bottleneck.

Every intermediate result adds weight, so by file 40 of a 200-file job Claude is dragging 39 files' worth of context into every decision.

The work gets slower and sloppier the deeper it goes, and the token bill climbs right along with it.
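A rough model shows how quickly that drag adds up. The sketch below assumes every file contributes the same number of tokens and that nothing is ever compacted or dropped from context, which is a simplification rather than how Claude Code actually manages its window. The figure of 2,000 tokens per file is an assumption picked only to make the arithmetic visible.

```python
def files_carried(step: int) -> int:
    """Earlier files sitting in context when the agent reaches file `step`."""
    return step - 1


def sequential_context_tokens(files: int, tokens_per_file: int) -> int:
    """Tokens read in total when one agent carries every earlier file forward.

    At step k the context holds the k-1 earlier files plus the current one.
    """
    return sum(k * tokens_per_file for k in range(1, files + 1))


def isolated_context_tokens(files: int, tokens_per_file: int) -> int:
    """Tokens read in total when each file is handled in a fresh context."""
    return files * tokens_per_file


TOKENS_PER_FILE = 2_000  # assumed average, for illustration only

carried_at_40 = files_carried(40)
single_agent_total = sequential_context_tokens(200, TOKENS_PER_FILE)
fresh_context_total = isolated_context_tokens(200, TOKENS_PER_FILE)
```

Under these assumptions the single agent reads 40.2 million tokens over a 200-file job, against 400,000 when each file gets its own context. The first number grows with the square of the file count and the second grows linearly.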

Dynamic Workflows pulls the plan out of the context window and into code.

Claude writes a JavaScript orchestration script that decides what to spawn and in what order.

Each subagent runs in its own fresh context, and the intermediate state lives in script variables.

A 200-file job that used to run as 200 sequential steps now runs as 200 agents at the same time, with the orchestrator gathering what they return.
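The shape of that fan-out can be sketched in a few lines. The script Claude writes is JavaScript; the version below uses Python's `asyncio` purely to illustrate the same structure. `run_subagent` is a hypothetical stand-in for a real subagent call, and the `max_parallel` cap is an assumption, not a documented setting.

```python
import asyncio


async def run_subagent(path: str) -> dict:
    """Hypothetical stand-in for one subagent working in its own fresh context."""
    await asyncio.sleep(0)  # placeholder for the real agent call
    return {"file": path, "status": "ported"}


async def orchestrate(paths: list[str], max_parallel: int = 50) -> list[dict]:
    """Fan out one subagent per file and gather what they return."""
    limit = asyncio.Semaphore(max_parallel)  # assumed cap on concurrent agents

    async def bounded(path: str) -> dict:
        async with limit:
            return await run_subagent(path)

    # Intermediate state lives in this list, not in any agent's context.
    return await asyncio.gather(*(bounded(p) for p in paths))


paths = [f"src/file_{i}.zig" for i in range(200)]  # hypothetical file names
results = asyncio.run(orchestrate(paths))
```

`asyncio.gather` returns results in the order the tasks were passed in, so the orchestrator can match each result to its file without any agent knowing about the others.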

How To Turn It On

Two ways in.

Manual.

Put the word "workflow" anywhere in your prompt, or just say "create a workflow to…".

Claude plans it, shows you the script, and waits for you to confirm.

Use this when you have one specific large task and want eyes on the plan before it commits to anything.

Automatic.

The ultracode command sets effort to xhigh and lets Claude decide on its own when a task is big enough to deserve a workflow.

Good for open-ended heavy sessions where most of what you're doing is substantial anyway.

If you're watching cost, stay on manual.

Leave ultracode running all day and Claude will start spinning up workflows for tasks that never needed one.

Dynamic Workflows is on by default on Max and Team plans, while Enterprise keeps it off until an admin turns it on.

When to reach for it

It earns the cost on codebase-wide audits and security sweeps, where you want every finding checked independently before it reaches the report.

Same for a big migration with both ends well defined, like a framework swap or a language port.

The other case is anything critical enough that you'd want two separate cracks at it, with the agents tearing into each other's output while you keep only what holds up.
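As a sketch of that "keep only what holds up" gate: a candidate survives only if no reviewer raises an objection. In a real workflow the reviewers would be agents; here they are toy functions, and every name in the block (`survives_review`, `flags_todo`, `flags_unwrap`, the sample candidates) is hypothetical.

```python
from typing import Callable, Optional

# A reviewer returns an objection as text, or None if it has nothing to raise.
Reviewer = Callable[[str], Optional[str]]


def flags_todo(code: str) -> Optional[str]:
    """Toy reviewer: objects to unfinished work markers."""
    return "unfinished TODO" if "TODO" in code else None


def flags_unwrap(code: str) -> Optional[str]:
    """Toy reviewer: objects to unchecked unwrap() calls."""
    return "unchecked unwrap()" if ".unwrap()" in code else None


def survives_review(candidate: str, reviewers: list[Reviewer]) -> bool:
    """Keep a candidate only when every reviewer comes back empty-handed."""
    return all(review(candidate) is None for review in reviewers)


# Hypothetical candidate outputs keyed by file name.
candidates = {
    "a.rs": "let n = parse(input)?;",
    "b.rs": "let n = parse(input).unwrap();",
    "c.rs": "// TODO: handle errors",
}
kept = [
    name
    for name, code in candidates.items()
    if survives_review(code, [flags_todo, flags_unwrap])
]
```

Only `a.rs` passes both reviewers here. The point of the structure is that a commit requires unanimous silence from the critics, not a majority vote.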

Skip it for anything a single focused agent can finish cleanly, or any task where you want to approve each step as it goes.

And don't trigger one before the spec exists.

That's how you burn eighty dollars of tokens on a plan you're going to throw away.

What It Costs

Anthropic flagged this directly in the launch post: a workflow burns a lot more tokens than a normal session.

A complex Claude Code session runs maybe 100K to 300K tokens.

A workflow with 100 subagents going for a few hours can run 10 to 50 times that.

At Opus 4.8 rates of $5 per million input tokens and $25 per million output tokens, a single heavy run adds up fast.
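To put rough numbers on that, the sketch below multiplies out the ranges quoted above. The split between input and output tokens is not given anywhere, so the 20% output share is an assumption, and the model ignores anything like caching discounts. Treat the result as an order-of-magnitude estimate, not a quote.

```python
INPUT_USD_PER_MTOK = 5.0    # rate quoted above
OUTPUT_USD_PER_MTOK = 25.0  # rate quoted above
OUTPUT_SHARE = 0.2          # assumed fraction of tokens that are output


def run_cost_usd(total_tokens: int, output_share: float = OUTPUT_SHARE) -> float:
    """Estimate the cost of a run from its total tokens and output share."""
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    cost = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return cost / 1_000_000


# Smallest case: a 100K-token session at 10x. Largest: 300K tokens at 50x.
low_tokens = 100_000 * 10
high_tokens = 300_000 * 50

low_cost = run_cost_usd(low_tokens)
high_cost = run_cost_usd(high_tokens)
```

With that assumed split, the range works out to roughly $9 for a 1-million-token run and $135 for a 15-million-token run. A larger output share pushes both figures up, since output tokens cost five times as much.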

The Limits

It's a research preview, so the behavior isn't settled.

The orchestration logic and the way it spends tokens can both shift before it reaches GA, which is reason enough to keep it away from production automation for now.

One thing buried in the system card: Opus 4.8's resistance to prompt injection slipped a little compared to 4.7.

If your workflow is reading untrusted input or pulling in external data, write the guardrails into the spec and don't assume the model has it covered.

And the failure mode is quieter than you're used to.

A normal session goes wrong fast and loud, so you catch it and move on.

A workflow can be 50 files deep before anything looks off.

The plan preview is the only early warning you get, which is the whole reason to read it.

My Take

Everyone's going to credit the spec, but the test suite is what carried it.

Jarred's document told Claude how to translate Zig into Rust, and the tests caught it every time it slipped, across all 960,000 lines, while he barely read the output.

Without that suite he'd have a clean-looking port full of breakage he couldn't find.

So who gets to do this comes down to who already has the coverage.

A spec you can write today. Test coverage takes years, and you either started building it or you didn't.

The teams about to get serious leverage on their old code already wrote those tests, back when it just looked like discipline.

Everyone else has a great model and nothing safe to point it at.

You picked which group you're in years ago, by whether you bothered writing tests.

Until next time,
Vaibhav 🤝🏻
