HighLvl
An AI nutrition coach — designed, built, and shipped to TestFlight by one designer directing a fleet of AI agents in under a week. Now on iOS and Android.
One Saturday HighLvl was a thought. Six days later it was a real iOS app on TestFlight — a polished consumer product, built and shipped by one person. The app is the proof. The how is the story: a single designer directing a fleet of AI agents that never stop working.
Premise.
I've spent my life making weight — powerlifting, combat sports, now jiu-jitsu. Daily nutrition tracking isn't optional when you compete, and it's the highest-leverage habit for anyone. But every tracker out there is either expensive for what it does or too clunky to stick with. I kept thinking about the younger athletes at my gym who can't spare $30 a month for a coach but would benefit most from one.
So I set out to build a genuinely good one — and to use it as a forcing function for a harder question: can one designer take a real, polished consumer app from idea to a shipped iOS beta in a single week? Not a prototype. A product I'd use over the three trackers already on my phone.

Two agents who argue with me.
I run two custom harness agents. Dave, a Hermes agent, handles planning and project management. Shawn, an OpenClaw agent, handles execution. Both are explicitly told to play devil's advocate — every idea I have gets argued out before a single line of code exists. They push back, tell me where I'm wrong, and propose what they'd do instead.
Once we agree, Dave writes the tickets in Linear with clear acceptance criteria, and those criteria ladder straight up to a /goal, skill, or workflow prompt. So by the time a ticket exists, the scaffolded prompt is already attached to it. When I open Claude Code or Codex, I paste it in and go. No cold start, because we already settled what “done” means.
Then we split the work. Shawn takes tickets and executes around the clock. I take the ones I want to build myself.
![A Linear ticket — '[hermes-planned] Macro: Repo scaffolding + SwiftUI shell' — with a Deliverables checklist (GitHub repo, Xcode project, folder structure, Core Data stack, README) above a '/goal prompt for Claude Code' code block that spells out the exact scaffolding steps, so the prompt is attached before any work starts.](/_next/image?url=%2FHighLvl%2Fhighlvl--linear-prompt.png&w=3840&q=75)
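To make the ticket-to-prompt step concrete, here is a minimal Python sketch of how a deliverables checklist could be turned into a /goal prompt. The `Ticket` class and `build_goal_prompt` function are hypothetical names for illustration; this is not Linear's schema or the actual harness code, only the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ticket:
    """Hypothetical stand-in for a planned ticket; not Linear's real schema."""
    title: str
    deliverables: List[str] = field(default_factory=list)


def build_goal_prompt(ticket: Ticket) -> str:
    """Render a /goal prompt whose checklist mirrors the ticket's deliverables."""
    if not ticket.deliverables:
        # A ticket without criteria has no agreed definition of "done".
        raise ValueError("ticket needs at least one deliverable")
    lines = [f"/goal {ticket.title}", "", "Done means all of the following exist:"]
    lines += [f"- [ ] {item}" for item in ticket.deliverables]
    return "\n".join(lines)


# Deliverables taken from the ticket shown in the screenshot above.
ticket = Ticket(
    title="Macro: Repo scaffolding + SwiftUI shell",
    deliverables=[
        "GitHub repo",
        "Xcode project",
        "Folder structure",
        "Core Data stack",
        "README",
    ],
)
prompt = build_goal_prompt(ticket)
print(prompt)
```

Because the prompt is derived from the same checklist as the ticket, the definition of "done" cannot drift between the two.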
The rhythm never stops.
I keep multiple Claude Code and Codex instances running on my MacBook around the clock, connected remotely to their apps on my phone. At night, I hand out goal and workflow prompts before bed. In the morning, I review what got built overnight, course-correct, and send everything off again before work. After work, I review and steer once more.
Shawn and Dave manage and execute the project 24/7. I'm the bottleneck for taste and direction — not for hours in the chair. All of it runs on personal hardware, on its own network, nights and weekends only.

Where the human stays.
Design is the part that still doesn't outsource cleanly. Even with Raven — my design-system MCP — doing real heavy lifting, judgment and creativity still come down to a person deciding. My move: have whatever agent I'm working with generate three to nine variants of a screen or component, then I pick and combine the best pieces and push the result into Figma to iterate from there.
Which model you reach for matters more than people think. Claude Code on Opus 4.8 is far better at generating something coherent from scratch and spinning up multiple strong versions fast. Codex on GPT-5.5 is far better at taking an exact layout (from Figma, another project, or a web page) and reproducing it faithfully in a Next.js app. Different tools for different jobs. Knowing which to reach for is most of the speed.

What shipped.
HighLvl is a clean, fast nutrition coach that learns you — now on iOS and Android, both driven by the same multi-model brain. It reads Apple Health, Health Connect, and Whoop, so its advice is grounded in your real day: how you recovered, what you actually burned. And when you log a meal, it doesn't estimate the numbers — it looks them up live against USDA FoodData Central and Open Food Facts, then shows its work, right down to a branded "composing" state that surfaces the real steps (reading your recovery, searching micronutrients, establishing macros) instead of a blank spinner. It learns your food preferences and training schedule, and every morning it hands you a personalized plan card. No watch or Whoop required; more data just makes it sharper.
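The "look it up, then show your work" behavior can be sketched roughly as follows. This is an assumption-heavy illustration: `FoodRecord`, `FAKE_DATABASE`, and `log_meal` are hypothetical names, the in-memory table stands in for the live USDA FoodData Central and Open Food Facts lookups, and the nutrient numbers are placeholders rather than real database values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FoodRecord:
    """Per-100 g values in the shape a lookup might return (illustrative)."""
    name: str
    source: str  # which database answered
    kcal: float
    protein_g: float
    carbs_g: float
    fat_g: float


# Stand-in for a live lookup. Placeholder numbers, not real database values.
FAKE_DATABASE = {
    "chicken breast": FoodRecord(
        "chicken breast", "USDA FoodData Central", 160.0, 30.0, 0.0, 4.0
    ),
    "granola bar": FoodRecord(
        "granola bar", "Open Food Facts", 450.0, 8.0, 64.0, 18.0
    ),
}


def log_meal(items: Dict[str, float]) -> dict:
    """Scale each looked-up record to its portion in grams and record each step."""
    totals = {"kcal": 0.0, "protein_g": 0.0, "carbs_g": 0.0, "fat_g": 0.0}
    steps = []
    for name, grams in items.items():
        record = FAKE_DATABASE.get(name)
        if record is None:
            # Refuse to guess: surface the miss instead of estimating.
            steps.append(f"no match for {name!r}; skipped")
            continue
        factor = grams / 100.0
        for key in totals:
            totals[key] += getattr(record, key) * factor
        steps.append(f"{name}: {grams:g} g from {record.source}")
    return {"totals": totals, "steps": steps}


result = log_meal({"chicken breast": 150, "granola bar": 40, "mystery stew": 200})
for step in result["steps"]:
    print(step)
print(result["totals"])
```

The `steps` list is the part a "composing" state could surface to the user: each line names the food, the portion, and which source supplied the numbers.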
It ships with three full themes — Humble Minimalist, Pure Execution, and Flashy — because personal style isn't one-size-fits-all. A tracker you open a dozen times a day has to be something you love to look at, not just tolerate. So whether your taste runs calm and bright, dark and all-business, or loud neon that glows in the dark, the entire app re-skins to match. Make it feel like yours and you'll actually keep using it — which is the only way any of this works.
The real product isn't the calorie counter. It's the multi-model, multi-agent orchestration underneath — blending multiple frontier models by intent to deliver genuinely personalized nutrition advice at the lowest possible cost, right on your phone. Entry tier under $5 a month, with a free bring-your-own-AI path. Built, used, and shipped in under a week by one person directing a small fleet of agents.
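Blending models by intent can be pictured as a small routing table. The sketch below is a guess at the shape, not the shipped implementation: the `Intent` labels, the intent-to-tier mapping, and the `route` function are all hypothetical, and the tier names are plain strings rather than real model identifiers.

```python
from enum import Enum
from typing import Optional


class Intent(Enum):
    """Hypothetical intent labels; the real taxonomy is not described here."""
    LOG_MEAL = "log_meal"
    DAILY_PLAN = "daily_plan"
    DEEP_COACHING = "deep_coaching"


# Assumed mapping: send each intent to the cheapest tier likely to handle it.
ROUTES = {
    Intent.LOG_MEAL: "haiku",
    Intent.DAILY_PLAN: "sonnet",
    Intent.DEEP_COACHING: "opus",
}

DEFAULT_TIER = "haiku"


def route(intent: Intent, bring_your_own_model: Optional[str] = None) -> str:
    """Pick a model tier; a user-supplied model overrides the table."""
    if bring_your_own_model:
        return bring_your_own_model
    return ROUTES.get(intent, DEFAULT_TIER)


print(route(Intent.LOG_MEAL))
print(route(Intent.DEEP_COACHING))
print(route(Intent.DAILY_PLAN, bring_your_own_model="my-local-model"))
```

Keeping the mapping in one table means a model swap touches one place, and the bring-your-own-AI path is just an override on the same function.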


What changed.
6 days
From a thought on Saturday to a real iOS app on TestFlight.
1 operator
One designer directing the fleet — taste and direction, not hours in the chair.
3 models
Haiku, Sonnet, and Opus blended by intent for personalized advice at the lowest cost.
< $5/mo
Accessible entry tier, plus a free bring-your-own-AI path.

The useful lesson wasn't about any single model. It was about where a human still has to stand. With a harness like this, the agents are no longer the constraint — taste, direction, and the call on what “good” means are. The bottleneck moved from hours to judgment, which is exactly where a designer should be.
The second thing is that the harness is the IP. Models change every quarter; swapping one out is a single-file change. The two-agent argument, the ticket-to-prompt scaffolding, the model-routing instincts, the nightly rhythm — that's the part that compounds. HighLvl is the first thing it shipped. It won't be the last.