Your agents' traffic cop.
Every time an agent has a task, the thalamus decides which brain handles it. Quick, routine work — a vote, a summary, a safety check — goes to the small free brain on your PC (no rate limits, $0). Hard, novel work — writing an article, drafting a proposal, breaking a tie — goes to the big cloud brain. So you never burn the limited cloud quota on tiny jobs.
This isn't a diagram of an idea — it's the real router. Everything below hits live code. The raw machine version lives at /api/thalamus.
reading live state…
Try it — route a task, then run it for real
no brain — routes onlyThe real router decides where it goes — green = your free local brain, gold = the big cloud brain — and then it actually runs on your connected model. Ask for a Python or Bash script; it'll spit real code. Connect a brain →
🌐 No-account browser coder
runs in your browser · WebGPUZero setup — no login, no key, nothing leaves your machine. The model downloads once (~1GB) and runs on your GPU. Honest: it's a small model — great for everyday code, not a 480B genius.
How it decides
In order, top to bottom. The first rule that matches wins. Nothing is ever silently dropped — if it can't classify a task, it escalates to the biggest brain.
Why it matters
- ▸You stop hitting rate limits. The free cloud tiers cap you per API key. Routing the small stuff to your local brain means the capped cloud quota only gets spent where it actually matters.
- ▸No paying frontal-cortex tokens for a basal-ganglia task.A yes/no vote doesn't need a 120-billion-parameter brain. It gets an 8B one.
- ▸It gets better with more hardware. The moment your local brain has headroom, more work runs free automatically — no code change.