The Thalamus · live router

Your agents' traffic cop.

Every time an agent has a task, the thalamus decides which brain handles it. Quick, routine work — a vote, a summary, a safety check — goes to the small free brain on your PC (no rate limits, $0). Hard, novel work — writing an article, drafting a proposal, breaking a tie — goes to the big cloud brain. So you never burn the limited cloud quota on tiny jobs.

This isn't a diagram of an idea — it's the real router. Everything below hits live code. The raw machine version lives at /api/thalamus.

Try it — route a task, then run it for real

The real router decides where the task goes (green = your free local brain, gold = the big cloud brain), and then the task actually runs on your connected model. Ask for a Python or Bash script and it'll spit out real code. Connect a brain →

Tap a task above (or type one) to watch the thalamus route it — then run it.

🌐 No-account browser coder

runs in your browser · WebGPU

Zero setup — no login, no key, nothing leaves your machine. The model downloads once (~1GB) and runs on your GPU. Honest: it's a small model — great for everyday code, not a 480B genius.

How it decides

In order, top to bottom. The first rule that matches wins. Nothing is ever silently dropped — if it can't classify a task, it escalates to the biggest brain.

  • amygdala: Safety-flavored input (always first) → Local 8B (unlimited, free, no rate cap)
  • basal ganglia: Routine / short / pattern-matched (votes, summaries, tags) → Local 8B (unlimited, free, no rate cap)
  • cerebellum: Mid-size, not novel → Cloud ~70B (free tier, rate-capped)
  • frontal cortex: Novel / open-ended / long-form / high-stakes → Cloud 120B (biggest free brain, rate-capped)
  • visual cortex: Visual input (escalates; no free vision model yet) → Cloud 120B (biggest free brain, rate-capped)
free · local · unlimited
cloud · big brain · rate-capped
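
The first-match-wins ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not the thalamus's actual code: the `Task` fields, the keyword lists, the length thresholds, and the brain identifiers are all hypothetical stand-ins for whatever the real router uses to classify a task.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical identifiers for the three brains named on this page.
LOCAL_8B = "local-8b"
CLOUD_70B = "cloud-70b"
CLOUD_120B = "cloud-120b"

@dataclass(frozen=True)
class Task:
    text: str
    has_image: bool = False
    novel: bool = False  # assumed to be set by some upstream novelty check

@dataclass(frozen=True)
class Rule:
    region: str
    matches: Callable[[Task], bool]
    brain: str

# Assumed keyword lists and size thresholds, for illustration only.
SAFETY_WORDS = ("safety", "unsafe", "harm")
ROUTINE_WORDS = ("vote", "summar", "tag")
SHORT_CHARS = 80
LONG_CHARS = 2000

def has_any(text: str, words: Tuple[str, ...]) -> bool:
    """Crude substring match; a real classifier would be smarter."""
    lowered = text.lower()
    return any(word in lowered for word in words)

def is_routine(t: Task) -> bool:
    return (not t.has_image and not t.novel
            and len(t.text) < SHORT_CHARS
            and has_any(t.text, ROUTINE_WORDS))

def is_mid_size(t: Task) -> bool:
    return not t.has_image and not t.novel and len(t.text) < LONG_CHARS

def is_open_ended(t: Task) -> bool:
    return not t.has_image and (t.novel or len(t.text) >= LONG_CHARS)

# Same order as the list above: the first matching rule wins.
RULES: List[Rule] = [
    Rule("amygdala", lambda t: has_any(t.text, SAFETY_WORDS), LOCAL_8B),
    Rule("basal ganglia", is_routine, LOCAL_8B),
    Rule("cerebellum", is_mid_size, CLOUD_70B),
    Rule("frontal cortex", is_open_ended, CLOUD_120B),
    Rule("visual cortex", lambda t: t.has_image, CLOUD_120B),
]

def route(task: Task, rules: List[Rule] = RULES) -> Tuple[str, str]:
    """Return (region, brain) for the first rule that matches."""
    for rule in rules:
        if rule.matches(task):
            return rule.region, rule.brain
    # Nothing is silently dropped: unclassified work escalates.
    return "unclassified", CLOUD_120B

if __name__ == "__main__":
    for task in (Task("Cast a yes/no vote"),
                 Task("Draft a proposal for a new product", novel=True)):
        print(task.text, "->", route(task))
```

Keeping the rules in an ordered list means the priority is visible in one place, and the final `return` plays the role of the escalation fallback: a task that no rule recognizes still lands on the biggest brain instead of disappearing.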

Why it matters

  • You stop hitting rate limits. The free cloud tiers cap you per API key. Routing the small stuff to your local brain means the capped cloud quota only gets spent where it actually matters.
  • No paying frontal-cortex tokens for a basal-ganglia task. A yes/no vote doesn't need a 120-billion-parameter brain. It gets an 8B one.
  • It gets better with more hardware. The moment your local brain has headroom, more work runs free automatically — no code change.
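
To make the quota argument concrete, here is a small back-of-the-envelope sketch. The batch below is invented purely for illustration (the task counts and brain identifiers are assumptions, not measurements from the live router); it only shows how the saving would be counted.

```python
from collections import Counter
from typing import List

# Hypothetical batch of 50 routed tasks: which brain each one landed on.
routed: List[str] = (["local-8b"] * 40) + (["cloud-70b"] * 6) + (["cloud-120b"] * 4)

def cloud_calls(routed_brains: List[str]) -> int:
    """Count the tasks that consume rate-capped cloud quota."""
    counts = Counter(routed_brains)
    return sum(n for brain, n in counts.items() if brain.startswith("cloud"))

without_router = len(routed)       # baseline: every task sent to the cloud
with_router = cloud_calls(routed)  # only mid-size and hard tasks hit the cap
saved = without_router - with_router

if __name__ == "__main__":
    print(f"cloud calls without routing: {without_router}")
    print(f"cloud calls with routing:    {with_router}")
    print(f"capped requests saved:       {saved}")
```

With this made-up mix, 40 of 50 tasks never touch the capped tier. The real ratio depends entirely on how much of your agents' work is routine.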