"Claude orchestrates. Free AI does the heavy lifting. Git remembers everything."
Every AI task gets routed to the cheapest capable provider. Claude only handles what only Claude can do.
Simple tasks like classification, summarisation, and Q&A don't need Claude's intelligence; route them to free providers instead.
Cerebras is the fastest. Gemini handles million-token contexts. SambaNova runs the biggest open models. Use each for what it does best.
All usage is logged to stats/usage.json. A nightly report shows how many tokens were saved compared with running everything through Claude.
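As a rough illustration of the logging and savings report described above, here is a minimal sketch. The record schema (`provider`, `tokens`), the helper names, and the per-token price are all assumptions made for this example; the actual format of stats/usage.json may differ.

```python
import json
from pathlib import Path

# Hypothetical price, used only for illustration; substitute the rate you actually pay.
ASSUMED_CLAUDE_USD_PER_MILLION_TOKENS = 3.00


def log_usage(path, provider, tokens):
    """Append one usage record to a JSON list stored at `path` (assumed schema)."""
    path = Path(path)
    records = json.loads(path.read_text()) if path.exists() else []
    records.append({"provider": provider, "tokens": tokens})
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2))
    return records


def summarise(records, usd_per_million=ASSUMED_CLAUDE_USD_PER_MILLION_TOKENS):
    """Total the delegated tokens and estimate what they would have cost on Claude."""
    by_provider = {}
    for record in records:
        name = record["provider"]
        by_provider[name] = by_provider.get(name, 0) + record["tokens"]
    total = sum(by_provider.values())
    return {
        "tokens_by_provider": by_provider,
        "tokens_saved": total,
        "estimated_usd_saved": round(total / 1_000_000 * usd_per_million, 4),
    }
```

A nightly job could call `summarise(json.loads(Path("stats/usage.json").read_text()))` and print the result, assuming the file holds a list of records in the shape used here.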
The routing table lives in your global CLAUDE.md. Claude reads it and delegates automatically, with no extra steps.
| Provider | Model | Best for | Speed |
|---|---|---|---|
| Cerebras FREE | Llama 3.1 8B | Classification, scoring, yes/no | ~284ms |
| Groq FREE | Llama 3.1 8B | Factual Q&A, general | ~366ms |
| Gemini FREE | 2.5 Flash Lite | Summarisation, translation, 1M context | ~541ms |
| Mistral FREE | Mistral Small | Coding, creative writing | ~1172ms |
| SambaNova FREE | Llama 3.3 70B | Deep reasoning, analysis | ~1400ms |
| HuggingFace FREE | Llama 3 8B | Open-source fallback | ~924ms |
| Pollinations NO KEY | Flux | Image generation | varies |
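
The table above can be read as a lookup from task type to provider. The sketch below transcribes it into a Python dictionary. The task-type labels (other than `classification`, which appears in the quick-start command) and the choice to send unknown task types to the HuggingFace fallback are assumptions for illustration, not the project's actual routing logic.

```python
# Task type -> (provider, model), transcribed from the routing table.
# The task-type keys are hypothetical labels chosen for this sketch.
ROUTES = {
    "classification": ("cerebras", "Llama 3.1 8B"),
    "scoring": ("cerebras", "Llama 3.1 8B"),
    "qa": ("groq", "Llama 3.1 8B"),
    "summarisation": ("gemini", "2.5 Flash Lite"),
    "translation": ("gemini", "2.5 Flash Lite"),
    "coding": ("mistral", "Mistral Small"),
    "creative": ("mistral", "Mistral Small"),
    "reasoning": ("sambanova", "Llama 3.3 70B"),
    "image": ("pollinations", "Flux"),
}

# Assumption: anything unrecognised goes to the open-source fallback row.
FALLBACK = ("huggingface", "Llama 3 8B")


def pick_provider(task_type):
    """Return the (provider, model) pair for a task type, or the fallback."""
    return ROUTES.get(task_type.strip().lower(), FALLBACK)
```

For example, `pick_provider("classification")` returns the Cerebras entry, while an unlisted type such as `"poetry"` falls through to `FALLBACK`.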
git clone https://github.com/SoylentAquamarine/the-brain.git && cd the-brain && pip install -r requirements.txt
Copy config/keys.example.json to config/keys.json and add free keys from Cerebras, Groq, Gemini, Mistral, and SambaNova.
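
Before running anything, it can help to check that every key was filled in. The sketch below assumes config/keys.json is a flat JSON object mapping lowercase provider names to key strings; that schema and the `missing_keys` helper are hypothetical, so adjust them to match config/keys.example.json.

```python
import json
from pathlib import Path

# Providers the setup step asks you to add keys for.
REQUIRED_PROVIDERS = ("cerebras", "groq", "gemini", "mistral", "sambanova")


def missing_keys(path="config/keys.json"):
    """Return the required providers that have no non-empty key in the file.

    Assumes a flat {"provider": "key"} layout, which may not match the real file.
    """
    keys = json.loads(Path(path).read_text())
    missing = []
    for provider in REQUIRED_PROVIDERS:
        value = keys.get(provider)
        if not isinstance(value, str) or not value.strip():
            missing.append(provider)
    return missing
```

An empty list from `missing_keys()` would mean all five keys are present under the assumed layout.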
python delegate.py --provider cerebras --type classification --prompt "Is this a question?"
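
To show how the three flags in the command above fit together, here is a minimal argparse sketch. It is not the project's delegate.py; the defaults, help strings, and the `task_type` attribute name are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser accepting the flags shown in the quick-start command."""
    parser = argparse.ArgumentParser(
        prog="delegate.py",
        description="Send one prompt to a chosen provider (illustrative sketch).",
    )
    parser.add_argument("--provider", required=True, help="provider name, e.g. cerebras")
    # Stored as task_type so the attribute does not shadow Python's built-in `type`.
    parser.add_argument("--type", dest="task_type", default="general",
                        help="task type, e.g. classification (default is an assumption)")
    parser.add_argument("--prompt", required=True, help="text to send to the provider")
    return parser


def parse(argv):
    """Parse a list of command-line tokens into a namespace."""
    return build_parser().parse_args(argv)
```

Calling `parse(["--provider", "cerebras", "--type", "classification", "--prompt", "Is this a question?"])` yields a namespace whose `provider`, `task_type`, and `prompt` attributes hold the three values.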
Claude reads your CLAUDE.md routing table and delegates automatically.