open-source fugu

MAESTROone endpoint · every model · cheap-first · full cost shown

request lifecycle
REQUEST
any client
CLASSIFY
task · difficulty
ROUTE
cheap → strong
VERIFY
accept / escalate
ANSWER
+ route + cost
cheap model first → VERIFY → escalate to frontier only if needed

the point

Same quality. A fraction of the cost.

frontier-only
$0.0151 / task
maestro
$0.0005 / task
97%
cheaper · 92% of best quality
26×
answers per $
+4pt
quality vs random
2
APIs: OpenAI + Anthropic
0
GPUs required

the loop

It climbs only as high as it must.

CHEAP
glm-4.7-flash · qwen3.5
tries first
MID
deepseek-v4 · kimi-k2.7
if verifier says REVISE
FRONTIER
opus 4.8 · gpt-5.5 · gemini 3.1
ACCEPT → stop

the pool

Any model. Open and closed. Yours.

One OpenRouter key, or your own local models. Swap them in a JSON file. No retraining.

GLM-4.7 cheapQWEN3.5 cheapDEEPSEEK-V4 midKIMI-K2.7 midOPUS 4.8 frontierGPT-5.5 frontierGEMINI 3.1 frontier+ LOCAL ollama / vllmGLM-4.7 cheapQWEN3.5 cheapDEEPSEEK-V4 midKIMI-K2.7 midOPUS 4.8 frontierGPT-5.5 frontierGEMINI 3.1 frontier+ LOCAL ollama / vllm

no black box

Every answer ships its receipt.

 response.maestro
{
  "route": [
    { "deepseek/deepseek-v4-pro",  "verdict": "REVISE" },
    { "anthropic/claude-opus-4.8", "verdict": "ACCEPT" }
  ],
  "classify": { "task": "code", "difficulty": 0.78 },
  "cost_usd": 0.0182,
  "cost_vs_frontier_only_usd": 0.0241,
  "savings_pct": 24
}

vs the rest

Open. Honest. Runs anywhere.

MAESTROSakana FuguOpenFugu
open source / self-hostYESnoyes
any model (open + closed)YES3 closedresearch
OpenAI + Anthropic APIYESnopartial
per-request cost shownYESnono
no GPU requiredYESn/aneeds GPU
honest benchmarkYESpartialmock
🔌

DROP-IN

Change one base URL. Works in Claude Code, opencode, Cursor, the OpenAI & Vercel AI SDKs.

🛠

YOUR TOOL LOOP

Tools pass straight through. tool_calls returned verbatim. Maestro never runs your tools.

🏠

SELF-HOST

No GPU, no model hosting. Bring one key, or go fully local with Ollama / vLLM.

The open-source Fugu you can run.

Early build. Tested live on real models. MIT.