Philip & Dico
DQ Collaboration
How a single paper became the backbone of a sovereign AI ecosystem — 4,687 routing decisions, 8/8 benchmarks passed, and a system that teaches itself.
We implemented your formula exactly as published, then extended it with cognitive awareness, expertise routing, and self-improving correctness.
DQ = 0.4 × Validity + 0.3 × Specificity + 0.3 × Correctness
| Component | Weight | What It Measures | Our Enhancement |
|---|---|---|---|
| Validity | 40% | Does model selection make logical sense? | Over-provisioning penalty (Opus for a simple query = 0.6); 2x penalty for under-provisioning |
| Specificity | 30% | How precise is the model match? | Cost-aware tie-breaking when candidate DQ scores are within 0.05 of each other |
| Correctness | 30% | Historical accuracy for similar queries | Token-similarity matching across 4,687 decisions (learning loop) |
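
The weighted sum and the cost-aware tie-break can be sketched in a few lines of Python. This is a minimal illustration, not the production router: the `Candidate` type, the `cost` field, and the example cost values are hypothetical, and the rule "pick the cheapest model among those within 0.05 of the best DQ" is one plausible reading of the tie-breaking enhancement listed in the table.

```python
from dataclasses import dataclass

# Component weights as stated in the DQ formula.
WEIGHTS = {"validity": 0.4, "specificity": 0.3, "correctness": 0.3}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one model's component scores."""
    model: str
    validity: float      # each component is assumed to lie in [0, 1]
    specificity: float
    correctness: float
    cost: float          # assumed relative cost; not taken from the source


def dq_score(c: Candidate) -> float:
    """Weighted sum of the three DQ components."""
    return (WEIGHTS["validity"] * c.validity
            + WEIGHTS["specificity"] * c.specificity
            + WEIGHTS["correctness"] * c.correctness)


def pick_model(candidates: list[Candidate], tie_margin: float = 0.05) -> Candidate:
    """Pick the cheapest candidate whose DQ is within tie_margin of the best.

    Assumption: a "tie" means a DQ gap of at most tie_margin to the top score.
    """
    best = max(dq_score(c) for c in candidates)
    near_best = [c for c in candidates if best - dq_score(c) <= tie_margin]
    # Cheapest first; among equal costs, prefer the higher DQ.
    return min(near_best, key=lambda c: (c.cost, -dq_score(c)))


candidates = [
    Candidate("opus", 1.0, 0.6, 0.8, cost=15.0),
    Candidate("sonnet", 0.9, 0.7, 0.7, cost=3.0),
    Candidate("haiku", 0.5, 0.5, 0.5, cost=1.0),
]
chosen = pick_model(candidates)
```

With these illustrative numbers, Opus has the highest DQ, but Sonnet falls inside the 0.05 margin and is cheaper, so the tie-break selects Sonnet. Setting `tie_margin=0.0` disables the tie-break and returns the top-scoring model.
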
| Signal | Weight | Examples |
|---|---|---|
| Architecture | 0.25 | design, system, distributed, scalable |
| Multi-File | 0.20 | across all files, project-wide, refactor all |
| Code | 0.15 | function, class, async, import |
| Analysis | 0.15 | analyze, review, audit, research |
| Debug | 0.10 | error, fix, bug, not working |
| Creation | 0.10 | create, build, implement, generate |
| Simple | -0.15 | what is, how to, explain (reduces complexity) |
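
One way to turn the signal table into a complexity score is sketched below. The weights and keywords come from the table; everything else is an assumption made for illustration: each signal counts at most once, matching is by lowercase substring, the weights are added together, and the result is clamped to the range 0 to 1. The names `SIGNALS` and `complexity_score` are hypothetical.

```python
# Signal weights and example keywords copied from the table.
SIGNALS = {
    "architecture": (0.25, ("design", "system", "distributed", "scalable")),
    "multi_file": (0.20, ("across all files", "project-wide", "refactor all")),
    "code": (0.15, ("function", "class", "async", "import")),
    "analysis": (0.15, ("analyze", "review", "audit", "research")),
    "debug": (0.10, ("error", "fix", "bug", "not working")),
    "creation": (0.10, ("create", "build", "implement", "generate")),
    "simple": (-0.15, ("what is", "how to", "explain")),
}


def complexity_score(query: str) -> float:
    """Sum the weights of all signals that fire, clamped to [0, 1].

    Assumptions: additive combination, one hit per signal, and naive
    substring matching (so "class" also matches "classic").
    """
    text = query.lower()
    total = sum(
        weight
        for weight, keywords in SIGNALS.values()
        if any(keyword in text for keyword in keywords)
    )
    return min(1.0, max(0.0, total))
```

For example, `complexity_score("Design a distributed system")` fires only the Architecture signal, while a query such as "What is a class?" fires both Code and Simple, which cancel out. A production matcher would likely tokenize the query rather than rely on substrings.
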
Controlled replication of your methodology using real queries from 11 projects across the Antigravity ecosystem.
The harder the query, the bigger the multi-agent advantage — exactly as the paper predicts.
Consensus: the median Validity, Specificity, and Correctness (V/S/C) scores across agents. Model selection: majority vote. Agents agreed unanimously 74% of the time.
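
The consensus step can be sketched as follows, assuming each agent reports its three component scores plus a model choice. The dictionary layout and the `consensus` function name are hypothetical. Ties in the vote are resolved in favor of the model that was proposed first, which is a property of `Counter.most_common` and not something the source specifies.

```python
from collections import Counter
from statistics import median

COMPONENTS = ("validity", "specificity", "correctness")


def consensus(votes: list[dict]) -> dict:
    """Combine per-agent votes into one decision.

    Each vote is assumed to look like:
    {"model": "opus", "validity": 0.9, "specificity": 0.8, "correctness": 0.7}
    """
    # Median of each component across agents.
    scores = {name: median(v[name] for v in votes) for name in COMPONENTS}
    # Majority vote on the model; the first-seen model wins a tie.
    counts = Counter(v["model"] for v in votes)
    model, top_count = counts.most_common(1)[0]
    return {
        "model": model,
        "scores": scores,
        "unanimous": top_count == len(votes),
    }


votes = [
    {"model": "opus", "validity": 0.9, "specificity": 0.8, "correctness": 0.7},
    {"model": "opus", "validity": 0.7, "specificity": 0.6, "correctness": 0.9},
    {"model": "sonnet", "validity": 0.8, "specificity": 0.9, "correctness": 0.5},
]
result = consensus(votes)
```

The median is less sensitive than the mean to a single outlying agent, which is presumably why it is used for the component scores.
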
4,687 routing decisions over 50 days. The correctness component creates a compounding learning loop — DQ improved 51% from baseline.
| Model | Decisions | Share | Avg DQ | Role |
|---|---|---|---|---|
| Opus | 2,248 | 48% | 0.687 | Complex reasoning, architecture, research |
| Haiku | 1,479 | 32% | 0.658 | Quick lookups, simple tasks, formatting |
| Sonnet | 949 | 20% | 0.659 | Code generation, analysis, balanced work |
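
The learning loop behind the correctness component can be illustrated with a small history lookup. This is a sketch under stated assumptions, not the deployed logic: token similarity is taken to be Jaccard overlap of whitespace tokens, past decisions are stored as `(query, model, success)` tuples, and the `min_similarity` threshold and the neutral `default` of 0.5 are hypothetical parameters.

```python
def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens of a query."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def correctness(query: str, model: str, history: list[tuple[str, str, bool]],
                min_similarity: float = 0.3, default: float = 0.5) -> float:
    """Similarity-weighted success rate of `model` on similar past queries.

    Returns `default` when no past decision is similar enough.
    """
    query_tokens = tokens(query)
    weighted_success = 0.0
    total_weight = 0.0
    for past_query, past_model, success in history:
        if past_model != model:
            continue
        similarity = jaccard(query_tokens, tokens(past_query))
        if similarity >= min_similarity:
            weighted_success += similarity * (1.0 if success else 0.0)
            total_weight += similarity
    return weighted_success / total_weight if total_weight else default


history = [
    ("fix login bug", "sonnet", True),
    ("fix signup bug", "sonnet", False),
    ("design distributed cache", "opus", True),
]
score = correctness("fix login bug", "sonnet", history)
```

Every routed query adds a row to the history, so later lookups see more evidence. That is the sense in which the loop compounds.
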
DQ scoring is Layer 2 of a 7-layer routing intelligence system. Your paper was the foundation — here's what we built on top.
| Rank | Paper | References | Topic |
|---|---|---|---|
| #1 | 2511.15755 | 599 | DQ Scoring (Drammeh) |
| #2 | 2508.17536 | 413 | Voting vs Debate |
| #3 | 2505.19591 | 301 | CIR3 / Evolving Orchestration |
| #4 | 2505.13516 | 255 | Multi-Agent Consensus |
| #5 | 2506.14496 | 200 | LLM-Powered Swarms |
Your framework + our production data = an opportunity to define the standard for AI routing quality.
Academic Theory + Production Data = Industry Standard
Your paper changed how we build AI systems. 599 references, 4,687 decisions, 8 projects. Now let's make it the industry standard.