ArgosBrain · For MCP server developers

Reference MCP memory server.
Rust core. Zero panics.

Building your own MCP server? ArgosBrain is a working reference: stdio transport, JSON-RPC, zero panics in production, 59 tools registered, 15 in the default surface, dual-gate (tier + surface) filtering, in-process router that lets skills bypass the catalog. Study what works; rip out what doesn't.

01Why this matters for MCP builders

The MCP spec is small. Production-grade MCP is hard.

The MCP spec fits in a tweet: stdio JSON-RPC, initialize + tools/list + tools/call, done. Building a server that survives 8KB pipe-buffer deadlocks on macOS, recovers from stdin closes mid-call, doesn't panic on malformed JSON, gracefully degrades when its backing brain is loading, and exposes the right surface to an LLM agent — that's where the engineering lives.

ArgosBrain has shipped 59 production releases through these problems. The code is private but the architectural lessons are public — we publish them in the methodology + architecture papers and reflect them in this site's documentation.

02The 2-card pivot — surface design pattern

15 tools default. ~59 total. Why the gap matters.

HN feedback on similar MCP servers (Semble launch thread, 2026-05-18) was unambiguous: large tool catalogs are counter-productive. Every tool adds 150-250 tokens to the system prompt. 59 tools × 200 ≈ 12k tokens of "here's what I can do" before the user has typed anything.

Agents see the catalog, hit decision paralysis on overlap, default to grep. The fix is a surface filter on tools/list — show only the tools that directly power the products you sell. ArgosBrain ships two products (🛡 Safe Edit Loop + 🔴 Red Team Audit); the 15 default-exposed tools are exactly the primitives those two compose. Hidden tools remain in the in-process router so orchestrator skills can still invoke them by name — the surface filter is on list_tools only, NOT on call_tool. That nuance unlocks skills.

Set ARGOSBRAIN_EXPOSE=full to opt back to all 59. The hidden ones are catalogued under the full pack.

03The hook architecture — Card 1

🛡 PreToolUse + PostToolUse: deterministic enforcement.

MCP instructions field gives prompt-side guidance (~90-95% LLM compliance). For products that need 100%, ArgosBrain adds Claude Code-native hooks:

  • PreToolUse(Edit|Write|MultiEdit) — inline command argosbrain hook preflight reads the hook payload, extracts target symbol via per-language regex, posts to /api/preflight, returns risk + caller count as additionalContext.
  • PostToolUse(Edit|Write|NotebookEdit) — bash hook (argos-fake-done-watcher.sh) scans the modified file for 50+ stub patterns. Hit → injects obligation context. Fail-soft on missing rg.

For non-Claude-Code agents (Cursor / Aider / Cline) — no hook system. argosbrain init instead writes .cursor/rules/argosbrain.mdc + .clinerules + CONVENTIONS.md + CLAUDE.md with the same ARGOS PROTOCOL block. Compliance drops to ~85-95% but the signal is in every chat.

04The orchestrator pattern — Card 2

🔴 One slash command. Nine sub-perspectives. Internally composed.

/argos-security is a single skill registered with Claude Code / Cursor / Codex. Its body instructs the agent to invoke nine sub-perspectives in parallel — Recon, Web/API, Cloud, AI/LLM, Supply-chain, Build & Release, Forgotten Attack Surface, Surface Drift Watch, Privilege Boundary Leaks. Each perspective calls the underlying MCP tools (find_sinks, find_dead_symbols, api_diff, etc.) and emits findings as primitives.

The Chain Composer (Perspective 10) takes the union of primitives, builds a precondition/postcondition lattice, runs single-hop + multi-hop composition, scores by (impact × confidence) / cost-to-exploit, maps to MITRE ATT&CK / Cyber Kill Chain / Unified Kill Chain phases. Output: 3-5 ranked exploit chains with historical-campaign analogies.

The pattern: orchestrator skill = thin agent prompt; deep tools = MCP primitives. Skills compose MCP tools; MCP tools never call skills. Clean unidirectional layering.

05Production resilience

Things you learn the hard way.

  • macOS 8KB pipe-buffer deadlock — every response > 8 KB risks freezing Cursor's stdio transport. Solution: bound every response payload at the tool level (triage_sinks returns summary, drill into details on demand).
  • Brain loading state — at process boot, the bincode brain.bin takes 15-20s to deserialize + HNSW rebuild. Every tool that hits the brain detects brain_loading=true and returns a structured "loading, retry in 20s" message instead of blocking the call.
  • LameDuck mode — when the brain is unavailable, tools return "fall back to Read/Grep" hints. Zero-Risk Contract: never block the user's workflow.
  • Worker subprocess — long ingest holds a write lock. The reader threads use arc-swap for wait-free snapshot reads so dashboards don't freeze during a 30-min K8s ingest.
  • Atomic settings.json — temp file + rename so a kill-9 mid-write leaves either old or new complete file, never a half-edited JSON.
06Tech stack — what we shipped

Rust 2021 edition, in-process, local-first.

The MCP server (argosbrain-mcp) is a thin stdio shim over the brain library. The brain (neurogenesis-core) is a petgraph + HNSW + bincode store. The dashboard (argosbrain-dashboard) is axum + serde_json running on 127.0.0.1:3733 with /api/preflight, /api/symbol_exists, /api/top_hubs, etc. Hooks (argosbrain hook preflight) are inline-command CLI subcommands that POST to the dashboard daemon.

Workspace structure: neurogenesis-core (brain) · neurogenesis-mcp (MCP server + CLI + hooks) · neurogenesis-viz (dashboard) · neurogenesis-ingest (tree-sitter + SCIP backends) · neurogenesis-embedder (jina-code-v2-int8 daemon for prose embedding). Each crate is independently testable; cargo workspace makes refactors structural.

The architecture paper documents the design in detail: Zero-cost graph retrieval →

07Next