Technical Report

M3SHD

Multi-Agent AI Mesh — Full System Documentation

● Author: Fred Wojo ● AI Architect: Archon (Claude) ● System: mesh.demobygrit.com ● Deployment: Staging VPS

6 Active Nodes

18 DB Tables

80+ API Endpoints

17 RBAC Permissions

3 Security Audits

Section 01

Executive Summary

Abstract

M3SHD is a multi-agent collaboration mesh that lets a single human operator command a distributed team of AI agents from anywhere — iMessage, a PWA dashboard, a native iOS app, or a desktop browser. The system spans six heterogeneous nodes across desktop, mobile, and cloud, coordinated by a lightweight hub that routes tasks without running AI inference itself. All intelligence lives at the edges.

The six nodes are Archon (M2 Mac Mini, primary AI and iMessage bridge), Rex (Intel Mac Mini, research), Crucible (iMac 27", overflow compute), N0D3-1 (iPhone 14 Pro Max, mobile AI worker), N0D3-2 (iPhone 12 Pro, second mobile worker), and the M3SHD Commander node (dispatch and monitoring command center). Desktop nodes connect over Tailscale; mobile nodes connect via the public HTTPS endpoint.

The platform grew through eight iterative phases, each building on a tested foundation before proceeding. The codebase now spans 20+ test files and covers security, intelligence, observability, mobile, and cross-hub federation.

Intelligence Layer

Agents maintain persistent encrypted memory: they remember facts across tasks, auto-extract key findings tagged [REMEMBER], and have those memories automatically injected into future task prompts. Tasks can be chained into pipelines with dependency graphs, so the output of a research task flows automatically into a summarization task when the first completes. Task templates allow one-tap dispatch of recurring workflows with variable substitution. A plugin system exposes tools (web search, file summary, notification, memory enhancement) to agents via a structured lifecycle. Self-evolving agents track their own performance and submit prompt amendment proposals for operator review.

AGI-Adjacent Capabilities

The AGI-adjacent intelligence layer adds capabilities uncommon in production agent systems at this scale: metacognition (confidence scoring with auto-verification on low-confidence outputs), smart model routing (Haiku/Sonnet/Opus selected per task complexity), natural language mesh control, agent reputation scoring via a UCB formula, intrinsic motivation for idle agents, adversarial self-improvement, a shared world model (entity graph auto-extracted from task outputs), a debate protocol (two agents argue, then a synthesis step merges the strongest positions), cryptographic provenance (HMAC-SHA256 signed task chains), memory consolidation, and collective voting on ambiguous decisions.

Platform Capabilities

The platform provides RBAC (17 permissions, 4 role presets), multi-user support (admin/user/viewer with PBKDF2 sessions), third-party API keys with rate limits and daily caps, cross-mesh federation (hub-to-hub relay with overflow routing and loop prevention), FCM push notifications (Firebase HTTP v1, silent wakeup on task dispatch), a demo mode (public read-only dashboard), an MCP server (17 tools, wired into Claude Desktop and Claude Code natively), voice command dispatch (Siri Shortcuts webhook), a 3D WebGL mesh visualization, and Android build support for both apps.

Security posture: Three independent security audits have covered the full codebase, both mobile apps, and the git history. All findings have been remediated and independently verified, and no findings remain open at the current deployment.

Section 02

Architecture Overview

M3SHD follows a hub-and-spoke topology extended with peer federation. The hub is a stateless relay and persistence layer that does not run AI inference. All intelligence lives at the edges: worker processes on individual machines invoke Claude CLI as a subprocess and report results back to the hub via REST API, while mobile workers invoke the Claude API directly. Desktop workers additionally have access to the full plugin system, memory store, and intelligence layer. This design means the hub can run on a modest VPS (256MB RAM, 0.5 CPU) while workers leverage the full compute of their host machines.

The hub exposes over 80 API endpoints over HTTPS, authenticated by RBAC-scoped agent tokens or session cookies. Real-time event delivery uses Server-Sent Events (SSE), which avoids WebSocket complexity while providing push notifications for all system events. The SSE bus implements backpressure via per-subscriber queue caps of 500 events and sends a keepalive every 30 seconds.

Workers connect to the hub over Tailscale (desktop nodes) or the public HTTPS endpoint (mobile nodes and federated peer hubs). The Claude CLI on desktop workers runs with --dangerously-skip-permissions; the bearer token and Tailscale network boundary are the security perimeter. Mobile workers are API-only — no shell, no filesystem, no tool use.

                    ┌─────────────────────────────────────────────┐
                    │              M3SHD Commander Hub             │
                    │          mesh.demobygrit.com                 │
                    │                                              │
                    │  FastAPI + SSE + SQLite WAL                  │
                    │  Task Queue + Agent Registry                  │
                    │  RBAC + Multi-User Auth                       │
                    │  Agent Memory (FTS5, encrypted)              │
                    │  Task Dependencies + Pipelines               │
                    │  Plugin System (4 built-in tools)            │
                    │  Self-Evolving Agents                        │
                    │  AGI Intelligence Layer                      │
                    │  Analytics Dashboard                         │
                    │  Webhook System + Federation                 │
                    │  MCP Server (17 tools)                       │
                    │  3D WebGL Mesh Visualization                 │
                    └──────────────────┬──────────────────────────┘
                                       │
          ┌────────────────────────────┼────────────────────────────┐
          │                            │                            │
   ┌──────┴──────┐              ┌──────┴──────┐            ┌────────┴────────┐
   │ Mesh Daemon │              │ Rex Worker  │            │ Crucible Worker │
   │ (M2 Mac)    │              │ (Intel Mac) │            │ (iMac 27")      │
   │             │              │             │            │                 │
   │ Archon AI   │              │ Rex AI      │            │ Worker AI       │
   │ iMsg Bridge │              │ Research    │            │ Overflow        │
   │ Full Tools  │              │ File Ops    │            │ Research        │
   │ 5 slots     │              │ 2 slots     │            │ 3 slots         │
   └─────────────┘              └─────────────┘            └─────────────────┘

   N0D3-1 / N0D3-2 (iPhones) ──── Claude API ──────────► Hub
   M3SHD Commander App ──────────── dispatch/monitor ───► Hub
   Peer Hubs ────────────────────── federation relay ───► Hub

The mesh daemon on the M2 Mac Mini is the critical integration point. It runs four persistent threads plus a session bridge daemon: T1 reads iMessage chat.db and relays texts to the hub; T2 polls the hub for agent messages and delivers them via osascript; T3 is the Archon AI watcher that responds using Claude; T4 is a Rex agent thread; and the session bridge daemon listens to SSE and writes broadcast files that Claude Code hooks can read between task polls.

Section 03

Hub Server

The hub is implemented as a FastAPI application in app/main.py. It serves authentication, API routing, SSE broadcasting, the web UI, and the admin control plane.

Authentication

The hub supports three parallel auth mechanisms: (1) RBAC agent tokens (m3shd_ag_ prefix) with scoped permissions per the RBAC system, (2) third-party API keys (m3shd_key_ prefix) with per-key rate limits and daily caps, and (3) browser sessions via PBKDF2-hashed user accounts. The master token bypasses RBAC (all permissions). All token comparisons use hmac.compare_digest.

Security Headers

SecurityHeadersMiddleware injects on every response: Content-Security-Policy, Strict-Transport-Security (max-age 31536000; includeSubDomains), X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin, and Permissions-Policy: camera=(), microphone=(), geolocation=(). Login rate limiting is enforced at /api/login: 5 failures per 5-minute window per IP returns 429.
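
The login limit described above can be sketched as a sliding window of failure timestamps per client IP. This is a minimal illustration, not the hub's code: the class name LoginRateLimiter and its methods are hypothetical, and only the thresholds (5 failures, 5 minutes) come from the text.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 5          # from the text: 5 failures ...
WINDOW_SECONDS = 5 * 60   # ... per 5-minute window


class LoginRateLimiter:
    """Sliding-window count of failed logins per IP (hypothetical helper)."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def _prune(self, ip, now):
        # Drop failures that have aged out of the window.
        q = self._failures[ip]
        while q and now - q[0] >= self.window:
            q.popleft()
        return q

    def is_blocked(self, ip, now=None):
        now = time.monotonic() if now is None else now
        return len(self._prune(ip, now)) >= self.max_failures

    def record_failure(self, ip, now=None):
        now = time.monotonic() if now is None else now
        self._prune(ip, now).append(now)
```

A login handler would presumably check is_blocked() first and answer 429 when it returns True, then call record_failure() on each bad credential attempt.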

Background Sweepers

Four background tasks run on a schedule: (1) the agent sweeper marks workers offline if their last heartbeat is older than the configurable timeout, (2) the zombie task sweeper resets tasks stuck in running or assigned for more than 10 minutes back to queued, (3) the agent evolution sweeper runs every 6 hours to analyze per-agent performance and generate prompt amendment proposals, and (4) the task deadline sweeper runs every 60 seconds to check for overdue tasks, increment their urgency, and fire ntfy alerts.

SSE Bus

The SSE bus broadcasts across 30+ event types covering all system activity. Keepalive events fire every 30 seconds. Per-subscriber queue caps at 500 events enforce backpressure. Plugin lifecycle hooks run on task_created, task_completed, task_failed, agent_online, and agent_offline events.
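
One way to picture the per-subscriber cap is a bounded queue for each connected client. The sketch below is illustrative only: EventBus is a hypothetical name, and dropping the newest event when a queue is full is an assumption, since the text does not say what the hub does with overflow.

```python
import asyncio

QUEUE_CAP = 500  # per-subscriber cap stated in the text


class EventBus:
    """Fan-out bus with a bounded queue per subscriber (illustrative only)."""

    def __init__(self, cap=QUEUE_CAP):
        self.cap = cap
        self.subscribers = set()

    def subscribe(self):
        q = asyncio.Queue(maxsize=self.cap)
        self.subscribers.add(q)
        return q

    def publish(self, event):
        """Deliver to every subscriber; return how many had a full queue."""
        dropped = 0
        for q in self.subscribers:
            try:
                q.put_nowait(event)
            except asyncio.QueueFull:
                dropped += 1  # assumption: slow consumers miss the newest event
        return dropped
```

Because publish() never awaits, one stalled client cannot delay delivery to the others; the cost is that a slow client may miss events once its queue fills.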

Section 04

Worker System

The generic worker (mesh-worker.py) is the mechanism by which any machine joins the mesh with a single command. It uses only urllib.request for HTTP and threading for concurrency — no FastAPI dependency.

The worker lifecycle: register with capabilities, enter a poll loop with heartbeats, check active task count and system load average before fetching tasks, spawn a daemon thread per task, invoke Claude CLI via subprocess, stream partial output to the hub via POST /api/tasks/{id}/stream, post final output and cost log, and report done.

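import subprocess

# Run Claude CLI non-interactively; the prompt is passed on stdin.
result = subprocess.run(
    ["claude", "--print", "--dangerously-skip-permissions"],
    input=prompt,
    capture_output=True,  # collect stdout and stderr for the task record
    text=True,
    timeout=self.claude_timeout,  # raises subprocess.TimeoutExpired on overrun
)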

Memory integration. After task completion, the worker scans output for [REMEMBER] blocks and POSTs them to /api/agents/{id}/memory. On task pickup, relevant memories are already injected into the prompt by the hub.
Plugin tool calls. Workers scan output for [TOOL_CALL] ... [/TOOL_CALL] markers and invoke the corresponding plugin tool, incorporating the result into the next Claude invocation.
Capability routing. The required_capability column on tasks ensures a task dispatched for code_review only goes to agents that advertise that capability.
Escalation. After three consecutive failures, the worker triggers an escalation. Rex escalates to Archon, Archon escalates to the user. Daily token budgets cap spending at 500,000 tokens per agent.

Section 05

Mesh Daemon

The mesh daemon (mesh-daemon.py) runs on the M2 Mac Mini and integrates four subsystems: the iMessage relay, the hub-to-iMessage relay, the Archon AI watcher, and the Rex agent. All four threads share a shutdown_event and are monitored by a health supervisor.

Session Bridge

A standalone session-bridge.py daemon runs alongside the mesh daemon. It maintains a persistent SSE connection to the hub, and whenever a broadcast-worthy event arrives (new human message, agent output, system alert), it writes a timestamped file to ~/m3shdup/broadcasts/. Active Claude Code sessions check this directory between task polls via the broadcast-check.sh UserPromptSubmit hook. The mesh-post.sh script allows posting from any terminal session to hub chat.

Resilience

The resilient_loop wrapper restarts a crashed thread after 5 seconds of backoff, up to 10 times. After all threads exhaust their budget, the main health supervisor exits and the outer start-mesh.sh shell loop relaunches the entire daemon in 5 seconds. Individual crash recovery: 5 seconds. Full process recovery: 10 seconds. The daemon auto-starts via .zprofile on terminal open. A lockfile prevents duplicate instances.
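
A restart wrapper of this kind might look like the following. Only the name resilient_loop, the 5-second backoff, and the 10-restart budget come from the text; the signature and return value are assumptions made for the sketch.

```python
import threading


def resilient_loop(target, shutdown_event, max_restarts=10, backoff=5.0):
    """Run target() and restart it after a crash until the budget is spent.

    Returns the number of crashes observed. The signature is hypothetical.
    """
    crashes = 0
    while not shutdown_event.is_set():
        try:
            target()
            return crashes          # clean exit: nothing left to supervise
        except Exception:
            crashes += 1
            if crashes > max_restarts:
                return crashes      # budget exhausted; the supervisor takes over
            # Event.wait doubles as a sleep that a shutdown can interrupt.
            shutdown_event.wait(backoff)
    return crashes
```

Each subsystem thread would be started with this wrapper as its entry point, so a crash in one thread does not take down the others.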

Section 06

iMessage Bridge

The iMessage bridge reads Apple's chat.db in read-only mode and sends replies through AppleScript, invoked with the osascript command-line tool. It is the most unconventional component in the system.

Thread 1 (iMessage to Hub) polls chat.db every 3 seconds, filtering for incoming messages from the user's number. Bridge-echo loops are prevented by checking for the relay prefix regex. Image attachments undergo path restriction, symlink resolution, 20MB cap, and magic byte verification (JPEG, PNG, GIF, WebP, HEIC/HEIF). Failed POSTs are queued in a deque and retried on the next cycle.
Thread 2 (Hub to iMessage) polls the hub every 5 seconds for new agent messages and delivers them via osascript. The message content is always passed via argv — never interpolated into AppleScript — preventing injection. Messages over 3,000 characters are truncated at a word boundary.
Schema version detection. The bridge includes an iMessage schema abstraction layer that detects the chat.db schema version at startup and normalizes reads accordingly, handling schema differences across macOS versions without requiring code changes.
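
The magic byte check on attachments can be illustrated with a small sniffing function. This is a sketch under assumptions: the function name is hypothetical, the HEIF brand list is not exhaustive, and the path restriction and 20MB cap from the text are not shown.

```python
# Common HEIC/HEIF brand codes; this list is an assumption and not exhaustive.
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}


def sniff_image_type(header: bytes):
    """Return a format name based on leading bytes, or None if unrecognized."""
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # ISO base media files carry an 'ftyp' box at offset 4, then a 4-byte brand.
    if header[4:8] == b"ftyp" and header[8:12] in HEIF_BRANDS:
        return "heif"
    return None
```

Reading the first 12 bytes of the resolved file is enough for all five formats; anything that returns None would be rejected regardless of its file extension.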

Section 07

Command Interface

The user communicates with the mesh by texting commands from an iPhone. The command parser recognizes 14 core command patterns:

Command	Mode	Description
status	status	Show agent online/offline states and active task counts
tasks	tasks	List the 10 most recent tasks with status icons
@context	context_get	Show all shared context key-value pairs
@context key=val	context_set	Set a shared context entry
@rex <text>	rex_message	Dispatch a task directly to Rex
approve	approval	Approve the most recent pending approval
reject	approval	Reject the most recent pending approval
pending	pending_approvals	List all pending approval requests
costs / @costs	costs	Show today's token usage and cost by agent
escalations	escalations	List open escalations
do <text>	task	Execute a task via Archon with full context
auto <goal>	autonomous	Decompose goal into subtasks, dispatch to Rex
summary	summary	Summarize recent task results
(anything else)	chat	Free-form conversation with Archon

Autonomous mode decomposes a high-level goal into 2–5 concrete subtasks via Claude, dispatches each to Rex in parallel, and delivers a summary with a tasks link for tracking. Voice commands via Siri Shortcuts provide an additional dispatch path — POST /api/voice/dispatch parses natural language and routes to the matching task template.

Section 08

Task Dispatch Engine

The dispatcher (app/dispatch.py) implements capacity-aware routing with circuit-breaker protection, extended with reputation-based selection and deadline urgency.

Core Algorithm

If assign_to is specified, verify that the named agent has spare capacity and a closed circuit breaker. If the check fails or no assignee is given, fall through to automatic selection.
Query for all online or busy agents that have spare capacity and advertise the required capability.
Filter out agents with open circuit breakers.
Apply reputation scoring (UCB formula across per-capability success histories).
Select highest-reputation agent with available capacity.
If no worker is available, create task with status=queued.

Circuit Breaker

Trips after 3 consecutive failures. Cooldown: 60 seconds. Half-open state on recovery: one task allowed through; success closes the circuit, failure trips it again immediately.
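
The three states described above (closed, open, half-open) fit in a small class. This is a sketch rather than the dispatcher's implementation: the class and method names are hypothetical, and only the threshold of 3, the 60-second cooldown, and the single-probe rule come from the text.

```python
import time


class CircuitBreaker:
    """Per-agent breaker: closed, open after N failures, half-open after cooldown."""

    def __init__(self, threshold=3, cooldown=60.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None        # None means the circuit is closed
        self.probe_in_flight = False

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        if now - self.opened_at < self.cooldown:
            return False
        if self.probe_in_flight:
            return False             # half-open: only one trial task at a time
        self.probe_in_flight = True
        return True

    def record_success(self):
        self.failures = 0
        self.opened_at = None
        self.probe_in_flight = False

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        if self.probe_in_flight:
            # Failed probe: reopen immediately and restart the cooldown.
            self.opened_at = now
            self.probe_in_flight = False
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now
```

The dispatcher would call allow() when filtering candidates and record_success() or record_failure() when the task finishes.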

Reputation Scoring

Each agent maintains per-capability success/failure history. The UCB (Upper Confidence Bound) formula balances exploitation (agents with high success rates) with exploration (agents that haven't been tried recently for a given capability). New agents begin with a neutral prior.
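
The text does not say which UCB variant is used. As an illustration, the sketch below uses the classic UCB1 form (mean success rate plus an exploration bonus) with a neutral prior of one success in two attempts. The function names, the exploration constant, and the prior values are all assumptions.

```python
import math


def ucb_score(successes, attempts, total_attempts, c=math.sqrt(2),
              prior_successes=1.0, prior_attempts=2.0):
    """UCB1-style score with a neutral 0.5 prior (prior values are assumptions)."""
    n = attempts + prior_attempts
    mean = (successes + prior_successes) / n          # exploitation term
    total = total_attempts + prior_attempts
    bonus = c * math.sqrt(math.log(total) / n)        # exploration term
    return mean + bonus


def pick_agent(stats):
    """stats maps agent name -> (successes, attempts) for one capability."""
    total = sum(attempts for _, attempts in stats.values())
    return max(stats, key=lambda a: ucb_score(*stats[a], total))
```

With equal attempt counts the agent with the better success rate wins; an agent with no history gets a large exploration bonus, so it is tried early and then judged on its record.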

Deadlines

Tasks created with a deadline timestamp are monitored by the 60-second sweeper. Overdue tasks have their urgency level incremented (low → medium → high → critical) and trigger ntfy escalation alerts. Priority is bumped automatically on urgency escalation.

Section 09

Safety and Control Layer

M3SHD provides five interlocking safety mechanisms: the approval queue, the escalation chain, cost tracking, RBAC, and metacognition.

Approvals. Any agent can create an approval request via POST /api/approvals. The user responds with approve or reject. Expired approvals are automatically cleaned up.
Escalations. Three consecutive task failures trigger an escalation. Rex escalates to Archon, Archon escalates to the user. Each escalation record includes agent, target, task ID, and reason.
Cost Tracking. Every Claude invocation logs estimated token usage. Daily limit: 500,000 tokens per agent. The budget ntfy alert fires at $1/day per agent.
ntfy Alerting. Four alert types: agent-down (once per outage), task-failed (on third attempt), escalation (on escalation creation), and budget ($1/day threshold). Fire-and-forget via run_in_executor — never blocks the request path.
Metacognition. Every task result carries a confidence score (0.0–1.0). Tasks with confidence below 0.7 are automatically submitted for verification by a second agent before the result is accepted. Operators can see confidence scores in the analytics dashboard.

Section 10

Web UI

The web UI is a dark-themed, mobile-first PWA built without any JavaScript framework. CSS custom properties handle theming; JetBrains Mono is the primary typeface; the brand gradient is amber → purple → emerald.

Dashboard. Agent cards showing name, machine, online/offline status, and task utilization. Status summary bar shows total online agents, active tasks, and pending approvals.
Chat. Full-screen chat interface with color-coded message bubbles: amber (user), purple (Archon), emerald (Rex). Real-time SSE updates. 16px input font prevents iOS Safari zoom-on-focus.
Tasks. Kanban-style task list. Template chips allow one-tap dispatch of seed templates. Each task card shows title, assignee, status badge, confidence score, and deadline indicator if set.
Logs. Live log stream fed by SSE. Timestamped and color-coded by event type.
3D Mesh Visualization. WebGL force graph. Nodes colored by status (green=online, amber=busy, red=offline) and sized by capacity. Active tasks render as glowing particles traveling between nodes. The graph auto-rotates and is interactive.
Analytics. Admin-only tab. Charts for task throughput by status, agent, and day; cost trends; uptime and memory counts per agent.
Demo Mode. When enabled, GET /demo serves a public read-only dashboard. Rate limited at 30 req/min per IP.

Section 11

N0D3 Mobile Worker

N0D3 is a Flutter iOS app that turns any iPhone or iPad into a live M3SHD worker node. The second instance, N0D3-2, runs on an iPhone 12 Pro.

Architecture

The app implements the full worker contract: register, heartbeat, poll, execute, report. It calls the Claude API directly (api.anthropic.com) using the user's API key stored in iOS Keychain via flutter_secure_storage. State management via Flutter Riverpod (Notifier / AsyncNotifier). GoRouter navigation with three routes: splash, setup, and main.

Real Claude Streaming

N0D3 uses anthropic_sdk_dart for real SSE streaming via client.messages.createStream(). The onChunk callback forwards partial output to the hub in real-time as the model generates, delivering genuine token-by-token updates to the dashboard and Commander app.

Offline Task Queue. If the hub is unreachable when a task completes, the result is saved locally. On reconnect, a sync sweep POSTs all pending results.
Capabilities. research, summarize, chat, triage. Max 1 concurrent task. Text-in, text-out only.
Background Mode. FCM silent push (Firebase Cloud Messaging) wakes the app the moment a matching task is dispatched, reducing task start latency from minutes to seconds.
Lifecycle-Aware Heartbeats. 10s (WiFi, foreground), 30s (backgrounded), 60s (cellular).
Task Handoff. On 30 seconds of failed reconnection during active task execution: save partial output locally, reconnect, call POST /api/tasks/{id}/handoff. Hub suspends original task and creates continuation for the next available agent.
UI Design. Glassmorphism: frosted glass cards via BackdropFilter, gradient borders, live status indicator. All colors via MeshColors and MeshGradient in theme.dart.

Config	Value
Bundle ID	com.gritwerk.meshNode
Minimum iOS	15.0
Display Name	N0D3
Network	NSAllowsLocalNetworking: true
Android APK	49MB

Section 12

M3SHD Commander App

The M3SHD Commander is a five-tab native iOS command center (25 Dart files). It registers with maxConcurrent: 0 — the dispatcher never assigns it tasks to execute.

Tab	Path	Screen
0	/	Dashboard — agent grid, online status, utilization
1	/chat	Chat — full mesh chat, keyboard padding fixed
2	/tasks	Tasks — create tasks, template chips, view queue; FAB above tab bar
3	/logs	Logs — filtered SSE log stream
4	/settings	Settings — hub URL, token, commander name

State Providers

settingsProvider — NotifierProvider<SettingsNotifier, AppSettings>
hubTokenProvider — FutureProvider<String> (iOS Keychain)
hubConnectionProvider — NotifierProvider (heartbeat + SSE lifecycle)
agentsProvider — AsyncNotifierProvider (fetches immediately on connect)
messagesProvider and tasksProvider — real-time via SSE

SSE integration reconnects with exponential backoff. The hub connection provider tears down the SSE stream on device-offline and reconnects on return. All colors from MeshColors.*. Touch targets: 44px minimum throughout.

Config	Value
Bundle ID	com.gritwerk.m3shdup
Minimum iOS	15.0
Background modes	fetch, remote-notification
Network	NSAllowsLocalNetworking: true

Section 13

Task Handoff System

The task handoff system ensures no work is lost when an agent disconnects mid-task.

Endpoint. POST /api/tasks/{id}/handoff accepts optional partial_output. The endpoint: loads the original task; updates status to suspended, storing partial output in the output field (capped at 32KB, with ownership check); creates a continuation task at bumped priority with a prompt prepended by CONTINUE FROM PREVIOUS AGENT'S PARTIAL WORK:; dispatches via the standard dispatcher; and broadcasts a task_handoff SSE event.

N0D3 triggers handoff automatically on 30 seconds of failed reconnection. Desktop workers can call it explicitly when approaching token budget limits. The required_capability of the original task is preserved in the handoff task so the continuation lands on a capable agent.

Section 14

Agent Memory System

The agent memory system gives agents persistent, searchable, encrypted memory across tasks.

Storage

agent_memory table with UNIQUE(agent_id, key). Values are encrypted at rest using the hub's secret key. An FTS5 virtual table (agent_memory_fts) stores plaintext copies for full-text search.

API

POST /api/agents/{id}/memory — store or update a memory entry
GET /api/agents/{id}/memory — list all memories for an agent
GET /api/agents/{id}/memory/search?q=<query> — FTS5 full-text search
DELETE /api/agents/{id}/memory/{key} — delete a specific entry

Auto-Extract & Auto-Inject

Workers scan task output for [REMEMBER] key: value [/REMEMBER] blocks and POST them automatically. When the hub creates a task prompt, get_memory_context(agent_id, task_text) performs an FTS5 search against the task text and prepends top-K matching memories. Agents thus "remember" relevant prior facts without explicit operator configuration.
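
Extraction of the [REMEMBER] blocks can be sketched with a regular expression. The function name and the decision to skip malformed blocks are assumptions; the marker syntax is taken from the text.

```python
import re

REMEMBER_RE = re.compile(r"\[REMEMBER\](.*?)\[/REMEMBER\]", re.DOTALL)


def extract_memories(output: str):
    """Return {key: value} for each well-formed [REMEMBER] key: value block."""
    memories = {}
    for body in REMEMBER_RE.findall(output):
        # Split on the first colon only, so values may contain colons (URLs).
        key, sep, value = body.partition(":")
        if sep and key.strip() and value.strip():
            memories[key.strip()] = value.strip()
    return memories
```

Each resulting pair would then be POSTed to /api/agents/{id}/memory by the worker.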

Memory Consolidation

A "sleep" function sweeps all agent memories, merges duplicates, resolves contradictions (later entry wins unless confidence scores differ), and discovers cross-agent patterns. Results are written back as consolidated memory entries tagged with source: consolidation.

FTS5 query strings are sanitized to alphanumeric words before reaching the FTS engine, preventing malformed syntax from crashing the shared database connection.
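
The query sanitization step might look like this. It is a sketch: the function name, the term limit, and joining terms with OR are assumptions. Quoting each word keeps FTS5 from reading words such as NEAR or NOT as operators.

```python
import re


def sanitize_fts_query(text: str, max_terms: int = 16) -> str:
    """Reduce free text to quoted alphanumeric terms joined with OR."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return " OR ".join(f'"{w}"' for w in words[:max_terms])
```

An empty result means the input held no searchable words, in which case the caller should skip the MATCH query entirely.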

Section 15

Task Dependencies and Pipelines

Tasks can declare dependencies on other tasks, forming execution DAGs.

Schema. task_deps junction table (parent_task_id, child_task_id). Circular dependency prevention uses a recursive CTE that walks the ancestor chain before insertion. Tasks created with depends_on: [id1, id2] start in queued state regardless of dispatcher availability.
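
The ancestor walk can be sketched against SQLite directly. The table and column names come from the schema above; the function add_dependency and the exact query are assumptions about how such a check could be written.

```python
import sqlite3

# Collect every ancestor of :parent, then test whether :child is among them.
CYCLE_SQL = """
WITH RECURSIVE ancestors(id) AS (
    SELECT parent_task_id FROM task_deps WHERE child_task_id = :parent
    UNION
    SELECT d.parent_task_id
    FROM task_deps d JOIN ancestors a ON d.child_task_id = a.id
)
SELECT 1 FROM ancestors WHERE id = :child LIMIT 1
"""


def add_dependency(conn, parent, child):
    """Insert parent -> child unless child is already an ancestor of parent."""
    if parent == child:
        raise ValueError("task cannot depend on itself")
    if conn.execute(CYCLE_SQL, {"parent": parent, "child": child}).fetchone():
        raise ValueError("dependency would create a cycle")
    conn.execute(
        "INSERT INTO task_deps (parent_task_id, child_task_id) VALUES (?, ?)",
        (parent, child),
    )
```

UNION rather than UNION ALL removes duplicate rows, which also guarantees the recursion terminates even if the table already contains a cycle.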

Auto-dispatch. When a task reaches done, check_and_dispatch_dependents() runs. It finds all child tasks whose parents are all in done state, injects parent output into the child prompt, and dispatches. This chains arbitrarily deep without operator involvement.

Pipelines. POST /api/pipelines accepts a list of task definitions and wires them sequentially: task N's completion dispatches task N+1 with N's output injected. Use cases: research → summarize → notify; code_write → code_review → deploy.

Section 16

Task Templates

Task templates allow one-tap dispatch of recurring workflows with variable substitution.

Schema. task_templates table with fields: id, name, description, prompt_template (with {variable} placeholders), capability, priority, created_by.

Seed templates: Research ({topic}), Summarize ({target}), QA ({target}), Code Review ({target}), Write ({topic}).

POST /api/templates — create a template
GET /api/templates — list all templates
DELETE /api/templates/{id} — remove a template
POST /api/templates/{id}/dispatch — dispatch with variable substitution

The Commander app shows a horizontal template chip row on the Tasks screen. Tapping a chip opens a bottom sheet for variable input, then dispatches. The voice dispatch endpoint also matches natural language to the most appropriate template.
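
Variable substitution for the {variable} placeholders can be sketched with Python's standard string formatting. The function name render_template is hypothetical, and failing on missing variables is an assumption about the desired behavior.

```python
import string


def render_template(prompt_template: str, variables: dict) -> str:
    """Fill {variable} placeholders, failing loudly if any are missing."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion).
    needed = {
        name
        for _, name, _, _ in string.Formatter().parse(prompt_template)
        if name
    }
    missing = needed - set(variables)
    if missing:
        raise KeyError(f"missing template variables: {sorted(missing)}")
    return prompt_template.format(**variables)
```

Checking the field names up front gives a clear error for the operator. Literal braces in a template would need to be doubled under this scheme.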

Section 17

Plugin System

The plugin system allows extending agent capabilities with structured tools callable during task execution.

Architecture. app/plugins.py implements PluginManager with three registries: tool functions, lifecycle hooks, and capability declarations. Plugin files in the plugins/ directory expose a setup(manager) function that registers with the manager on hub startup.

Built-in Plugins

web_search — searches the web and returns structured results
file_summary — summarizes a file at a given path (hub-side, path-sanitized)
notify — fires an ntfy push notification from within a task
memory_enhance — performs an FTS5 search against agent memories and returns matches

Tool invocation. Workers scan task output for [TOOL_CALL] {"tool": "web_search", "query": "..."} [/TOOL_CALL] markers, POST to /api/plugins/{tool}/invoke, and incorporate the result into the next Claude invocation. Plugin responses strip filesystem paths from the output.
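
Parsing the [TOOL_CALL] markers might be done as follows. The marker syntax and the "tool" field come from the text; the function name and the choice to ignore malformed JSON are assumptions.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"\[TOOL_CALL\](.*?)\[/TOOL_CALL\]", re.DOTALL)


def parse_tool_calls(output: str):
    """Return (tool_name, arguments) pairs for each well-formed marker."""
    calls = []
    for raw in TOOL_CALL_RE.findall(output):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore malformed JSON rather than guessing at intent
        if isinstance(payload, dict) and isinstance(payload.get("tool"), str):
            args = {k: v for k, v in payload.items() if k != "tool"}
            calls.append((payload["tool"], args))
    return calls
```

A worker would then POST each argument dict to /api/plugins/{tool}/invoke and feed the response into the next Claude invocation.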

Section 18

Self-Evolving Agents

Agents track their own performance and can iteratively improve their system prompts.

Performance Tracking

app/evolution.py accumulates per-agent, per-capability success/failure statistics across a configurable rolling window (default: 7 days). Each task completion updates the agent's performance record.

Prompt Optimizer

Every 6 hours, the evolution sweeper analyzes each agent's performance patterns. For agents with meaningful data, it calls Claude Haiku with the agent's current guidelines and performance summary, asking for typed amendments: add (new guidance for failure modes), reinforce (strengthen guidance that correlates with success), restrict (narrow scope of problematic patterns), or remove (delete guidance that correlates with failure).

Operator Approval

Proposed amendments appear in the hub dashboard with confidence scores. The operator reviews and applies via POST /api/agents/{id}/evolve with the amendment ID. Applied amendments are prepended to the agent's system prompt in an EVOLUTION GUIDELINES: block.

Section 19

AGI-Adjacent Intelligence Layer

The intelligence layer adds a stack of capabilities uncommon in production agent systems at this scale. These features are active on the staging deployment with the Anthropic API key in place.

Metacognition

Every task result includes a structured confidence score (0.0–1.0) computed by Claude during response generation. Tasks below 0.7 confidence are automatically submitted for verification: a second agent re-evaluates independently, and the higher-confidence result is accepted. Operators see confidence in task detail views and analytics.

Smart Model Routing

Heuristic analysis of task text selects Claude Haiku (simple/short), Sonnet (standard), or Opus (complex/critical) at dispatch time. Factors include: task length, keyword signals, capability type, and current agent load. Operators can override per-task.

Natural Language Mesh Control

POST /api/mesh/control accepts free-form natural language instructions ("move rex to code-review only", "take crucible offline for maintenance", "set archon max concurrent to 3"). Claude Haiku interprets the intent and emits a structured operation that the hub executes.

Agent Reputation Scoring

UCB (Upper Confidence Bound) formula applied to per-capability success/attempt histories. Dispatch selects the highest-UCB agent for each capability, balancing exploitation of known-good agents with exploration of underutilized ones. New agents receive a neutral prior.

Intrinsic Motivation

Idle agents proactively select from a pool of autonomous task types: audit memory for stale entries, verify recent task outputs, refresh cached research, check system health. Cooldown periods per task type prevent repeated invocations, keeping idle capacity productive.

Adversarial Self-Improvement

After a task completes, a challenger agent is given the original prompt and the primary agent's output. The challenger is instructed to find weaknesses, errors, or gaps. If the challenger produces meaningful critique, the primary agent performs a revision loop incorporating the feedback. The final output includes both the revision and the challenger's critique score.

World Model (Entity Graph)

Shared entities and entity_relations tables. Task outputs are processed by an extraction agent that identifies named entities (people, projects, organizations, tools, concepts) and relations between them. Agents can query the world model before executing tasks to ground their responses.

Debate Protocol

For high-stakes tasks (flagged requires_debate: true), two agents receive the same prompt and generate independent responses. A synthesis agent receives both responses and produces a merged output incorporating the strongest points from each position. Operators can view the debate thread in the task detail view.

Cryptographic Provenance

Every task result is signed with HMAC-SHA256: sign(prev_hash || timestamp || agent_id || task_id || output_hash). The chain starts from a genesis hash (64 zeros). Any tampered output breaks the chain, detectable by verifying the hash sequence. GET /api/tasks/{id}/provenance returns the full chain.
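
A signed chain of this shape can be sketched with the standard hmac module. The field order follows the formula above. The "|" separator, the string encoding of fields, and the function names are assumptions, since the text gives only the concatenation order.

```python
import hashlib
import hmac

GENESIS = "0" * 64  # genesis hash: 64 zeros, per the text


def sign_link(key: bytes, prev_hash: str, timestamp: str, agent_id: str,
              task_id: str, output: str) -> str:
    output_hash = hashlib.sha256(output.encode()).hexdigest()
    # The "|" separator is an assumption; the text gives only the field order.
    message = "|".join([prev_hash, timestamp, agent_id, task_id, output_hash])
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


def verify_chain(key: bytes, links) -> bool:
    """links: dicts with timestamp, agent_id, task_id, output, hash; oldest first."""
    prev = GENESIS
    for link in links:
        expected = sign_link(key, prev, link["timestamp"], link["agent_id"],
                             link["task_id"], link["output"])
        if not hmac.compare_digest(expected, link["hash"]):
            return False
        prev = link["hash"]
    return True
```

Because each signature covers the previous hash, changing any stored output invalidates that link and every link after it.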

Collective Voting

When a task result is flagged requires_vote: true, the hub polls all available agents for a structured binary vote (accept/reject) with rationale. After a configurable quorum (default: 3 votes), majority rules. Tied votes escalate to the operator. The jury roster and individual votes are stored for audit.
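
The tally rule (quorum, majority, ties to the operator) is simple enough to state in a few lines. The function name and the return labels are hypothetical.

```python
from collections import Counter


def tally(votes, quorum=3):
    """votes: list of 'accept' or 'reject' strings.

    Returns 'pending' below quorum, 'escalate' on a tie, else the majority.
    """
    if len(votes) < quorum:
        return "pending"
    counts = Counter(votes)
    if counts["accept"] == counts["reject"]:
        return "escalate"  # tie goes to the operator
    return "accept" if counts["accept"] > counts["reject"] else "reject"
```

With the default quorum of 3 a tie cannot occur on exactly three votes; it arises only when additional agents vote and the counts come out even.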

Benchmark Suite

POST /api/admin/benchmark runs six performance tests sequentially: task dispatch throughput, concurrent task handling, FTS5 memory search latency, SSE event fan-out, federation relay latency, and end-to-end task completion time. Results are saved to the data directory for trend analysis.

Section 20

RBAC and Multi-User Support

RBAC — 17 Permissions

Permission Group	Permissions
Tasks	tasks:read, tasks:write, tasks:stream
Agents	agents:read, agents:write, agents:admin
Messages	messages:read, messages:write
Memory	memory:read, memory:write
Admin	admin:analytics, admin:webhooks, admin:users, admin:tokens, admin:plugins, admin:federation
Voice	voice:dispatch

4 Role Presets

Role	Permissions
worker	tasks:read/write/stream, agents:read, messages:read/write, memory:read/write, voice:dispatch (10)
mobile	Same as worker minus voice:dispatch (9)
commander	All worker permissions + agents:write, admin:analytics (12)
admin	All 17 permissions

Security note: Empty permissions on an agent token means no access — not full access. This was remediated during the security audit cycle. The master token bypasses RBAC entirely and is reserved for administrative operations.

Multi-User Support

users table with PBKDF2-HMAC-SHA256 password hashing (310,000 iterations). Session tokens have a 24-hour TTL. Three user roles: admin, user, viewer. Dual-path login: username + password or master token. A seed admin user is created on first run.

Third-Party API Keys

api_keys table with m3shd_key_ prefix. Per-key rate limits (requests/minute) and daily caps (requests/day). Usage tracking per key. Separate auth path from agent tokens. Admin CRUD via /api/admin/keys.

Section 21

Webhooks and Federation

Webhooks

webhooks table with encrypted secrets. Endpoints: create, list, delete, generic trigger (secret validation via X-Webhook-Secret header only — query params rejected), and GitHub-specific trigger (HMAC-SHA256 signature validation on X-Hub-Signature-256). Rate limiting: 10 requests/minute per webhook. Uptime Kuma auto-detected from X-Uptime-Kuma-Agent header.
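
GitHub sends X-Hub-Signature-256 as "sha256=" followed by the hex HMAC-SHA256 of the raw request body. A minimal validation sketch follows; the function name is hypothetical, and the secret is assumed to be already decrypted.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Validate an X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not header.startswith(prefix):
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    received = header[len(prefix):]
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), received.encode())
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization.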

Cross-Mesh Federation

peer_hubs table. app/federation.py implements hub-to-hub task relay. When local capacity is exhausted for a required capability, the dispatcher queries peer hubs for availability and relays the task. The relay sweeper polls peers for completed results and applies them to the local task record. Relay hop count tracked in task metadata; max 3 hops prevents relay loops.

POST /api/admin/peers — register a peer hub (URL, auth token)
GET /api/admin/peers — list peers
DELETE /api/admin/peers/{id} — remove peer
POST /api/federation/tasks — incoming relay endpoint

Section 22

Analytics and Observability

Four analytics endpoints, all scoped by a days query parameter (default: 7):

GET /api/admin/analytics/tasks — task counts by status, by agent, by day, by capability
GET /api/admin/analytics/costs — cost by agent, by day, average cost per task per agent
GET /api/admin/analytics/agents — uptime percentage, total tasks, memory count per agent
GET /api/admin/analytics/summary — combined quick-view for dashboard header

ntfy Alerting

Alert	Trigger	De-duplication
Agent down	Heartbeat timeout	Once per outage; suppressed until agent recovers
Task failed	3rd consecutive attempt	Per task ID
Escalation	Escalation creation	Per escalation ID
Budget	$1/day threshold	Once per UTC day per agent

All alerts are dispatched through the event loop's run_in_executor as fire-and-forget calls, so they never block the request path. The broadcast-check.sh UserPromptSubmit hook wires the M3SHD message bus into every active Claude Code session on any mesh node.

Section 23

CI/CD Pipeline

deploy.sh

./deploy.sh [--dry-run]
# 1. pytest gate — fail on any test failure
# 2. rsync source to staging VPS (excluding data/, .env)
# 3. docker compose up -d --build m3shdup
# 4. health check (/api/health) — fail and alert if unhealthy

Pre-commit hook. .githooks/pre-commit runs pytest before every commit. Failed tests block the commit. setup-hooks.sh installs the hook with one command.
GitHub Actions. .github/workflows/test.yml runs the full test suite on every push and pull request. Matrix: Python 3.12. Steps: checkout, install dependencies, run pytest with coverage report.

Section 24

Database Schema

The hub uses SQLite in WAL mode at data/m3shdup.db. The schema spans 18 tables.

Table	Purpose	Key Columns
messages	Chat history	id, ts, sender, sender_type, content, channel, reply_to
agents	Worker registry	id, name, machine, status, capabilities, max_concurrent, active_tasks
tasks	Task queue	id, title, prompt, assigned_to, status, priority, output, attempts, required_capability, deadline, urgency, confidence, provenance_hash
task_deps	Dependency graph	parent_task_id, child_task_id
task_templates	Reusable workflows	id, name, description, prompt_template, capability, priority
approvals	Permission requests	id, agent, action, description, status, decided_by, timeout
context	Shared KV store	key (PK), value, set_by
cost_log	Token usage	agent, task_id, input_tokens, output_tokens, model, cost_usd
escalations	Failure chain	from_agent, to_agent, task_id, reason, status
agent_memory	Persistent agent memory (encrypted)	agent_id, key, value, created_at, updated_at
agent_memory_fts	FTS5 full-text index	Virtual table over plaintext memory
agent_tokens	RBAC-scoped auth tokens	hash, agent_id, permissions (JSON), created_at
api_keys	Third-party access	hash, name, rate_limit, daily_cap, usage_today
users	Multi-user accounts	id, username, password_hash, role, created_at
webhooks	Inbound webhook definitions	id, name, url, secret (encrypted), event_filter
peer_hubs	Federated mesh peers	id, url, token (encrypted), status, relay_count
entities	World model nodes	id, name, type, description, source_task_id
entity_relations	World model edges	from_entity_id, to_entity_id, relation_type, weight

Status enums are CHECK-constrained. Foreign keys enabled via PRAGMA foreign_keys=ON. WAL mode configured at startup. All schema changes applied via migration guards (PRAGMA table_info check before ALTER TABLE).
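
A migration guard of the kind described can be sketched with the standard sqlite3 module. The helper name is hypothetical, and the example uses the synchronous driver for brevity even though the hub uses aiosqlite.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str,
                          column: str, ddl_type: str) -> bool:
    """Add a column only if PRAGMA table_info does not already list it."""
    # Identifiers cannot be bound as SQL parameters, so table and column
    # names must come from trusted migration code, never from user input.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False  # already migrated; nothing to do
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True
```

Running the guard at every startup makes the migration idempotent: the first run alters the table and later runs are no-ops.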

Section 25

API Reference

All endpoints except /api/health and /api/login require Authorization: Bearer <token> or m3sh_session cookie. Admin endpoints additionally require RBAC admin:* permissions.

Core Endpoints

Method	Path	Description
POST	/api/login	Authenticate, set session cookie (rate-limited: 5/5min per IP)
GET	/api/stream	SSE event stream (30s keepalive)
POST	/api/messages	Send a message (50KB max)
POST	/api/tasks	Create and dispatch task
PUT	/api/tasks/{id}	Update task status/output
POST	/api/tasks/{id}/stream	Append to task stream log
POST	/api/tasks/{id}/handoff	Suspend + create continuation
POST	/api/agents/{id}/register	Self-register worker
POST	/api/agents/{id}/heartbeat	Keep-alive ping
GET/PUT/DELETE	/api/context	Shared KV store
POST/GET/PUT	/api/approvals	Approval queue
POST/GET/PUT	/api/escalations	Escalation chain
GET	/api/health	Health check (unauthenticated)

Extended Endpoints

Method	Path	Description
POST/GET/DELETE	/api/agents/{id}/memory	Agent memory CRUD
GET	/api/agents/{id}/memory/search	FTS5 memory search
POST	/api/agents/{id}/evolve	Apply evolution amendment
POST	/api/pipelines	Create chained task pipeline
POST/GET/DELETE	/api/templates	Task template CRUD
POST	/api/templates/{id}/dispatch	Dispatch template with vars
POST	/api/plugins/{tool}/invoke	Invoke plugin tool
POST	/api/voice/dispatch	Voice/NL task dispatch
POST	/api/mesh/control	Natural language mesh control
GET	/api/tasks/{id}/provenance	Task provenance chain
GET/POST/PUT/DELETE	/api/admin/users	User management (admin)
GET/POST/DELETE	/api/admin/webhooks	Webhook CRUD (admin)
GET/POST/DELETE	/api/admin/peers	Federation peer management (admin)
POST	/api/federation/tasks	Incoming federated task
GET/POST/DELETE	/api/admin/keys	Third-party API key management (admin)
GET	/api/admin/analytics/tasks	Task analytics (admin)
GET	/api/admin/analytics/costs	Cost analytics (admin)
GET	/api/admin/analytics/agents	Agent analytics (admin)
POST	/api/admin/benchmark	Run benchmark suite (admin)
GET	/demo	Public demo dashboard

SSE events cover 30+ types: all core types plus task_handoff, memory_stored, evolution_proposed, plugin_invoked, federation_relay, vote_called, vote_result, debate_started, debate_result, world_model_updated, provenance_verified, benchmark_complete, and others.

Section 26

Security Posture

M3SHD has undergone multiple independent security audits across the full codebase, both mobile apps, and version history. All findings have been remediated and verified. The system currently holds a clean bill of health.

Security Controls

All secrets via environment variables — no secrets in source code
RBAC on all API endpoints: 17 permissions, 4 role presets
Empty permission array on agent token = no access
Bearer token comparison via hmac.compare_digest (constant-time)
PBKDF2-HMAC-SHA256 for user passwords (310,000 iterations)
Session tokens 24-hour TTL; master-token-only for new admin token creation
SecurityHeadersMiddleware: CSP, HSTS (1 year + includeSubDomains), X-Frame-Options DENY, nosniff, Referrer-Policy, Permissions-Policy
Login rate limiting: 5 failures per 5-minute window per IP → 429
Input length limits: messages 50KB, prompts 8KB, osascript 3KB, handoff partial_output 32KB, stream chunks 64KB
Task handoff IDOR: ownership check enforced before allowing handoff
Task stream IDOR: ownership check enforced before accepting output injection
FTS5 sanitization: query strings sanitized to alphanumeric words before FTS engine
Plugin path disclosure: filesystem paths stripped from plugin error responses
Webhook secrets: X-Webhook-Secret header only; query param returns 400
GitHub HMAC: X-Hub-Signature-256 HMAC-SHA256 validation
osascript injection prevention: user content via argv, never interpolated
Image attachment validation: path restricted, symlinks resolved, magic bytes verified, 20MB cap
Agent ID sanitization: [a-z0-9-] regex enforced at registration
Container runs as non-root user (appuser)
SSE queue capped at 500 events per subscriber; query limits capped at 500
SSH key-only authentication on all mesh nodes
NSAllowsLocalNetworking: true (not NSAllowsArbitraryLoads) on iOS apps
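
Three of the input-handling controls above can be sketched in a few lines. The function names are hypothetical, and quoting each FTS5 term is an assumption added so that words such as OR or NOT are treated as literals rather than operators.

```python
import hmac
import re

AGENT_ID_RE = re.compile(r"[a-z0-9-]+")


def valid_agent_id(agent_id: str) -> bool:
    """Accept only lowercase letters, digits, and hyphens."""
    return AGENT_ID_RE.fullmatch(agent_id) is not None


def sanitize_fts_query(raw: str) -> str:
    """Reduce a query to alphanumeric words before it reaches FTS5."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    return " ".join(f'"{word}"' for word in words)


def token_matches(presented: str, expected: str) -> bool:
    """Constant-time bearer token comparison."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```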

Audit outcome: Multiple independent red team audits conducted across the full codebase, both mobile apps, and git history. All findings from all audit cycles have been remediated and independently verified. Production tokens are patched with explicit RBAC permissions.

Section 27

Container and Deployment

The hub runs as a single Docker container on the staging VPS at mesh.demobygrit.com.

FROM python:3.12.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN adduser --disabled-password --no-create-home appuser
COPY . .
RUN chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Container Configuration

Port mapping: 8333 (host) to 8000 (container)
Volume: ./data:/app/data (SQLite + benchmark reports + plugin output)
Memory limit: 256MB, CPU limit: 0.5 cores, reservation: 64MB
Restart policy: unless-stopped
Log rotation: 10MB max, 3 files retained
Healthcheck: Python urllib request to /api/health every 30 seconds

Backup & Push

Litestream is configured for R2 replication — requires R2 bucket creation and credentials deployed to staging. Firebase service account and FCM env vars are in place; one deploy step from activating silent push on staging. APNs key: AuthKey_W3F6PU2NCQ.p8. Firebase project: m3shd-65a5a.

Key Dependencies

fastapi>=0.115.0 — ASGI framework
uvicorn[standard]>=0.34.0 — ASGI server with uvloop
sse-starlette>=2.2.1 — Server-Sent Events
aiosqlite>=0.21.0 — Async SQLite
cryptography>=42.0 — Memory encryption, HMAC provenance
anthropic>=0.40.0 — Claude API (evolution optimizer, AGI features)
httpx>=0.28.0 — Async HTTP client (federation, webhooks, FCM, ntfy)

Section 28

Test Coverage

The test suite spans 20+ test files covering all subsystems.

File	Coverage Area
test_db.py	Schema init, CRUD for all tables, capacity queries, stale agent sweep
test_api.py	Core API endpoints, auth validation, input limits
test_dispatch.py	Circuit breaker, capacity routing, reputation scoring, deadline urgency
test_worker.py	Worker registration, Claude subprocess mock, task lifecycle
test_security.py	RBAC enforcement, IDOR checks, injection prevention
test_audit.py	Audit trail, provenance chain, HMAC verification
test_memory.py	Agent memory CRUD, FTS5 search, auto-extract, auto-inject, consolidation
test_deps.py	Dependency creation, circular prevention, auto-dispatch, output injection
test_templates.py	Template CRUD, variable substitution, voice dispatch
test_webhooks.py	Secret validation, HMAC, rate limiting, Uptime Kuma
test_rbac.py	Permission enforcement per endpoint, role presets, empty-perms fix
test_analytics.py	All four analytics endpoints, cost rollups, uptime calc
test_plugins.py	Plugin registration, tool invocation, lifecycle hooks, path sanitization
test_evolution.py	Performance tracking, amendment generation, operator approval
test_agi.py	Metacognition, model routing, NL control, reputation, motivation, adversarial, world model, debate, provenance, voting, benchmark
test_federation.py	Peer registration, task relay, hop limits, sweeper
test_fcm.py	FCM token registration, silent push, OAuth2 token caching
test_users.py	PBKDF2 hashing, session TTL, role-scoped access

Tests use pytest-asyncio for async support, in-memory SQLite for isolation, FastAPI TestClient for HTTP, and subprocess mocks for Claude CLI. Test fixture isolation (_db = None resets) prevents cross-contamination across files.

Section 29

Node Inventory

Node	Machine	Role	Max Concurrent	Status
Archon	Mac Mini M2 (Apple M2 8-core)	Primary AI, full tools, iMessage bridge	5	Online
Rex	Mac Mini Intel (i5)	Research, code, file ops	2	Online
Crucible	iMac 27" (Intel i5)	Overflow compute, research	3	Online
N0D3-1	iPhone 14 Pro Max	Mobile AI worker	1	Online
N0D3-2	iPhone 12 Pro	Second mobile AI worker	1	Online
Fred Commander	M3SHD Commander App	Dispatch and monitor	0	Online

RBAC Token Assignments

archon: admin role (17 permissions)
rex, crucible: worker role (10 permissions)
n0d3-1, n0d3-2: mobile role (9 permissions)
fred-commander: commander role (12 permissions)
Third-party integrations: API key auth with per-key caps

All desktop nodes connect via Tailscale. Mobile nodes connect via the public HTTPS endpoint. SSH key authentication on all desktop nodes. Crucible runs macOS Monterey 12.7.6 and uses Terminal.app, since Ghostty is incompatible with macOS releases below Ventura. Node.js is installed from a tarball (Homebrew is broken on Monterey), and SSL is fixed via certifi plus the SSL_CERT_FILE environment variable.

Section 30

Technology Stack

Layer	Technology	Version	Purpose
Hub Framework	FastAPI	0.115+	ASGI framework, API routing
ASGI Server	Uvicorn	0.34+	Production server with uvloop
Database	SQLite (WAL)	3.x	Persistence (18 tables)
Full-Text Search	SQLite FTS5	3.x	Agent memory search
Async DB Driver	aiosqlite	0.21+	Non-blocking SQLite
SSE	sse-starlette	2.2+	Server-Sent Events (30+ event types)
Cryptography	cryptography	42.0+	Memory encryption, HMAC provenance
AI Runtime (desktop)	Claude CLI	Latest	Subprocess LLM inference
AI Runtime (mobile)	anthropic_sdk_dart	1.4.2	Real SSE streaming on mobile
AI Runtime (hub)	Anthropic Python SDK	0.40+	Evolution optimizer, AGI features
Reverse Proxy	Caddy	2.x	TLS termination, HTTP/2
Container Runtime	Docker	Latest	Hub deployment on VPS
Mesh VPN	Tailscale	Latest	Encrypted node-to-node tunnels
Backup	Litestream	Latest	SQLite WAL replication to R2
Push Notifications	Firebase FCM	HTTP v1	Silent push wakeup for mobile
MCP Server	anthropic_mcp	Latest	17 tools for Claude Desktop/Code
Mobile Framework	Flutter	3.41	iOS + Android apps
State Management	flutter_riverpod	2.6.1	Notifier/AsyncNotifier
Mobile Routing	go_router	14.8.1	Type-safe navigation
Secure Storage	flutter_secure_storage	9.2.4	iOS Keychain, Android Keystore
Bridge (read)	sqlite3 (stdlib)	3.x	chat.db read-only
Bridge (write)	osascript	macOS	iMessage send via AppleScript
Language (backend)	Python	3.12	All backend components

Section 31

Architecture Diagram

┌────────────────────────────────────────────────────────────────────┐
│                             INTERNET                                │
│                                                                     │
│   Fred's iPhone / Browser                                           │
│   ├── iMessage ──────────────────► chat.db (M2 Mac)                │
│   ├── M3SHD Commander app ───────────────────────────────┐         │
│   ├── N0D3 app ──────────────────────────────────────────┤         │
│   ├── Siri Shortcut ─── voice/dispatch ──────────────────┤         │
│   └── MCP (Claude Desktop/Code) 17 tools ────────────────┤         │
│                │ HTTPS                                    │         │
│                ▼                                          │         │
│   ┌──────────────────────────────────────────┐           │         │
│   │        mesh.demobygrit.com               │           │         │
│   │        Caddy (TLS + H2)                  │ ◄─────────┘         │
│   │              ▼                           │                     │
│   │   ┌────────────────────────────────┐     │                     │
│   │   │         M3SHD Hub              │     │                     │
│   │   │  FastAPI + SSE + SQLite WAL    │     │                     │
│   │   │  18 tables, FTS5, WAL mode     │     │                     │
│   │   │  RBAC (17 perms, 4 presets)    │     │                     │
│   │   │  Agent Memory (FTS5, enc.)     │     │                     │
│   │   │  Task Deps + Pipelines         │     │                     │
│   │   │  Plugin System (4 built-in)    │     │                     │
│   │   │  Self-Evolving Agents          │     │                     │
│   │   │  AGI Intelligence Layer        │     │                     │
│   │   │  Webhooks + Federation         │     │                     │
│   │   │  MCP Server (17 tools)         │     │                     │
│   │   │  3D WebGL Visualization        │     │                     │
│   │   └────────────────────────────────┘     │                     │
│   │         Staging VPS (Hetzner)            │                     │
│   └──────────────────────────────────────────┘                     │
│                   │                                                 │
└───────────────────┼─────────────────────────────────────────────────┘
                    │
          ── Tailscale WireGuard VPN ──
                    │
        ┌───────────┼──────────┐
        │           │          │
┌───────┴──────┐ ┌──┴───┐ ┌───┴──────┐
│ M2 Mac Mini  │ │ Rex  │ │ Crucible │
│ (Archon)     │ │Intel │ │ iMac 27" │
│              │ │      │ │          │
│ mesh-daemon  │ │ mesh-│ │ mesh-    │
│ T1: iMsg→Hub │ │worker│ │ worker   │
│ T2: Hub→iMsg │ │      │ │          │
│ T3: Archon   │ │claude│ │ claude   │
│ T4: Rex      │ │print │ │ print    │
│ session-brdg │ │2 con.│ │ 3 con.   │
└──────────────┘ └──────┘ └──────────┘

N0D3-1 (iPhone 14 Pro Max)    N0D3-2 (iPhone 12 Pro)
Flutter + anthropic_sdk_dart   Flutter + anthropic_sdk_dart
Real SSE streaming             Real SSE streaming
Offline queue + FCM wakeup     Offline queue + FCM wakeup
1 concurrent, handoff on disc  1 concurrent, handoff on disc
          │ HTTPS                        │ HTTPS
          └─────────────────────────────┘
                        ▼
                   M3SHD Hub

Data Flow: Siri Task Dispatch

1. "Hey Siri, research StoreKit 2 pricing models"
2. Siri Shortcut sends POST /api/voice/dispatch with NL text
3. Hub parses intent → matches "Research {topic}" template
4. Template rendered with topic="StoreKit 2 pricing models"
5. Dispatcher selects Rex (highest UCB reputation for research)
6. Rex picks up task via poll, invokes Claude CLI
7. Claude returns research (FTS5 memory injected into prompt)
8. Rex scans output for [REMEMBER] blocks → POSTs to agent memory
9. Rex POSTs cost log + task done
10. ntfy fires push: "Research task complete on rex"
11. M3SHD Commander app receives SSE task_updated, shows done badge

Data Flow: Pipeline Execution

1. POST /api/pipelines with 3 task definitions
2. Hub creates task A (research), B (summarize, depends_on=[A]),
   C (notify, depends_on=[B])
3. Only task A is dispatched (B and C status=queued, deps unmet)
4. Rex completes task A → status=done
5. check_and_dispatch_dependents() fires
6. Task B's parent (A) is done → dispatch B with A's output injected
7. N0D3-1 picks up B, summarizes via Claude API
8. Task B completes → dispatch C
9. Archon picks up C → notify plugin fires ntfy push to Fred
10. Fred receives summary push notification
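
The dependency step (5 and 6 above) can be sketched over in-memory structures. Only the name check_and_dispatch_dependents comes from the source; the signature, the dict-based task records, the "dispatched" status label, and the way parent output is appended to the prompt are assumptions.

```python
def check_and_dispatch_dependents(tasks: dict, deps: list[tuple[str, str]],
                                  done_id: str) -> list[str]:
    """Dispatch children of done_id whose parents are all done.

    deps holds (parent_task_id, child_task_id) pairs, as in task_deps.
    """
    dispatched = []
    children = [child for parent, child in deps if parent == done_id]
    for child in children:
        parents = [p for p, c in deps if c == child]
        ready = all(tasks[p]["status"] == "done" for p in parents)
        if tasks[child]["status"] == "queued" and ready:
            # Inject every parent's output into the child's prompt.
            parent_output = "\n\n".join(tasks[p]["output"] for p in parents)
            tasks[child]["prompt"] += "\n\nPrevious output:\n" + parent_output
            tasks[child]["status"] = "dispatched"
            dispatched.append(child)
    return dispatched
```

Calling it with the ID of the task that just completed returns the children that became runnable; tasks further down the chain stay queued until their own parents finish.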

Section 32

Version History

M3SHD was developed across eight iterative phases, each building on a stable, tested foundation before proceeding.

v0.1 — Foundation

Seven-table schema, capacity-aware dispatch with circuit breaker, self-registering workers, generic mesh-worker.py, hardened iMessage bridge with retry queues. Command router with 14 commands. Context-aware rich prompts (chat vs task mode). Rex deployed as mesh worker on Intel Mac Mini. Approval queue, cost tracking with daily limits, escalation chain. Four-tab PWA. Autonomous mode (decompose + dispatch + monitor).

v0.5 — Mobile + Security Hardening

N0D3 mobile worker app (17 Dart files) — full worker contract, Claude Haiku direct, glassmorphism UI, background mode, task handoff on disconnect. M3SHD Commander app (25 Dart files) — five-tab command center. Full red team pentest: all critical findings resolved. SecurityHeadersMiddleware, login rate limiting, git history scrubbed, NSAllowsLocalNetworking enforced. Task handoff system. Text routing broadcast system. Hub stability fixes.

v0.8 — Intelligence Platform

Agent Memory (FTS5, encrypted, auto-extract, auto-inject). Task Dependencies + Pipelines. Task Templates (5 seeds, Commander chips). ntfy Alerting. Live Session Integration (SSE bridge, Claude Code hooks). Real Claude Streaming via anthropic_sdk_dart. Plugin System (4 built-in tools). Self-Evolving Agents (Haiku optimizer, operator approval). CI/CD Pipeline (deploy.sh, pre-commit, GitHub Actions). RBAC (17 permissions, 4 presets). Analytics Dashboard. Webhooks (GitHub HMAC, Uptime Kuma). Multi-User Support. Offline Task Queue.

v1.0 — AGI-Adjacent Layer

MCP Server (17 tools). Voice Command (Siri Shortcuts). 3D WebGL Mesh Visualization. Third-Party API Keys. Demo Mode. Cross-Mesh Federation (hub-to-hub relay, overflow, 3 hops). FCM Push Notifications (Firebase). Android Build Support. Second mobile node (iPhone 12 Pro). AGI-Adjacent Layer: Metacognition, Smart Model Routing, NL Mesh Control, Deadlines + Auto-Escalation, Reputation Scoring (UCB), Intrinsic Motivation, Adversarial Self-Improvement, World Model, Debate Protocol, Cryptographic Provenance, Memory Consolidation, Collective Voting, Benchmark Suite. Final security audit: all findings remediated. Production tokens patched with RBAC roles.

Section 33

Future Roadmap

FCM Staging Activation. Firebase service account and FCM env vars need to be deployed to the staging server. Silent push notifications are built and tested locally — one deploy step from activation.
App Store Submission. Both N0D3 and M3SHD Commander are signed and buildable. Required before submission: privacy manifests, App Privacy details in App Store Connect, and review of NSAllowsLocalNetworking justification.
Android Background Execution. WorkManager for background task execution on Android. The FCM integration is in place; WorkManager is the remaining piece for full background operation.
Litestream R2 Backup. Configuration is in place. Requires R2 bucket creation and credentials deployed to staging for point-in-time recovery.
Audit Log Table. An audit_log table recording every mutating operation with timestamp, action, actor, and payload hash. High value for forensic analysis now that all tokens are distinct per-agent.
Caddy-Level Rate Limiting. The current login rate limiter is in-process and resets on hub restart. Caddy-level rate limiting would persist across hub restarts and throttle requests at the proxy, before they reach the hub.
Session Cookie Randomness. The session cookie is currently a deterministic SHA-256 of the auth token. Per-session randomness (HMAC of token + random nonce stored server-side) would prevent offline brute-force if logs are compromised.
Additional Nodes. Any machine with Python 3.12+ and Claude CLI can join by running mesh-worker.py. Any iPhone or iPad can join by installing N0D3. The federation system allows multiple independent M3SHD hubs to relay tasks across organizations.
Natural Language Monitoring. Extend the NL control endpoint to support monitoring queries: "how many tasks has rex completed this week", "which agent has the highest failure rate", "show me all tasks that took longer than 10 minutes".
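
The Session Cookie Randomness item could be approached along these lines. This is a sketch of the proposed change, not existing behavior: the function names are hypothetical, and the nonce is assumed to be stored server-side alongside the session.

```python
import hashlib
import hmac
import secrets


def new_session_cookie(auth_token: bytes) -> tuple[str, str]:
    """Return (nonce, cookie); the nonce is kept server-side."""
    nonce = secrets.token_hex(16)  # fresh randomness for every session
    cookie = hmac.new(auth_token, nonce.encode(), hashlib.sha256).hexdigest()
    return nonce, cookie


def verify_session_cookie(auth_token: bytes, nonce: str, cookie: str) -> bool:
    """Recompute the HMAC from the stored nonce and compare in constant time."""
    expected = hmac.new(auth_token, nonce.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), cookie.encode())
```

Two logins with the same token now yield different cookies, so a cookie value on its own no longer identifies the token it was derived from.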

Section 34

Changelog

Version	Changes
v0.1	Initial release. Seven-table schema. Four-tab PWA. iMessage bridge. Capacity-aware dispatch with circuit breaker. Autonomous mode. Approval queue. Cost tracking. Escalation chain. Two independent security audits completed.
v0.5	N0D3 mobile worker app (17 Dart files). M3SHD Commander app (25 Dart files). Glassmorphism UI. Comprehensive security hardening — all critical findings resolved, git history scrubbed. SecurityHeadersMiddleware, login rate limiting, NSAllowsLocalNetworking enforced. Task handoff system. Text routing broadcast system. Hub stability fixes.
v0.8	Agent Memory (FTS5, encrypted, auto-inject, auto-extract). Task Dependencies + Pipelines. Task Templates (5 seeds). ntfy Alerting. Live Session Integration (SSE bridge, hooks). Real Claude Streaming (anthropic_sdk_dart). Plugin System (4 built-in tools). Self-Evolving Agents. CI/CD Pipeline. Analytics Dashboard. Webhooks (GitHub HMAC, Uptime Kuma). RBAC (17 permissions, 4 presets). Multi-User Support. Offline Task Queue (N0D3). Test fixture isolation across all test files.
v1.0	MCP Server (17 tools). Voice Command (Siri Shortcuts). 3D WebGL Mesh Visualization. Third-Party API Keys. Demo Mode. Cross-Mesh Federation (hub-to-hub relay, overflow, 3 hops). FCM Push Notifications (Firebase). Android Build Support. Second mobile node (iPhone 12 Pro). AGI-Adjacent Layer: Metacognition, Smart Model Routing, NL Mesh Control, Deadlines + Auto-Escalation, Reputation Scoring (UCB), Intrinsic Motivation, Adversarial Self-Improvement, World Model, Debate Protocol, Cryptographic Provenance, Memory Consolidation, Collective Voting, Benchmark Suite. Final security audit — all findings remediated. Production tokens patched with RBAC roles.