Technical Report

M3SHD

Multi-Agent AI Mesh — Full System Documentation

Author: Fred Wojo
AI Architect: Archon (Claude)
System: mesh.demobygrit.com
Deployment: Staging VPS
6 Active Nodes
18 DB Tables
80+ API Endpoints
17 RBAC Permissions
3 Security Audits
Section 01

Executive Summary

Abstract

M3SHD is a multi-agent collaboration mesh that lets a single human operator command a distributed team of AI agents from anywhere — iMessage, a PWA dashboard, a native iOS app, or a desktop browser. The system spans six heterogeneous nodes across desktop, mobile, and cloud, coordinated by a lightweight hub that routes tasks without running AI inference itself. All intelligence lives at the edges.

The six nodes are Archon (M2 Mac Mini, primary AI and iMessage bridge), Rex (Intel Mac Mini, research), Crucible (iMac 27", overflow compute), N0D3-1 (iPhone 14 Pro Max, mobile AI worker), N0D3-2 (iPhone 12 Pro, second mobile worker), and the M3SHD Commander node (dispatch and monitoring command center). Desktop nodes connect over Tailscale; mobile nodes connect via the public HTTPS endpoint.

The platform grew through eight iterative phases, each building on a tested foundation before proceeding. The codebase now spans 20+ test files and covers security, intelligence, observability, mobile, and cross-hub federation.

Intelligence Layer

Agents maintain persistent encrypted memory: they remember facts across tasks, auto-extract key findings tagged [REMEMBER], and have those memories automatically injected into future task prompts. Tasks can be chained into pipelines with dependency graphs, so the output of a research task flows automatically into a summarization task when the first completes. Task templates allow one-tap dispatch of recurring workflows with variable substitution. A plugin system exposes tools (web search, file summary, notification, memory enhancement) to agents via a structured lifecycle. Self-evolving agents track their own performance and submit prompt amendment proposals for operator review.

AGI-Adjacent Capabilities

The AGI-adjacent intelligence layer adds capabilities uncommon in production agent systems at this scale: metacognition (confidence scoring with auto-verification on low-confidence outputs), smart model routing (Haiku/Sonnet/Opus selected per task complexity), natural language mesh control, agent reputation scoring via UCB formula, intrinsic motivation for idle agents, adversarial self-improvement, a shared world model (entity graph auto-extracted from task outputs), the debate protocol (two agents argue, synthesis merges the strongest positions), cryptographic provenance (HMAC-SHA256 signed task chains), memory consolidation, and collective voting on ambiguous decisions.

Platform Capabilities

The platform provides RBAC (17 permissions, 4 role presets), multi-user support (admin/user/viewer with PBKDF2 sessions), third-party API keys with rate limits and daily caps, cross-mesh federation (hub-to-hub relay with overflow routing and loop prevention), FCM push notifications (Firebase HTTP v1, silent wakeup on task dispatch), a demo mode (public read-only dashboard), an MCP server (17 tools, wired into Claude Desktop and Claude Code natively), voice command dispatch (Siri Shortcuts webhook), a 3D WebGL mesh visualization, and Android build support for both apps.

Security posture: Three independent security audits have covered the full codebase, both mobile apps, and the git history. All findings have been remediated and independently verified, leaving no open findings at the current deployment.
Section 02

Architecture Overview

M3SHD follows a hub-and-spoke topology extended with peer federation. The hub is a stateless relay and persistence layer that does not run AI inference. All intelligence lives at the edges: worker processes on individual machines invoke Claude CLI as a subprocess and report results back to the hub via REST API, while mobile workers invoke the Claude API directly. Desktop workers additionally have access to the full plugin system, memory store, and intelligence layer. This design means the hub can run on a modest VPS (256MB RAM, 0.5 CPU) while workers leverage the full compute of their host machines.

The hub exposes over 80 API endpoints over HTTPS, authenticated by RBAC-scoped agent tokens or session cookies. Real-time event delivery uses Server-Sent Events (SSE), which avoids WebSocket complexity while providing push notifications for all system events. The SSE bus implements backpressure via per-subscriber queue caps of 500 events and sends a keepalive every 30 seconds.

Workers connect to the hub over Tailscale (desktop nodes) or the public HTTPS endpoint (mobile nodes and federated peer hubs). The Claude CLI on desktop workers runs with --dangerously-skip-permissions; the bearer token and Tailscale network boundary are the security perimeter. Mobile workers are API-only — no shell, no filesystem, no tool use.

                    ┌─────────────────────────────────────────────┐
                    │              M3SHD Commander Hub             │
                    │          mesh.demobygrit.com                 │
                    │                                              │
                    │  FastAPI + SSE + SQLite WAL                  │
                    │  Task Queue + Agent Registry                  │
                    │  RBAC + Multi-User Auth                       │
                    │  Agent Memory (FTS5, encrypted)              │
                    │  Task Dependencies + Pipelines               │
                    │  Plugin System (4 built-in tools)            │
                    │  Self-Evolving Agents                        │
                    │  AGI Intelligence Layer                      │
                    │  Analytics Dashboard                         │
                    │  Webhook System + Federation                 │
                    │  MCP Server (17 tools)                       │
                    │  3D WebGL Mesh Visualization                 │
                    └──────────────────┬──────────────────────────┘
                                       │
          ┌────────────────────────────┼────────────────────────────┐
          │                            │                            │
   ┌──────┴──────┐              ┌──────┴──────┐            ┌────────┴────────┐
   │ Mesh Daemon │              │ Rex Worker  │            │ Crucible Worker │
   │ (M2 Mac)    │              │ (Intel Mac) │            │ (iMac 27")      │
   │             │              │             │            │                 │
   │ Archon AI   │              │ Rex AI      │            │ Worker AI       │
   │ iMsg Bridge │              │ Research    │            │ Overflow        │
   │ Full Tools  │              │ File Ops    │            │ Research        │
   │ 5 slots     │              │ 2 slots     │            │ 3 slots         │
   └─────────────┘              └─────────────┘            └─────────────────┘

   N0D3-1 / N0D3-2 (iPhones) ──── Claude API ──────────► Hub
   M3SHD Commander App ──────────── dispatch/monitor ───► Hub
   Peer Hubs ────────────────────── federation relay ───► Hub

The mesh daemon on the M2 Mac Mini is the critical integration point. It runs four persistent threads plus a session bridge daemon: T1 reads iMessage chat.db and relays texts to the hub; T2 polls the hub for agent messages and delivers them via osascript; T3 is the Archon AI watcher that responds using Claude; T4 is a Rex agent thread; and the session bridge daemon listens to SSE and writes broadcast files that Claude Code hooks can read between task polls.

Section 03

Hub Server

The hub is implemented as a FastAPI application in app/main.py. It serves authentication, API routing, SSE broadcasting, the web UI, and the admin control plane.

Authentication

The hub supports three parallel auth mechanisms: (1) RBAC agent tokens (m3shd_ag_ prefix) with scoped permissions per the RBAC system, (2) third-party API keys (m3shd_key_ prefix) with per-key rate limits and daily caps, and (3) browser sessions via PBKDF2-hashed user accounts. The master token bypasses RBAC (all permissions). All token comparisons use hmac.compare_digest.
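A minimal sketch of how a presented credential might be routed to an auth path and compared in constant time. The two prefixes and the use of hmac.compare_digest come from the text above; classify_credential, token_matches, and the "session_or_master" label are hypothetical names introduced for illustration.

```python
import hmac

AGENT_PREFIX = "m3shd_ag_"     # RBAC agent tokens (from the text)
API_KEY_PREFIX = "m3shd_key_"  # third-party API keys (from the text)


def classify_credential(value: str) -> str:
    """Return the auth path a credential would take (hypothetical helper)."""
    if value.startswith(AGENT_PREFIX):
        return "agent_token"
    if value.startswith(API_KEY_PREFIX):
        return "api_key"
    return "session_or_master"


def token_matches(presented: str, expected: str) -> bool:
    """Constant-time comparison, so timing does not reveal how much matched."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Encoding both strings to bytes before the comparison avoids the TypeError that hmac.compare_digest raises for non-ASCII str input.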

Security Headers

SecurityHeadersMiddleware injects on every response: Content-Security-Policy, Strict-Transport-Security (max-age 31536000; includeSubDomains), X-Frame-Options: DENY, X-Content-Type-Options: nosniff, Referrer-Policy: strict-origin-when-cross-origin, and Permissions-Policy: camera=(), microphone=(), geolocation=(). Login rate limiting is enforced at /api/login: 5 failures per 5-minute window per IP returns 429.
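The login limit above (5 failures per 5-minute window per IP) can be sketched as a sliding-window counter. This is an illustration under assumptions, not the hub's code: LoginRateLimiter and its methods are hypothetical, and the source does not say whether the window is sliding or fixed.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 5      # from the text
WINDOW_SECONDS = 300  # 5 minutes, from the text


class LoginRateLimiter:
    """Sliding-window count of failed logins per client IP (illustrative)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = defaultdict(deque)  # ip -> timestamps of failures

    def _prune(self, ip):
        # Discard failures that have aged out of the window.
        cutoff = self._clock() - WINDOW_SECONDS
        q = self._failures[ip]
        while q and q[0] <= cutoff:
            q.popleft()
        return q

    def record_failure(self, ip):
        self._prune(ip).append(self._clock())

    def is_blocked(self, ip):
        """True when the caller should answer with HTTP 429."""
        return len(self._prune(ip)) >= MAX_FAILURES
```

The injectable clock exists only so the behavior can be checked without waiting five minutes.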

Background Sweepers

Four background tasks run on a schedule: (1) the agent sweeper marks workers offline if their last heartbeat is older than the configurable timeout, (2) the zombie task sweeper resets tasks stuck in running or assigned for more than 10 minutes back to queued, (3) the agent evolution sweeper runs every 6 hours to analyze per-agent performance and generate prompt amendment proposals, and (4) the task deadline sweeper runs every 60 seconds to check for overdue tasks, increment their urgency, and fire ntfy alerts.

SSE Bus

The SSE bus broadcasts across 30+ event types covering all system activity. Keepalive events fire every 30 seconds. Per-subscriber queue caps at 500 events enforce backpressure. Plugin lifecycle hooks run on task_created, task_completed, task_failed, agent_online, and agent_offline events.
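A minimal sketch of the per-subscriber cap. The cap of 500 comes from the text; the overflow policy does not, so this sketch assumes a full queue drops its oldest event. EventBus, subscribe, and publish are hypothetical names.

```python
import asyncio

QUEUE_CAP = 500  # per-subscriber cap, from the text


class EventBus:
    """Fan-out bus with one bounded queue per subscriber (illustrative)."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue(maxsize=QUEUE_CAP)
        self._subscribers.add(q)
        return q

    def publish(self, event: dict) -> None:
        for q in self._subscribers:
            if q.full():
                # Assumption: a slow subscriber loses its oldest event
                # rather than blocking the publisher.
                q.get_nowait()
            q.put_nowait(event)
```

With this policy a stalled client can never hold more than 500 events in memory, and publishing never waits on a slow reader.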

Section 04

Worker System

The generic worker (mesh-worker.py) is the mechanism by which any machine joins the mesh with a single command. It uses only urllib.request for HTTP and threading for concurrency — no FastAPI dependency.

The worker lifecycle: register with capabilities, enter a poll loop with heartbeats, check active task count and system load average before fetching tasks, spawn a daemon thread per task, invoke Claude CLI via subprocess, stream partial output to the hub via POST /api/tasks/{id}/stream, post final output and cost log, and report done.

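import subprocess

# Run one task through the Claude CLI in non-interactive mode.
# The prompt is passed on stdin; stdout and stderr are captured as text.
result = subprocess.run(
    ["claude", "--print", "--dangerously-skip-permissions"],
    input=prompt,
    capture_output=True,
    text=True,
    timeout=self.claude_timeout,  # raises subprocess.TimeoutExpired when exceeded
)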
  • Memory integration. After task completion, the worker scans output for [REMEMBER] blocks and POSTs them to /api/agents/{id}/memory. On task pickup, relevant memories are already injected into the prompt by the hub.
  • Plugin tool calls. Workers scan output for [TOOL_CALL] ... [/TOOL_CALL] markers and invoke the corresponding plugin tool, incorporating the result into the next Claude invocation.
  • Capability routing. The required_capability column on tasks ensures a task dispatched for code_review only goes to agents that advertise that capability.
  • Escalation. After three consecutive failures, the worker triggers an escalation. Rex escalates to Archon, Archon escalates to the user. Daily token budgets cap spending at 500,000 tokens per agent.
Section 05

Mesh Daemon

The mesh daemon (mesh-daemon.py) runs on the M2 Mac Mini and integrates four subsystems: the iMessage relay, the hub-to-iMessage relay, the Archon AI watcher, and the Rex agent. All four threads share a shutdown_event and are monitored by a health supervisor.

Session Bridge

A standalone session-bridge.py daemon runs alongside the mesh daemon. It maintains a persistent SSE connection to the hub, and whenever a broadcast-worthy event arrives (new human message, agent output, system alert), it writes a timestamped file to ~/m3shdup/broadcasts/. Active Claude Code sessions check this directory between task polls via the broadcast-check.sh UserPromptSubmit hook. The mesh-post.sh script allows posting from any terminal session to hub chat.

Resilience

The resilient_loop wrapper restarts a crashed thread after 5 seconds of backoff, up to 10 times. After all threads exhaust their budget, the main health supervisor exits and the outer start-mesh.sh shell loop relaunches the entire daemon in 5 seconds. Individual crash recovery: 5 seconds. Full process recovery: 10 seconds. The daemon auto-starts via .zprofile on terminal open. A lockfile prevents duplicate instances.
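The restart policy above can be sketched as follows. The name resilient_loop, the 5-second backoff, and the budget of 10 restarts come from the text; the signature, the return value, and the injectable sleep are assumptions made so the sketch can be exercised quickly.

```python
import threading
import time

BACKOFF_SECONDS = 5  # from the text
MAX_RESTARTS = 10    # from the text


def resilient_loop(target, shutdown_event, backoff=BACKOFF_SECONDS,
                   max_restarts=MAX_RESTARTS, sleep=time.sleep):
    """Run target() and restart it after a crash, up to max_restarts times.

    Returns the number of crashes observed (hypothetical return value).
    """
    crashes = 0
    while not shutdown_event.is_set():
        try:
            target()
            return crashes          # clean exit, nothing to restart
        except Exception:
            crashes += 1
            if crashes > max_restarts:
                return crashes      # budget exhausted, let the supervisor act
            sleep(backoff)          # fixed backoff before the next attempt
    return crashes
```

A thread that crashes on every attempt is run 11 times in total: the initial run plus 10 restarts.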

Section 06

iMessage Bridge

The iMessage bridge reads Apple's chat.db in read-only mode and sends replies via AppleScript's osascript. It is the most unconventional component in the system.

  • Thread 1 (iMessage to Hub) polls chat.db every 3 seconds, filtering for incoming messages from the user's number. Bridge-echo loops are prevented by checking for the relay prefix regex. Image attachments undergo path restriction, symlink resolution, 20MB cap, and magic byte verification (JPEG, PNG, GIF, WebP, HEIC/HEIF). Failed POSTs are queued in a deque and retried on the next cycle.
  • Thread 2 (Hub to iMessage) polls the hub every 5 seconds for new agent messages and delivers them via osascript. The message content is always passed via argv — never interpolated into AppleScript — preventing injection. Messages over 3,000 characters are truncated at a word boundary.
  • Schema version detection. The bridge includes an iMessage schema abstraction layer that detects the chat.db schema version at startup and normalizes reads accordingly, handling schema differences across macOS versions without requiring code changes.
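The argv passing and word-boundary truncation described for Thread 2 can be sketched as below. The 3,000-character limit and the argv approach come from the text. The AppleScript body is a placeholder assumption (Messages scripting terms differ between macOS versions), and truncate_at_word and build_osascript_command are hypothetical names.

```python
MAX_IMESSAGE_CHARS = 3000  # from the text

# Placeholder script (assumption). What matters is that the handler reads
# its inputs from argv, so message text is never spliced into the source.
SEND_SCRIPT = """on run argv
    set targetHandle to item 1 of argv
    set messageText to item 2 of argv
    tell application "Messages"
        send messageText to buddy targetHandle of (service 1 whose service type is iMessage)
    end tell
end run"""


def truncate_at_word(text, limit=MAX_IMESSAGE_CHARS):
    """Cut text to at most `limit` characters, preferring a space boundary."""
    if len(text) <= limit:
        return text
    cut = text.rfind(" ", 0, limit + 1)
    if cut <= 0:
        cut = limit  # no space available: fall back to a hard cut
    return text[:cut].rstrip()


def build_osascript_command(recipient, message):
    """Build an argument list for osascript; the script text stays constant."""
    return ["osascript", "-e", SEND_SCRIPT, recipient, truncate_at_word(message)]
```

The resulting list is meant to be passed to subprocess.run without shell=True, so neither the shell nor AppleScript ever parses the message content.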
Section 07

Command Interface

The user communicates with the mesh by texting commands from an iPhone. The command parser recognizes 14 core command patterns:

Command             Mode                Description
status              status              Show agent online/offline states and active task counts
tasks               tasks               List the 10 most recent tasks with status icons
@context            context_get         Show all shared context key-value pairs
@context key=val    context_set         Set a shared context entry
@rex <text>         rex_message         Dispatch a task directly to Rex
approve             approval            Approve the most recent pending approval
reject              approval            Reject the most recent pending approval
pending             pending_approvals   List all pending approval requests
costs / @costs      costs               Show today's token usage and cost by agent
escalations         escalations         List open escalations
do <text>           task                Execute a task via Archon with full context
auto <goal>         autonomous          Decompose goal into subtasks, dispatch to Rex
summary             summary             Summarize recent task results
(anything else)     chat                Free-form conversation with Archon

Autonomous mode decomposes a high-level goal into 2–5 concrete subtasks via Claude, dispatches each to Rex in parallel, and delivers a summary with a tasks link for tracking. Voice commands via Siri Shortcuts provide an additional dispatch path — POST /api/voice/dispatch parses natural language and routes to the matching task template.

Section 08

Task Dispatch Engine

The dispatcher (app/dispatch.py) implements capacity-aware routing with circuit-breaker protection, extended with reputation-based selection and deadline urgency.

Core Algorithm
  • If assign_to is specified, verify that the named agent has spare capacity and a closed circuit breaker, and assign the task to it. Otherwise, fall through to the steps below.
  • Query for all online or busy agents that have spare capacity and advertise the required capability.
  • Filter out agents with open circuit breakers.
  • Apply reputation scoring (UCB formula across per-capability success histories).
  • Select highest-reputation agent with available capacity.
  • If no worker is available, create task with status=queued.
Circuit Breaker

Trips after 3 consecutive failures. Cooldown: 60 seconds. Half-open state on recovery: one task allowed through; success closes the circuit, failure trips it again immediately.
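The breaker states described above can be sketched as a small state machine. The threshold of 3, the 60-second cooldown, and the single half-open probe come from the text; the class, its method names, and the injectable clock are hypothetical.

```python
import time

FAILURE_THRESHOLD = 3  # from the text
COOLDOWN_SECONDS = 60  # from the text


class CircuitBreaker:
    """Closed -> open after 3 failures -> half-open after cooldown."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.consecutive_failures = 0
        self.opened_at = None        # None means the circuit is closed
        self.probe_in_flight = False

    def allow(self):
        """Return True if a task may be sent to this agent right now."""
        if self.opened_at is None:
            return True
        if self._clock() - self.opened_at < COOLDOWN_SECONDS:
            return False             # open: still cooling down
        if self.probe_in_flight:
            return False             # half-open: one probe at a time
        self.probe_in_flight = True
        return True

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None        # success closes the circuit
        self.probe_in_flight = False

    def record_failure(self):
        self.consecutive_failures += 1
        if self.probe_in_flight or self.consecutive_failures >= FAILURE_THRESHOLD:
            self.opened_at = self._clock()  # trip, or re-trip after a failed probe
        self.probe_in_flight = False
```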

Reputation Scoring

Each agent maintains per-capability success/failure history. The UCB (Upper Confidence Bound) formula balances exploitation (agents with high success rates) with exploration (agents that haven't been tried recently for a given capability). New agents begin with a neutral prior.
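The source names a UCB formula without giving its exact form. The sketch below assumes the common UCB1 shape (mean success rate plus an exploration bonus) with one pseudo-success and one pseudo-failure standing in for the "neutral prior"; the constant c and the function name ucb_score are likewise assumptions.

```python
import math


def ucb_score(successes, attempts, total_attempts, c=math.sqrt(2)):
    """UCB-style score for one agent on one capability (illustrative).

    successes/attempts: this agent's history for the capability.
    total_attempts: attempts across all candidate agents.
    """
    # Neutral prior: with no history the mean is exactly 0.5.
    mean = (successes + 1) / (attempts + 2)
    # The bonus shrinks as this agent accumulates attempts.
    bonus = c * math.sqrt(math.log(total_attempts + 1) / (attempts + 1))
    return mean + bonus
```

Under these assumptions an untried agent outranks a well-tested one once the mesh has some history, which is the exploration behavior the text describes; between agents with equal attempt counts, the higher success rate wins.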

Deadlines

Tasks created with a deadline timestamp are monitored by the 60-second sweeper. Overdue tasks have their urgency level incremented (low → medium → high → critical) and trigger ntfy escalation alerts. Priority is bumped automatically on urgency escalation.

Section 09

Safety and Control Layer

M3SHD provides five interlocking safety mechanisms: the approval queue, the escalation chain, cost tracking, RBAC, and metacognition.

  • Approvals. Any agent can create an approval request via POST /api/approvals. The user responds with approve or reject. Expired approvals are automatically cleaned up.
  • Escalations. Three consecutive task failures trigger an escalation. Rex escalates to Archon, Archon escalates to the user. Each escalation record includes agent, target, task ID, and reason.
  • Cost Tracking. Every Claude invocation logs estimated token usage. Daily limit: 500,000 tokens per agent. The budget ntfy alert fires at $1/day per agent.
  • ntfy Alerting. Four alert types: agent-down (once per outage), task-failed (on third attempt), escalation (on escalation creation), and budget ($1/day threshold). Fire-and-forget via run_in_executor — never blocks the request path.
  • Metacognition. Every task result carries a confidence score (0.0–1.0). Tasks with confidence below 0.7 are automatically submitted for verification by a second agent before the result is accepted. Operators can see confidence scores in the analytics dashboard.
Section 10

Web UI

The web UI is a dark-themed, mobile-first PWA built without any JavaScript framework. CSS custom properties handle theming; JetBrains Mono is the primary typeface; the brand gradient is amber → purple → emerald.

  • Dashboard. Agent cards showing name, machine, online/offline status, and task utilization. Status summary bar shows total online agents, active tasks, and pending approvals.
  • Chat. Full-screen chat interface with color-coded message bubbles: amber (user), purple (Archon), emerald (Rex). Real-time SSE updates. 16px input font prevents iOS Safari zoom-on-focus.
  • Tasks. Kanban-style task list. Template chips allow one-tap dispatch of seed templates. Each task card shows title, assignee, status badge, confidence score, and deadline indicator if set.
  • Logs. Live log stream fed by SSE. Timestamped and color-coded by event type.
  • 3D Mesh Visualization. WebGL force graph. Nodes colored by status (green=online, amber=busy, red=offline) and sized by capacity. Active tasks render as glowing particles traveling between nodes. The graph auto-rotates and is interactive.
  • Analytics. Admin-only tab. Charts for task throughput by status, agent, and day; cost trends; uptime and memory counts per agent.
  • Demo Mode. When enabled, GET /demo serves a public read-only dashboard. Rate limited at 30 req/min per IP.
Section 11

N0D3 Mobile Worker

N0D3 is a Flutter iOS app that turns any iPhone or iPad into a live M3SHD worker node. The second instance, N0D3-2, runs on an iPhone 12 Pro.

Architecture

The app implements the full worker contract: register, heartbeat, poll, execute, report. It calls the Claude API directly (api.anthropic.com) using the user's API key stored in iOS Keychain via flutter_secure_storage. State management via Flutter Riverpod (Notifier / AsyncNotifier). GoRouter navigation with three routes: splash, setup, and main.

Real Claude Streaming

N0D3 uses anthropic_sdk_dart for real SSE streaming via client.messages.createStream(). The onChunk callback forwards partial output to the hub in real-time as the model generates, delivering genuine token-by-token updates to the dashboard and Commander app.

  • Offline Task Queue. If the hub is unreachable when a task completes, the result is saved locally. On reconnect, a sync sweep POSTs all pending results.
  • Capabilities. research, summarize, chat, triage. Max 1 concurrent task. Text-in, text-out only.
  • Background Mode. FCM silent push (Firebase Cloud Messaging) wakes the app the moment a matching task is dispatched, reducing task start latency from minutes to seconds.
  • Lifecycle-Aware Heartbeats. 10s (WiFi, foreground), 30s (backgrounded), 60s (cellular).
  • Task Handoff. On 30 seconds of failed reconnection during active task execution: save partial output locally, reconnect, call POST /api/tasks/{id}/handoff. Hub suspends original task and creates continuation for the next available agent.
  • UI Design. Glassmorphism: frosted glass cards via BackdropFilter, gradient borders, live status indicator. All colors via MeshColors and MeshGradient in theme.dart.
Config         Value
Bundle ID      com.gritwerk.meshNode
Minimum iOS    15.0
Display Name   N0D3
Network        NSAllowsLocalNetworking: true
Android APK    49MB
Section 12

M3SHD Commander App

The M3SHD Commander is a five-tab native iOS command center (25 Dart files). It registers with maxConcurrent: 0 — the dispatcher never assigns it tasks to execute.

Tab  Path       Screen
0    /          Dashboard: agent grid, online status, utilization
1    /chat      Chat: full mesh chat, keyboard padding fixed
2    /tasks     Tasks: create tasks, template chips, view queue; FAB above tab bar
3    /logs      Logs: filtered SSE log stream
4    /settings  Settings: hub URL, token, commander name
State Providers
  • settingsProvider: NotifierProvider<SettingsNotifier, AppSettings>
  • hubTokenProvider: FutureProvider<String> (iOS Keychain)
  • hubConnectionProvider: NotifierProvider (heartbeat + SSE lifecycle)
  • agentsProvider: AsyncNotifierProvider (fetches immediately on connect)
  • messagesProvider and tasksProvider — real-time via SSE

SSE integration reconnects with exponential backoff. The hub connection provider tears down the SSE stream on device-offline and reconnects on return. All colors from MeshColors.*. Touch targets: 44px minimum throughout.

Config            Value
Bundle ID         com.gritwerk.m3shdup
Minimum iOS       15.0
Background modes  fetch, remote-notification
Network           NSAllowsLocalNetworking: true
Section 13

Task Handoff System

The task handoff system ensures no work is lost when an agent disconnects mid-task.

Endpoint. POST /api/tasks/{id}/handoff accepts optional partial_output. The endpoint: loads the original task; updates status to suspended, storing partial output in the output field (capped at 32KB, with ownership check); creates a continuation task at bumped priority with a prompt prepended by CONTINUE FROM PREVIOUS AGENT'S PARTIAL WORK:; dispatches via the standard dispatcher; and broadcasts a task_handoff SSE event.

N0D3 triggers handoff automatically on 30 seconds of failed reconnection. Desktop workers can call it explicitly when approaching token budget limits. The required_capability of the original task is preserved in the handoff task so the continuation lands on a capable agent.

Section 14

Agent Memory System

The agent memory system gives agents persistent, searchable, encrypted memory across tasks.

Storage

agent_memory table with UNIQUE(agent_id, key). Values are encrypted at rest using the hub's secret key. An FTS5 virtual table (agent_memory_fts) stores plaintext copies for full-text search.

API
  • POST /api/agents/{id}/memory — store or update a memory entry
  • GET /api/agents/{id}/memory — list all memories for an agent
  • GET /api/agents/{id}/memory/search?q=<query> — FTS5 full-text search
  • DELETE /api/agents/{id}/memory/{key} — delete a specific entry
Auto-Extract & Auto-Inject

Workers scan task output for [REMEMBER] key: value [/REMEMBER] blocks and POST them automatically. When the hub creates a task prompt, get_memory_context(agent_id, task_text) performs an FTS5 search against the task text and prepends top-K matching memories. Agents thus "remember" relevant prior facts without explicit operator configuration.
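A minimal sketch of the extraction step, assuming each block holds a single key: value pair as in the marker format above. The regular expression and the name extract_memories are illustrative, not taken from the worker source.

```python
import re

# Key: anything up to the first colon, excluding brackets so a malformed
# block cannot swallow the next one. Value: everything up to the close tag.
REMEMBER_RE = re.compile(
    r"\[REMEMBER\]\s*([^:\[\]]+?)\s*:\s*(.+?)\s*\[/REMEMBER\]",
    re.DOTALL,
)


def extract_memories(output: str):
    """Return (key, value) pairs found in a task's output."""
    return [(k.strip(), v.strip()) for k, v in REMEMBER_RE.findall(output)]
```

Each pair would then be POSTed to /api/agents/{id}/memory as described above. Values may themselves contain colons, for example URLs, because only the first colon separates key from value.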

Memory Consolidation

A "sleep" function sweeps all agent memories, merges duplicates, resolves contradictions (later entry wins unless confidence scores differ), and discovers cross-agent patterns. Results are written back as consolidated memory entries tagged with source: consolidation. FTS5 query strings are sanitized to alphanumeric words before reaching the FTS engine, preventing malformed syntax from crashing the shared database connection.
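The query sanitization mentioned above can be sketched as follows. Reducing input to alphanumeric words comes from the text; wrapping each word in double quotes is an added assumption, so that FTS5 keywords such as OR and NEAR are treated as plain search terms. The name sanitize_fts_query is hypothetical.

```python
import re


def sanitize_fts_query(raw: str) -> str:
    """Reduce free text to quoted alphanumeric terms safe to pass to MATCH."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    # Quoted strings are literals in FTS5 query syntax, never operators.
    return " ".join(f'"{w}"' for w in words)
```

An empty result means there is nothing to search for, and the caller should skip the MATCH query entirely rather than send an empty string.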

Section 15

Task Dependencies and Pipelines

Tasks can declare dependencies on other tasks, forming execution DAGs.

Schema. task_deps junction table (parent_task_id, child_task_id). Circular dependency prevention uses a recursive CTE that walks the ancestor chain before insertion. Tasks created with depends_on: [id1, id2] start in queued state regardless of dispatcher availability.
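The recursive-CTE check can be sketched against the task_deps columns named above. The function name would_create_cycle and the exact query are assumptions; the idea is that a new parent → child edge is rejected when the child already appears among the parent's ancestors.

```python
import sqlite3


def would_create_cycle(conn, parent_id, child_id):
    """True if adding the edge parent -> child would close a loop."""
    if parent_id == child_id:
        return True
    row = conn.execute(
        """
        WITH RECURSIVE ancestors(id) AS (
            SELECT parent_task_id FROM task_deps WHERE child_task_id = ?
            UNION
            SELECT d.parent_task_id
            FROM task_deps AS d
            JOIN ancestors AS a ON d.child_task_id = a.id
        )
        SELECT 1 FROM ancestors WHERE id = ? LIMIT 1
        """,
        (parent_id, child_id),
    ).fetchone()
    return row is not None
```

UNION rather than UNION ALL deduplicates visited rows, so the walk terminates even if the table already contains a loop.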

Auto-dispatch. When a task reaches done, check_and_dispatch_dependents() runs. It finds all child tasks whose parents are all in done state, injects parent output into the child prompt, and dispatches. This chains arbitrarily deep without operator involvement.

Pipelines. POST /api/pipelines accepts a list of task definitions and wires them sequentially: task N's completion dispatches task N+1 with N's output injected. Use cases: research → summarize → notify; code_write → code_review → deploy.

Section 16

Task Templates

Task templates allow one-tap dispatch of recurring workflows with variable substitution.

Schema. task_templates table with fields: id, name, description, prompt_template (with {variable} placeholders), capability, priority, created_by.

Seed templates: Research ({topic}), Summarize ({target}), QA ({target}), Code Review ({target}), Write ({topic}).
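Variable substitution for {variable} placeholders can be sketched as below. The placeholder syntax comes from the schema description; render_template and its error behavior are assumptions. A regular expression is used instead of str.format so that only simple names are expanded.

```python
import re

PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def render_template(prompt_template: str, variables: dict) -> str:
    """Fill {name} placeholders, failing loudly when a variable is missing."""
    needed = set(PLACEHOLDER_RE.findall(prompt_template))
    missing = sorted(needed - set(variables))
    if missing:
        raise ValueError("missing template variables: " + ", ".join(missing))
    # Single pass: braces inside a substituted value are not re-expanded.
    return PLACEHOLDER_RE.sub(lambda m: str(variables[m.group(1)]), prompt_template)
```

For example, a Research template with the hypothetical prompt "Research {topic} in depth." and the variable topic set to FTS5 renders as "Research FTS5 in depth."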

  • POST /api/templates — create a template
  • GET /api/templates — list all templates
  • DELETE /api/templates/{id} — remove a template
  • POST /api/templates/{id}/dispatch — dispatch with variable substitution

The Commander app shows a horizontal template chip row on the Tasks screen. Tapping a chip opens a bottom sheet for variable input, then dispatches. The voice dispatch endpoint also matches natural language to the most appropriate template.

Section 17

Plugin System

The plugin system allows extending agent capabilities with structured tools callable during task execution.

Architecture. app/plugins.py implements PluginManager with three registries: tool functions, lifecycle hooks, and capability declarations. Plugin files in the plugins/ directory expose a setup(manager) function that registers with the manager on hub startup.

Built-in Plugins
  • web_search — searches the web and returns structured results
  • file_summary — summarizes a file at a given path (hub-side, path-sanitized)
  • notify — fires an ntfy push notification from within a task
  • memory_enhance — performs an FTS5 search against agent memories and returns matches

Tool invocation. Workers scan task output for [TOOL_CALL] {"tool": "web_search", "query": "..."} [/TOOL_CALL] markers, POST to /api/plugins/{tool}/invoke, and incorporate the result into the next Claude invocation. Plugin responses strip filesystem paths from the output.
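A minimal sketch of the marker scan, assuming each marker pair wraps one JSON object with a string "tool" field as in the example above. The name extract_tool_calls and the choice to skip malformed payloads silently are assumptions.

```python
import json
import re

TOOL_CALL_RE = re.compile(
    r"\[TOOL_CALL\]\s*(\{.*?\})\s*\[/TOOL_CALL\]",
    re.DOTALL,
)


def extract_tool_calls(output: str):
    """Return the parsed JSON payload of every well-formed tool call."""
    calls = []
    for raw in TOOL_CALL_RE.findall(output):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # assumption: malformed calls are ignored
        if isinstance(payload, dict) and isinstance(payload.get("tool"), str):
            calls.append(payload)
    return calls
```

Each returned payload would then be POSTed to /api/plugins/{tool}/invoke, with the tool name taken from the "tool" field.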

Section 18

Self-Evolving Agents

Agents track their own performance and can iteratively improve their system prompts.

Performance Tracking

app/evolution.py accumulates per-agent, per-capability success/failure statistics across a configurable rolling window (default: 7 days). Each task completion updates the agent's performance record.

Prompt Optimizer

Every 6 hours, the evolution sweeper analyzes each agent's performance patterns. For agents with meaningful data, it calls Claude Haiku with the agent's current guidelines and performance summary, asking for typed amendments: add (new guidance for failure modes), reinforce (strengthen guidance that correlates with success), restrict (narrow scope of problematic patterns), or remove (delete guidance that correlates with failure).

Operator Approval

Proposed amendments appear in the hub dashboard with confidence scores. The operator reviews and applies via POST /api/agents/{id}/evolve with the amendment ID. Applied amendments are prepended to the agent's system prompt in an EVOLUTION GUIDELINES: block.

Section 19

AGI-Adjacent Intelligence Layer

The intelligence layer adds a stack of capabilities uncommon in production agent systems at this scale. These features are active on the staging deployment with the Anthropic API key in place.

Metacognition

Every task result includes a structured confidence score (0.0–1.0) computed by Claude during response generation. Tasks below 0.7 confidence are automatically submitted for verification: a second agent re-evaluates independently, and the higher-confidence result is accepted. Operators see confidence in task detail views and analytics.

Smart Model Routing

Heuristic analysis of task text selects Claude Haiku (simple/short), Sonnet (standard), or Opus (complex/critical) at dispatch time. Factors include: task length, keyword signals, capability type, and current agent load. Operators can override per-task.

Natural Language Mesh Control

POST /api/mesh/control accepts free-form natural language instructions ("move rex to code-review only", "take crucible offline for maintenance", "set archon max concurrent to 3"). Claude Haiku interprets the intent and emits a structured operation that the hub executes.

Agent Reputation Scoring

UCB (Upper Confidence Bound) formula applied to per-capability success/attempt histories. Dispatch selects the highest-UCB agent for each capability, balancing exploitation of known-good agents with exploration of underutilized ones. New agents receive a neutral prior.

Intrinsic Motivation

Idle agents proactively select from a pool of autonomous task types: audit memory for stale entries, verify recent task outputs, refresh cached research, check system health. Cooldown periods per task type prevent repeated invocations, keeping idle capacity productive.

Adversarial Self-Improvement

After a task completes, a challenger agent is given the original prompt and the primary agent's output. The challenger is instructed to find weaknesses, errors, or gaps. If the challenger produces meaningful critique, the primary agent performs a revision loop incorporating the feedback. The final output includes both the revision and the challenger's critique score.

World Model (Entity Graph)

Shared entities and entity_relations tables. Task outputs are processed by an extraction agent that identifies named entities (people, projects, organizations, tools, concepts) and relations between them. Agents can query the world model before executing tasks to ground their responses.

Debate Protocol

For high-stakes tasks (flagged requires_debate: true), two agents receive the same prompt and generate independent responses. A synthesis agent receives both responses and produces a merged output incorporating the strongest points from each position. Operators can view the debate thread in the task detail view.

Cryptographic Provenance

Every task result is signed with HMAC-SHA256: sign(prev_hash || timestamp || agent_id || task_id || output_hash). The chain starts from a genesis hash (64 zeros). Any tampered output breaks the chain, detectable by verifying the hash sequence. GET /api/tasks/{id}/provenance returns the full chain.
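The signed chain can be sketched as follows. The field order and the genesis hash come from the text. Three details are assumptions: fields are joined with "|" rather than raw concatenation so that field boundaries stay unambiguous, each link's signature serves as the next link's prev_hash, and the names sign_link and verify_chain are hypothetical.

```python
import hashlib
import hmac

GENESIS_HASH = "0" * 64  # 64 zeros, from the text


def sign_link(secret, prev_hash, timestamp, agent_id, task_id, output):
    """HMAC-SHA256 over the chained fields; returns a hex signature."""
    output_hash = hashlib.sha256(output.encode()).hexdigest()
    message = "|".join([prev_hash, timestamp, agent_id, task_id, output_hash])
    return hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()


def verify_chain(secret, links):
    """Recompute every signature in order; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for link in links:
        expected = sign_link(secret, prev_hash, link["timestamp"],
                             link["agent_id"], link["task_id"], link["output"])
        if not hmac.compare_digest(expected, link["signature"]):
            return False
        prev_hash = link["signature"]
    return True
```

Because each signature feeds into the next, altering one output invalidates that link and every link after it.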

Collective Voting

When a task result is flagged requires_vote: true, the hub polls all available agents for a structured binary vote (accept/reject) with rationale. After a configurable quorum (default: 3 votes), majority rules. Tied votes escalate to the operator. The jury roster and individual votes are stored for audit.
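The tally rule can be sketched in a few lines. The quorum default of 3, majority rule, and escalation on a tie come from the text; the function name and the result labels are hypothetical.

```python
from collections import Counter


def tally(votes, quorum=3):
    """Resolve a list of 'accept'/'reject' votes.

    Returns 'pending' below quorum, 'escalate' on a tie,
    otherwise the majority choice.
    """
    if len(votes) < quorum:
        return "pending"
    counts = Counter(votes)
    if counts["accept"] == counts["reject"]:
        return "escalate"  # tied votes go to the operator
    return "accept" if counts["accept"] > counts["reject"] else "reject"
```

With the default quorum of 3 a tie cannot occur on exactly three binary votes; it becomes possible when the quorum is configured to an even number or more agents vote.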

Benchmark Suite

POST /api/admin/benchmark runs six performance tests sequentially: task dispatch throughput, concurrent task handling, FTS5 memory search latency, SSE event fan-out, federation relay latency, and end-to-end task completion time. Results are saved to the data directory for trend analysis.

Section 20

RBAC and Multi-User Support

RBAC — 17 Permissions
Permission Group  Permissions
Tasks             tasks:read, tasks:write, tasks:stream
Agents            agents:read, agents:write, agents:admin
Messages          messages:read, messages:write
Memory            memory:read, memory:write
Admin             admin:analytics, admin:webhooks, admin:users, admin:tokens, admin:plugins, admin:federation
Voice             voice:dispatch
4 Role Presets
Role       Permissions
worker     tasks:read/write/stream, agents:read, messages:read/write, memory:read/write, voice:dispatch (10)
mobile     Same as worker minus voice:dispatch (9)
commander  All worker permissions + agents:write, admin:analytics (12)
admin      All 17 permissions
Security note: Empty permissions on an agent token means no access — not full access. This was remediated during the security audit cycle. The master token bypasses RBAC entirely and is reserved for administrative operations.
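The deny-by-default rule in the security note could be sketched as follows. The function name and argument shapes are hypothetical; the behavior shown (empty permissions grant nothing, the master token bypasses the check) follows the note above.

```python
def is_allowed(token_permissions, required, is_master=False):
    """Return True only if the token explicitly holds the required permission."""
    if is_master:
        # The master token bypasses RBAC entirely.
        return True
    if not token_permissions:
        # Empty or missing permissions mean no access, never full access.
        return False
    return required in token_permissions
```

Treating an empty list as "deny" keeps a misconfigured or freshly created token from silently gaining full access.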
Multi-User Support

users table with PBKDF2-HMAC-SHA256 password hashing (310,000 iterations). Session tokens have a 24-hour TTL. Three user roles: admin, user, viewer. Dual-path login: username + password or master token. A seed admin user is created on first run.
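A minimal sketch of PBKDF2-HMAC-SHA256 hashing at the stated iteration count, using hashlib from the standard library. The 16-byte random salt and the (salt, digest) storage shape are assumptions; the source does not describe how salts are generated or stored.

```python
import hashlib
import hmac
import os

ITERATIONS = 310_000  # iteration count stated above


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # assumption: 16-byte random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
```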

Third-Party API Keys

api_keys table with m3shd_key_ prefix. Per-key rate limits (requests/minute) and daily caps (requests/day). Usage tracking per key. Separate auth path from agent tokens. Admin CRUD via /api/admin/keys.

Section 21

Webhooks and Federation

Webhooks

webhooks table with encrypted secrets. Endpoints: create, list, delete, generic trigger (secret validation via X-Webhook-Secret header only — query params rejected), and GitHub-specific trigger (HMAC-SHA256 signature validation on X-Hub-Signature-256). Rate limiting: 10 requests/minute per webhook. Uptime Kuma auto-detected from X-Uptime-Kuma-Agent header.
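For the GitHub-specific trigger, signature validation could look like the sketch below. GitHub sends the header value as "sha256=" followed by the hex HMAC-SHA256 of the raw request body; the function name and the demo secret are illustrative only.

```python
import hashlib
import hmac


def verify_github_signature(secret, body, header_value):
    """Validate an X-Hub-Signature-256 header against the raw request body."""
    if not header_value or not header_value.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Compare as bytes, in constant time, to avoid timing side channels.
    return hmac.compare_digest(expected.encode(), header_value.encode())


secret = b"demo-secret"  # placeholder; real secrets are stored encrypted
body = b'{"action": "opened"}'
good_header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
```

The check must run over the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and break the signature.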

Cross-Mesh Federation

peer_hubs table. app/federation.py implements hub-to-hub task relay. When local capacity is exhausted for a required capability, the dispatcher queries peer hubs for availability and relays the task. The relay sweeper polls peers for completed results and applies them to the local task record. Relay hop count tracked in task metadata; max 3 hops prevents relay loops.

  • POST /api/admin/peers — register a peer hub (URL, auth token)
  • GET /api/admin/peers — list peers
  • DELETE /api/admin/peers/{id} — remove peer
  • POST /api/federation/tasks — incoming relay endpoint
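The hop limit that prevents relay loops can be sketched as a guard applied before a task is forwarded to a peer. The metadata key relay_hops and the function name are hypothetical; the limit of three hops comes from the description above.

```python
from typing import Optional

MAX_HOPS = 3  # maximum relay hops, as stated above


def prepare_relay(metadata: dict) -> Optional[dict]:
    """Return metadata for the next hop, or None if the task must stay local."""
    hops = metadata.get("relay_hops", 0)  # assumed metadata key
    if hops >= MAX_HOPS:
        return None
    return {**metadata, "relay_hops": hops + 1}
```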
Section 22

Analytics and Observability

Four analytics endpoints, all scoped by days query parameter (default: 7):

  • GET /api/admin/analytics/tasks — task counts by status, by agent, by day, by capability
  • GET /api/admin/analytics/costs — cost by agent, by day, average cost per task per agent
  • GET /api/admin/analytics/agents — uptime percentage, total tasks, memory count per agent
  • GET /api/admin/analytics/summary — combined quick-view for dashboard header
ntfy Alerting
Alert | Trigger | De-duplication
Agent down | Heartbeat timeout | Once per outage; suppressed until agent recovers
Task failed | 3rd consecutive attempt | Per task ID
Escalation | Escalation creation | Per escalation ID
Budget | $1/day threshold | Once per UTC day per agent

All alerts are dispatched through the asyncio event loop's run_in_executor (loop.run_in_executor), so they are fire-and-forget and never block the request path. The broadcast-check.sh UserPromptSubmit hook wires the M3SHD message bus into every active Claude Code session on any mesh node.
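A sketch of de-duplicated, fire-and-forget alerting under stated assumptions: the in-memory key set, the key format, and the helper names are hypothetical, and _send_blocking stands in for the real blocking ntfy HTTP call.

```python
import asyncio

_active_alerts = set()   # de-duplication keys currently suppressed
delivered = []           # stands in for the outbound ntfy HTTP call


def _send_blocking(title):
    """Placeholder for a blocking HTTP POST to an ntfy topic."""
    delivered.append(title)


async def fire_alert(dedup_key, title):
    """Schedule the alert on a worker thread unless the key is suppressed."""
    if dedup_key in _active_alerts:
        return False
    _active_alerts.add(dedup_key)
    loop = asyncio.get_running_loop()
    # Not awaited: the request handler continues while the thread sends.
    loop.run_in_executor(None, _send_blocking, title)
    return True


def clear_alert(dedup_key):
    """Lift suppression, e.g. when an agent recovers from an outage."""
    _active_alerts.discard(dedup_key)


async def demo():
    first = await fire_alert("agent_down:rex", "rex is down")
    repeat = await fire_alert("agent_down:rex", "rex is down")
    clear_alert("agent_down:rex")
    after_recovery = await fire_alert("agent_down:rex", "rex is down again")
    await asyncio.sleep(0.1)  # give the worker threads time to finish
    return first, repeat, after_recovery


results = asyncio.run(demo())
```

The second call is suppressed because its key is still active; once the key is cleared on recovery, a new outage alerts again.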

Section 23

CI/CD Pipeline

deploy.sh
./deploy.sh [--dry-run]
# 1. pytest gate — fail on any test failure
# 2. rsync source to staging VPS (excluding data/, .env)
# 3. docker compose up -d --build m3shdup
# 4. health check (/api/health) — fail and alert if unhealthy
  • Pre-commit hook. .githooks/pre-commit runs pytest before every commit. Failed tests block the commit. setup-hooks.sh installs the hook with one command.
  • GitHub Actions. .github/workflows/test.yml runs the full test suite on every push and pull request. Matrix: Python 3.12. Steps: checkout, install dependencies, run pytest with coverage report.
Section 24

Database Schema

The hub uses SQLite in WAL mode at data/m3shdup.db. The schema spans 18 tables.

Table | Purpose | Key Columns
messages | Chat history | id, ts, sender, sender_type, content, channel, reply_to
agents | Worker registry | id, name, machine, status, capabilities, max_concurrent, active_tasks
tasks | Task queue | id, title, prompt, assigned_to, status, priority, output, attempts, required_capability, deadline, urgency, confidence, provenance_hash
task_deps | Dependency graph | parent_task_id, child_task_id
task_templates | Reusable workflows | id, name, description, prompt_template, capability, priority
approvals | Permission requests | id, agent, action, description, status, decided_by, timeout
context | Shared KV store | key (PK), value, set_by
cost_log | Token usage | agent, task_id, input_tokens, output_tokens, model, cost_usd
escalations | Failure chain | from_agent, to_agent, task_id, reason, status
agent_memory | Persistent agent memory (encrypted) | agent_id, key, value, created_at, updated_at
agent_memory_fts | FTS5 full-text index | Virtual table over plaintext memory
agent_tokens | RBAC-scoped auth tokens | hash, agent_id, permissions (JSON), created_at
api_keys | Third-party access | hash, name, rate_limit, daily_cap, usage_today
users | Multi-user accounts | id, username, password_hash, role, created_at
webhooks | Inbound webhook definitions | id, name, url, secret (encrypted), event_filter
peer_hubs | Federated mesh peers | id, url, token (encrypted), status, relay_count
entities | World model nodes | id, name, type, description, source_task_id
entity_relations | World model edges | from_entity_id, to_entity_id, relation_type, weight

Status enums are CHECK-constrained. Foreign keys enabled via PRAGMA foreign_keys=ON. WAL mode configured at startup. All schema changes applied via migration guards (PRAGMA table_info check before ALTER TABLE).
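A migration guard of the kind described above might look like this sketch, shown with the standard sqlite3 module against an in-memory database. The helper name and the two-column tasks table are illustrative; table and column names are interpolated into SQL, so they must be trusted constants, never user input.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl_type):
    """Add a column only if PRAGMA table_info does not already list it."""
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys=ON")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT)")

first_run = add_column_if_missing(conn, "tasks", "provenance_hash", "TEXT")
second_run = add_column_if_missing(conn, "tasks", "provenance_hash", "TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(tasks)")]
```

Running the guard twice is harmless, which lets the same startup code serve both fresh and already-migrated databases.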

Section 25

API Reference

All endpoints except /api/health and /api/login require Authorization: Bearer <token> or m3sh_session cookie. Admin endpoints additionally require RBAC admin:* permissions.

Core Endpoints
Method | Path | Description
POST | /api/login | Authenticate, set session cookie (rate-limited: 5/5min per IP)
GET | /api/stream | SSE event stream (30s keepalive)
POST | /api/messages | Send a message (50KB max)
POST | /api/tasks | Create and dispatch task
PUT | /api/tasks/{id} | Update task status/output
POST | /api/tasks/{id}/stream | Append to task stream log
POST | /api/tasks/{id}/handoff | Suspend + create continuation
POST | /api/agents/{id}/register | Self-register worker
POST | /api/agents/{id}/heartbeat | Keep-alive ping
GET/PUT/DELETE | /api/context | Shared KV store
POST/GET/PUT | /api/approvals | Approval queue
POST/GET/PUT | /api/escalations | Escalation chain
GET | /api/health | Health check (unauthenticated)
Extended Endpoints
Method | Path | Description
POST/GET/DELETE | /api/agents/{id}/memory | Agent memory CRUD
GET | /api/agents/{id}/memory/search | FTS5 memory search
POST | /api/agents/{id}/evolve | Apply evolution amendment
POST | /api/pipelines | Create chained task pipeline
POST/GET/DELETE | /api/templates | Task template CRUD
POST | /api/templates/{id}/dispatch | Dispatch template with vars
POST | /api/plugins/{tool}/invoke | Invoke plugin tool
POST | /api/voice/dispatch | Voice/NL task dispatch
POST | /api/mesh/control | Natural language mesh control
GET | /api/tasks/{id}/provenance | Task provenance chain
GET/POST/PUT/DELETE | /api/admin/users | User management (admin)
GET/POST/DELETE | /api/admin/webhooks | Webhook CRUD (admin)
GET/POST/DELETE | /api/admin/peers | Federation peer management (admin)
POST | /api/federation/tasks | Incoming federated task
GET/POST/DELETE | /api/admin/keys | Third-party API key management (admin)
GET | /api/admin/analytics/tasks | Task analytics (admin)
GET | /api/admin/analytics/costs | Cost analytics (admin)
GET | /api/admin/analytics/agents | Agent analytics (admin)
POST | /api/admin/benchmark | Run benchmark suite (admin)
GET | /demo | Public demo dashboard

SSE events cover 30+ types: all core types plus task_handoff, memory_stored, evolution_proposed, plugin_invoked, federation_relay, vote_called, vote_result, debate_started, debate_result, world_model_updated, provenance_verified, benchmark_complete, and others.

Section 26

Security Posture

M3SHD has undergone multiple independent security audits across the full codebase, both mobile apps, and version history. All findings have been remediated and verified. The system currently holds a clean security bill of health.

Security Controls
  • All secrets via environment variables — no secrets in source code
  • RBAC on all API endpoints: 17 permissions, 4 role presets
  • Empty permission array on agent token = no access
  • Bearer token comparison via hmac.compare_digest (constant-time)
  • PBKDF2-HMAC-SHA256 for user passwords (310,000 iterations)
  • Session tokens 24-hour TTL; master-token-only for new admin token creation
  • SecurityHeadersMiddleware: CSP, HSTS (1 year + includeSubDomains), X-Frame-Options DENY, nosniff, Referrer-Policy, Permissions-Policy
  • Login rate limiting: 5 failures per 5-minute window per IP → 429
  • Input length limits: messages 50KB, prompts 8KB, osascript 3KB, handoff partial_output 32KB, stream chunks 64KB
  • Task handoff IDOR: ownership check enforced before allowing handoff
  • Task stream IDOR: ownership check enforced before accepting output injection
  • FTS5 sanitization: query strings sanitized to alphanumeric words before FTS engine
  • Plugin path disclosure: filesystem paths stripped from plugin error responses
  • Webhook secrets: X-Webhook-Secret header only; query param returns 400
  • GitHub HMAC: X-Hub-Signature-256 HMAC-SHA256 validation
  • osascript injection prevention: user content via argv, never interpolated
  • Image attachment validation: path restricted, symlinks resolved, magic bytes verified, 20MB cap
  • Agent ID sanitization: [a-z0-9-] regex enforced at registration
  • Container runs as non-root user (appuser)
  • SSE queue capped at 500 events per subscriber; query limits capped at 500
  • SSH key-only authentication on all mesh nodes
  • NSAllowsLocalNetworking: true (not NSAllowsArbitraryLoads) on iOS apps
Audit outcome: Multiple independent red team audits conducted across the full codebase, both mobile apps, and git history. All findings from all audit cycles have been remediated and independently verified. Production tokens are patched with explicit RBAC permissions.
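The FTS5 sanitization control listed above could be sketched as follows. Reducing the query to alphanumeric words follows the description; wrapping each word in double quotes is an additional assumption, made here so that words such as OR or NEAR are treated as literal terms and not as FTS5 operators.

```python
import re


def sanitize_fts_query(raw):
    """Reduce user input to quoted alphanumeric words for an FTS5 MATCH."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    # Quoting each word keeps FTS5 from parsing it as an operator.
    return " ".join(f'"{word}"' for word in words)
```

An input with no alphanumeric characters yields an empty string, which the caller should treat as "no search" and not pass to MATCH.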
Section 27

Container and Deployment

The hub runs as a single Docker container on the staging VPS at mesh.demobygrit.com.

FROM python:3.12.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN adduser --disabled-password --no-create-home appuser
COPY . .
RUN chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Container Configuration
  • Port mapping: 8333 (host) to 8000 (container)
  • Volume: ./data:/app/data (SQLite + benchmark reports + plugin output)
  • Memory limit: 256MB, CPU limit: 0.5 cores, reservation: 64MB
  • Restart policy: unless-stopped
  • Log rotation: 10MB max, 3 files retained
  • Healthcheck: urllib /api/health every 30 seconds
Backup & Push

Litestream is configured for R2 replication — requires R2 bucket creation and credentials deployed to staging. Firebase service account and FCM env vars are in place; one deploy step from activating silent push on staging. APNs key: AuthKey_W3F6PU2NCQ.p8. Firebase project: m3shd-65a5a.

Key Dependencies
  • fastapi>=0.115.0 — ASGI framework
  • uvicorn[standard]>=0.34.0 — ASGI server with uvloop
  • sse-starlette>=2.2.1 — Server-Sent Events
  • aiosqlite>=0.21.0 — Async SQLite
  • cryptography>=42.0 — Memory encryption, HMAC provenance
  • anthropic>=0.40.0 — Claude API (evolution optimizer, AGI features)
  • httpx>=0.28.0 — Async HTTP client (federation, webhooks, FCM, ntfy)
Section 28

Test Coverage

The test suite spans 20+ test files covering all subsystems.

File | Coverage Area
test_db.py | Schema init, CRUD for all tables, capacity queries, stale agent sweep
test_api.py | Core API endpoints, auth validation, input limits
test_dispatch.py | Circuit breaker, capacity routing, reputation scoring, deadline urgency
test_worker.py | Worker registration, Claude subprocess mock, task lifecycle
test_security.py | RBAC enforcement, IDOR checks, injection prevention
test_audit.py | Audit trail, provenance chain, HMAC verification
test_memory.py | Agent memory CRUD, FTS5 search, auto-extract, auto-inject, consolidation
test_deps.py | Dependency creation, circular prevention, auto-dispatch, output injection
test_templates.py | Template CRUD, variable substitution, voice dispatch
test_webhooks.py | Secret validation, HMAC, rate limiting, Uptime Kuma
test_rbac.py | Permission enforcement per endpoint, role presets, empty-perms fix
test_analytics.py | All four analytics endpoints, cost rollups, uptime calc
test_plugins.py | Plugin registration, tool invocation, lifecycle hooks, path sanitization
test_evolution.py | Performance tracking, amendment generation, operator approval
test_agi.py | Metacognition, model routing, NL control, reputation, motivation, adversarial, world model, debate, provenance, voting, benchmark
test_federation.py | Peer registration, task relay, hop limits, sweeper
test_fcm.py | FCM token registration, silent push, OAuth2 token caching
test_users.py | PBKDF2 hashing, session TTL, role-scoped access

Tests use pytest-asyncio for async support, in-memory SQLite for isolation, FastAPI TestClient for HTTP, and subprocess mocks for Claude CLI. Test fixture isolation (_db = None resets) prevents cross-contamination across files.

Section 29

Node Inventory

Node | Machine | Role | Max Concurrent | Status
Archon | Mac Mini M2 (Apple M2 8-core) | Primary AI, full tools, iMessage bridge | 5 | Online
Rex | Mac Mini Intel (i5) | Research, code, file ops | 2 | Online
Crucible | iMac 27" (Intel i5) | Overflow compute, research | 3 | Online
N0D3-1 | iPhone 14 Pro Max | Mobile AI worker | 1 | Online
N0D3-2 | iPhone 12 Pro | Second mobile AI worker | 1 | Online
Fred Commander | M3SHD Commander App | Dispatch and monitor | 0 | Online
RBAC Token Assignments
  • archon: admin role (17 permissions)
  • rex, crucible: worker role (10 permissions)
  • n0d3-1, n0d3-2: mobile role (9 permissions)
  • fred-commander: commander role (12 permissions)
  • Third-party integrations: API key auth with per-key caps

All desktop nodes connect via Tailscale. Mobile nodes connect via the public HTTPS endpoint. SSH key authentication is enforced on all desktop nodes. Crucible runs macOS Monterey 12.7.6, so it uses Terminal.app (Ghostty is incompatible below Ventura) and has Node.js installed from a tarball (Homebrew is broken on Monterey). SSL was fixed via certifi and the SSL_CERT_FILE environment variable.

Section 30

Technology Stack

Layer | Technology | Version | Purpose
Hub Framework | FastAPI | 0.115+ | ASGI framework, API routing
ASGI Server | Uvicorn | 0.34+ | Production server with uvloop
Database | SQLite (WAL) | 3.x | Persistence (18 tables)
Full-Text Search | SQLite FTS5 | 3.x | Agent memory search
Async DB Driver | aiosqlite | 0.21+ | Non-blocking SQLite
SSE | sse-starlette | 2.2+ | Server-Sent Events (30+ event types)
Cryptography | cryptography | 42.0+ | Memory encryption, HMAC provenance
AI Runtime (desktop) | Claude CLI | Latest | Subprocess LLM inference
AI Runtime (mobile) | anthropic_sdk_dart | 1.4.2 | Real SSE streaming on mobile
AI Runtime (hub) | Anthropic Python SDK | 0.40+ | Evolution optimizer, AGI features
Reverse Proxy | Caddy | 2.x | TLS termination, HTTP/2
Container Runtime | Docker | Latest | Hub deployment on VPS
Mesh VPN | Tailscale | Latest | Encrypted node-to-node tunnels
Backup | Litestream | Latest | SQLite WAL replication to R2
Push Notifications | Firebase FCM | HTTP v1 | Silent push wakeup for mobile
MCP Server | anthropic_mcp | Latest | 17 tools for Claude Desktop/Code
Mobile Framework | Flutter | 3.41 | iOS + Android apps
State Management | flutter_riverpod | 2.6.1 | Notifier/AsyncNotifier
Mobile Routing | go_router | 14.8.1 | Type-safe navigation
Secure Storage | flutter_secure_storage | 9.2.4 | iOS Keychain, Android Keystore
Bridge (read) | sqlite3 (stdlib) | 3.x | chat.db read-only
Bridge (write) | osascript | macOS | iMessage send via AppleScript
Language (backend) | Python | 3.12 | All backend components
Section 31

Architecture Diagram

┌────────────────────────────────────────────────────────────────────┐
│                             INTERNET                                │
│                                                                     │
│   Fred's iPhone / Browser                                           │
│   ├── iMessage ──────────────────► chat.db (M2 Mac)                │
│   ├── M3SHD Commander app ───────────────────────────────┐         │
│   ├── N0D3 app ──────────────────────────────────────────┤         │
│   ├── Siri Shortcut ─── voice/dispatch ──────────────────┤         │
│   └── MCP (Claude Desktop/Code) 17 tools ────────────────┤         │
│                │ HTTPS                                    │         │
│                ▼                                          │         │
│   ┌──────────────────────────────────────────┐           │         │
│   │        mesh.demobygrit.com               │           │         │
│   │        Caddy (TLS + H2)                  │ ◄─────────┘         │
│   │              ▼                           │                     │
│   │   ┌────────────────────────────────┐     │                     │
│   │   │         M3SHD Hub              │     │                     │
│   │   │  FastAPI + SSE + SQLite WAL    │     │                     │
│   │   │  18 tables, FTS5, WAL mode     │     │                     │
│   │   │  RBAC (17 perms, 4 presets)    │     │                     │
│   │   │  Agent Memory (FTS5, enc.)     │     │                     │
│   │   │  Task Deps + Pipelines         │     │                     │
│   │   │  Plugin System (4 built-in)    │     │                     │
│   │   │  Self-Evolving Agents          │     │                     │
│   │   │  AGI Intelligence Layer        │     │                     │
│   │   │  Webhooks + Federation         │     │                     │
│   │   │  MCP Server (17 tools)         │     │                     │
│   │   │  3D WebGL Visualization        │     │                     │
│   │   └────────────────────────────────┘     │                     │
│   │         Staging VPS (Hetzner)            │                     │
│   └──────────────────────────────────────────┘                     │
│                   │                                                 │
└───────────────────┼─────────────────────────────────────────────────┘
                    │
          ── Tailscale WireGuard VPN ──
                    │
        ┌───────────┼──────────┐
        │           │          │
┌───────┴──────┐ ┌──┴───┐ ┌───┴──────┐
│ M2 Mac Mini  │ │ Rex  │ │ Crucible │
│ (Archon)     │ │Intel │ │ iMac 27" │
│              │ │      │ │          │
│ mesh-daemon  │ │ mesh-│ │ mesh-    │
│ T1: iMsg→Hub │ │worker│ │ worker   │
│ T2: Hub→iMsg │ │      │ │          │
│ T3: Archon   │ │claude│ │ claude   │
│ T4: Rex      │ │print │ │ print    │
│ session-brdg │ │2 con.│ │ 3 con.   │
└──────────────┘ └──────┘ └──────────┘

N0D3-1 (iPhone 14 Pro Max)    N0D3-2 (iPhone 12 Pro)
Flutter + anthropic_sdk_dart   Flutter + anthropic_sdk_dart
Real SSE streaming             Real SSE streaming
Offline queue + FCM wakeup     Offline queue + FCM wakeup
1 concurrent, handoff on disc  1 concurrent, handoff on disc
          │ HTTPS                        │ HTTPS
          └─────────────────────────────┘
                        ▼
                   M3SHD Hub
Data Flow: Siri Task Dispatch
1. "Hey Siri, research StoreKit 2 pricing models"
2. Siri Shortcut sends POST /api/voice/dispatch with NL text
3. Hub parses intent → matches "Research {topic}" template
4. Template rendered with topic="StoreKit 2 pricing models"
5. Dispatcher selects Rex (highest UCB reputation for research)
6. Rex picks up task via poll, invokes Claude CLI
7. Claude returns research (FTS5 memory injected into prompt)
8. Rex scans output for [REMEMBER] blocks → POSTs to agent memory
9. Rex POSTs cost log + task done
10. ntfy fires push: "Research task complete on rex"
11. M3SHD Commander app receives SSE task_updated, shows done badge
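Steps 3 and 4 of the flow above (intent matching and template rendering) could be sketched as below. The real intent parser is not described in the source; this version simply turns a pattern such as "Research {topic}" into a regular expression. The function name, the template dictionary layout, and the prompt wording are hypothetical.

```python
import re
from typing import Optional


def match_template(pattern: str, text: str) -> Optional[dict]:
    """Match NL text against a pattern like 'Research {topic}'.

    Returns the captured variables, or None when the text does not fit.
    """
    # re.split with a capture group alternates literal text and variable names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        regex += f"(?P<{part}>.+)" if index % 2 else re.escape(part)
    match = re.fullmatch(regex, text.strip(), flags=re.IGNORECASE)
    return match.groupdict() if match else None


template = {
    "pattern": "Research {topic}",
    "prompt_template": "Research the following topic in depth: {topic}",
}
variables = match_template(template["pattern"],
                           "research StoreKit 2 pricing models")
prompt = template["prompt_template"].format(**variables)
```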
Data Flow: Pipeline Execution
1. POST /api/pipelines with 3 task definitions
2. Hub creates task A (research), B (summarize, depends_on=[A]),
   C (notify, depends_on=[B])
3. Only task A is dispatched (B and C status=queued, deps unmet)
4. Rex completes task A → status=done
5. check_and_dispatch_dependents() fires
6. Task B's parent (A) is done → dispatch B with A's output injected
7. N0D3-1 picks up B, summarizes via Claude API
8. Task B completes → dispatch C
9. Archon picks up C → notify plugin fires ntfy push to Fred
10. Fred receives summary push notification
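The dependency step in the flow above can be sketched with plain dictionaries. This is not the hub's check_and_dispatch_dependents() implementation, only an illustration of the rule it follows: a queued child is dispatched once every parent is done, with the parents' output injected into its prompt. The "dispatched" status label and the "Upstream output" prompt format are assumptions.

```python
def check_and_dispatch_dependents(tasks, deps, completed_id):
    """Dispatch queued children of completed_id whose parents are all done.

    tasks: task id -> {"status", "prompt", "output"}
    deps:  list of (parent_task_id, child_task_id) pairs, as in task_deps
    """
    dispatched = []
    for parent, child in deps:
        if parent != completed_id or tasks[child]["status"] != "queued":
            continue
        parents = [p for p, c in deps if c == child]
        if all(tasks[p]["status"] == "done" for p in parents):
            upstream = "\n".join(tasks[p]["output"] for p in parents)
            tasks[child]["prompt"] += "\n\nUpstream output:\n" + upstream
            tasks[child]["status"] = "dispatched"  # assumed status label
            dispatched.append(child)
    return dispatched


tasks = {
    "A": {"status": "done", "prompt": "Research X", "output": "findings about X"},
    "B": {"status": "queued", "prompt": "Summarize", "output": ""},
    "C": {"status": "queued", "prompt": "Notify", "output": ""},
}
deps = [("A", "B"), ("B", "C")]
first_wave = check_and_dispatch_dependents(tasks, deps, "A")
```

Task C stays queued after the first call because its parent B has not finished; it is released only when the function runs again for B.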
Section 32

Version History

M3SHD was developed across eight iterative phases, each building on a stable, tested foundation before proceeding.

v0.1 — Foundation

Seven-table schema, capacity-aware dispatch with circuit breaker, self-registering workers, generic mesh-worker.py, hardened iMessage bridge with retry queues. Command router with 14 commands. Context-aware rich prompts (chat vs task mode). Rex deployed as mesh worker on Intel Mac Mini. Approval queue, cost tracking with daily limits, escalation chain. Four-tab PWA. Autonomous mode (decompose + dispatch + monitor).

v0.5 — Mobile + Security Hardening

N0D3 mobile worker app (17 Dart files) — full worker contract, Claude Haiku direct, glassmorphism UI, background mode, task handoff on disconnect. M3SHD Commander app (25 Dart files) — five-tab command center. Full red team pentest: all critical findings resolved. SecurityHeadersMiddleware, login rate limiting, git history scrubbed, NSAllowsLocalNetworking enforced. Task handoff system. Text routing broadcast system. Hub stability fixes.

v0.8 — Intelligence Platform

Agent Memory (FTS5, encrypted, auto-extract, auto-inject). Task Dependencies + Pipelines. Task Templates (5 seeds, Commander chips). ntfy Alerting. Live Session Integration (SSE bridge, Claude Code hooks). Real Claude Streaming via anthropic_sdk_dart. Plugin System (4 built-in tools). Self-Evolving Agents (Haiku optimizer, operator approval). CI/CD Pipeline (deploy.sh, pre-commit, GitHub Actions). RBAC (17 permissions, 4 presets). Analytics Dashboard. Webhooks (GitHub HMAC, Uptime Kuma). Multi-User Support. Offline Task Queue.

v1.0 — AGI-Adjacent Layer

MCP Server (17 tools). Voice Command (Siri Shortcuts). 3D WebGL Mesh Visualization. Third-Party API Keys. Demo Mode. Cross-Mesh Federation (hub-to-hub relay, overflow, 3 hops). FCM Push Notifications (Firebase). Android Build Support. Second mobile node (iPhone 12 Pro). AGI-Adjacent Layer: Metacognition, Smart Model Routing, NL Mesh Control, Deadlines + Auto-Escalation, Reputation Scoring (UCB), Intrinsic Motivation, Adversarial Self-Improvement, World Model, Debate Protocol, Cryptographic Provenance, Memory Consolidation, Collective Voting, Benchmark Suite. Final security audit: all findings remediated. Production tokens patched with RBAC roles.

Section 33

Future Roadmap

  • FCM Staging Activation. Firebase service account and FCM env vars need to be deployed to the staging server. Silent push notifications are built and tested locally — one deploy step from activation.
  • App Store Submission. Both N0D3 and M3SHD Commander are signed and buildable. Required before submission: privacy manifests, App Privacy details in App Store Connect, and review of NSAllowsLocalNetworking justification.
  • Android Background Execution. WorkManager for background task execution on Android. The FCM integration is in place; WorkManager is the remaining piece for full background operation.
  • Litestream R2 Backup. Configuration is in place. Requires R2 bucket creation and credentials deployed to staging for point-in-time recovery.
  • Audit Log Table. An audit_log table recording every mutating operation with timestamp, action, actor, and payload hash. High value for forensic analysis now that all tokens are distinct per-agent.
  • Caddy-Level Rate Limiting. The current login rate limiter is in-process and resets on hub restart. Caddy-level rate limiting would persist across restarts and handle pre-TLS throttling.
  • Session Cookie Randomness. The session cookie is currently a deterministic SHA-256 of the auth token. Per-session randomness (HMAC of token + random nonce stored server-side) would prevent offline brute-force if logs are compromised.
  • Additional Nodes. Any machine with Python 3.12+ and Claude CLI can join by running mesh-worker.py. Any iPhone or iPad can join by installing N0D3. The federation system allows multiple independent M3SHD hubs to relay tasks across organizations.
  • Natural Language Monitoring. Extend the NL control endpoint to support monitoring queries: "how many tasks has rex completed this week", "which agent has the highest failure rate", "show me all tasks that took longer than 10 minutes".
Section 34

Changelog

Version | Changes
v0.1 | Initial release. Seven-table schema. Four-tab PWA. iMessage bridge. Capacity-aware dispatch with circuit breaker. Autonomous mode. Approval queue. Cost tracking. Escalation chain. Two independent security audits completed.
v0.5 | N0D3 mobile worker app (17 Dart files). M3SHD Commander app (25 Dart files). Glassmorphism UI. Comprehensive security hardening: all critical findings resolved, git history scrubbed. SecurityHeadersMiddleware, login rate limiting, NSAllowsLocalNetworking enforced. Task handoff system. Text routing broadcast system. Hub stability fixes.
v0.8 | Agent Memory (FTS5, encrypted, auto-inject, auto-extract). Task Dependencies + Pipelines. Task Templates (5 seeds). ntfy Alerting. Live Session Integration (SSE bridge, hooks). Real Claude Streaming (anthropic_sdk_dart). Plugin System (4 built-in tools). Self-Evolving Agents. CI/CD Pipeline. Analytics Dashboard. Webhooks (GitHub HMAC, Uptime Kuma). RBAC (17 permissions, 4 presets). Multi-User Support. Offline Task Queue (N0D3). Test fixture isolation across all test files.
v1.0 | MCP Server (17 tools). Voice Command (Siri Shortcuts). 3D WebGL Mesh Visualization. Third-Party API Keys. Demo Mode. Cross-Mesh Federation (hub-to-hub relay, overflow, 3 hops). FCM Push Notifications (Firebase). Android Build Support. Second mobile node (iPhone 12 Pro). AGI-Adjacent Layer: Metacognition, Smart Model Routing, NL Mesh Control, Deadlines + Auto-Escalation, Reputation Scoring (UCB), Intrinsic Motivation, Adversarial Self-Improvement, World Model, Debate Protocol, Cryptographic Provenance, Memory Consolidation, Collective Voting, Benchmark Suite. Final security audit: all findings remediated. Production tokens patched with RBAC roles.