TAO Principle [1.0]: "Foundation architecture precedes surface features. Map backend layers, data relationships, stack choices, and hardening before building UI." — This document IS the backend map. Every spec here must reach SOLVED or PARTIAL (with documented workaround) before the first commission. No exceptions.
What "sellable" means: A client can install the base template on their hardware, ingest their first documents, query their knowledge bank, receive useful responses, and trust that their data is sovereign — all without any of the three founders present. The system works alone. That is the bar.
Layer 1
Schema & Data Architecture
The foundation of everything. If the schema is wrong, every codex, every collective emission, every synthesis is broken. These specs must be locked before any other layer ships.
SPEC-001
Universal Node Schema
SOLVED
Problem
All nodes across all builds must use identical field definitions or codex interoperability breaks.
Solution
NODE_SCHEMA v3: id (UUID), vector (1024-dim float32), text, title, node_type, source_id, framework_id, confidence_score, gravity_score, tags (csv), mechanism, situation, when_not, collection, date_added. Validated in production with 6,797 nodes.
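As a sanity sketch, NODE_SCHEMA v3 can be expressed as a Python dataclass with a validator. Field names follow the spec; the types and the `validate` helper are illustrative assumptions, not the production implementation.

```python
import uuid
from dataclasses import dataclass

@dataclass
class Node:
    id: str                  # UUID string
    vector: list             # 1024-dim float32 in production (BGE-M3)
    text: str
    title: str
    node_type: str           # e.g. 'principle', 'error', 'dream_insight'
    source_id: str
    framework_id: str
    confidence_score: float
    gravity_score: float
    tags: str                # csv, per spec
    mechanism: str
    situation: str
    when_not: str
    collection: str
    date_added: str

def validate(node: Node) -> list:
    """Return a list of schema violations (empty list means valid)."""
    problems = []
    try:
        uuid.UUID(node.id)
    except ValueError:
        problems.append("id is not a UUID")
    if len(node.vector) != 1024:
        problems.append(f"vector is {len(node.vector)}-dim, expected 1024")
    if not 0.0 <= node.confidence_score <= 1.0:
        problems.append("confidence_score outside [0, 1]")
    return problems
```

A dimension check at ingestion time is what makes the SPEC-002 migration enforceable: a 768-dim node simply fails validation.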
SPEC-002
Embedding Model Standardization
PARTIAL
Problem
Q uses BGE-M3 (1024-dim). Rob uses nomic-embed-text (768-dim). Different dimensions cannot be merged. Codex packs must work across all builds.
Solution
Standardize on BGE-M3 (1024-dim) as specified in whitepaper. Open: BAAI/bge-m3 (1.3GB) or bge-m3-v2 if released. Must run on CPU (no GPU dependency for client builds).
Reconcile
Rob must migrate 16,717 holons from 768 → 1024-dim. Options: (a) re-embed all holons with BGE-M3 (batch job, ~4h on CPU), (b) maintain dual-embed with translation layer (complex, fragile — not recommended), (c) Rob's system stays independent until codex exchange begins. Recommend (a).
Research
Benchmark BGE-M3 vs nomic-embed-text on our actual data. Is retrieval quality materially different? If nomic is significantly better for Rob's use case, we need to justify the switch with data, not preference.
Owner
ROB (migration) Q (benchmark)
SPEC-003
Gravity Score Formula
OPEN
Problem
Will identified that neither confidence alone (funnel-scoped) nor error count alone (failure-scoped) is sufficient. Need a unified "weight of evidence" that integrates multiple signals.
Proposed
gravity = w1*confidence + w2*norm(validation_count) + w3*consistency + w4*(1 - norm(error_count)) + w5*recency
Where consistency = average confidence of semantically adjacent nodes (cosine ≥ 0.85). Recency = decay function on time since last validation.
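The proposed formula as a minimal sketch, using the equal starting weights (0.2 each) suggested for initial calibration. All inputs are assumed pre-normalized to [0, 1]; the weight values remain an open calibration question.

```python
def gravity(confidence, validation_norm, consistency, error_norm, recency,
            weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Proposed gravity formula. Inputs assumed normalized to [0, 1];
    error_norm is norm(error_count), so it is inverted below."""
    w1, w2, w3, w4, w5 = weights
    return (w1 * confidence
            + w2 * validation_norm
            + w3 * consistency
            + w4 * (1.0 - error_norm)
            + w5 * recency)
```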
Reconcile
Weights (w1–w5) must be calibrated empirically. Start with equal weights (0.2 each), then tune based on which ranking produces better retrieval quality. Need a ground-truth eval set — 50 queries where we know the "right" top-5 results. Q builds eval set from existing TAO, all three score it.
Research
How does gravity interact with the synthesis pipeline? Should the brief assembler cluster by gravity instead of confidence? Does gravity replace confidence in codex metadata or supplement it?
Owner
WILL (spec) Q (implementation)
SPEC-004
Confidence History / Belief Versioning
OPEN
Problem
No audit trail for how beliefs evolved. A principle at 0.4 that was once 0.9 is fundamentally different from one that was always 0.4 — but the system can't distinguish them.
Proposed
SQLite table: confidence_history (principle_id, old_score, new_score, timestamp, trigger [ingestion|performance|manual|hardening|codex], context TEXT). Principles dropping >0.3 from peak flagged for review.
Reconcile
Storage cost: at 6,797 nodes with avg 5 changes/year = 33,985 rows/year. Negligible. But should this be in LanceDB (vector-searchable) or SQLite (relational query)? Recommend SQLite — confidence history is queried by principle_id, not by similarity.
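A minimal SQLite sketch of the proposed table, including the ">0.3 drop from peak" review flag. Column names follow the spec; the helper functions are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE confidence_history (
        principle_id TEXT,
        old_score    REAL,
        new_score    REAL,
        timestamp    TEXT,
        "trigger"    TEXT CHECK ("trigger" IN
            ('ingestion','performance','manual','hardening','codex')),
        context      TEXT
    )""")

def record_change(principle_id, old, new, trigger, context=""):
    db.execute(
        "INSERT INTO confidence_history VALUES (?,?,?,datetime('now'),?,?)",
        (principle_id, old, new, trigger, context))

def flagged_for_review(threshold=0.3):
    """Principles whose latest score sits more than `threshold` below peak."""
    history = {}
    for pid, new in db.execute(
            "SELECT principle_id, new_score FROM confidence_history ORDER BY rowid"):
        peak, _ = history.get(pid, (new, new))
        history[pid] = (max(peak, new), new)   # (peak, latest)
    return [pid for pid, (peak, latest) in history.items()
            if peak - latest > threshold]
```

Note `trigger` is quoted because it is an SQL keyword. This is the distinction the Problem statement cares about: a 0.9 → 0.4 principle gets flagged, an always-0.4 principle does not.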
Owner
WILL (spec) Q (implementation)
SPEC-005
Error Bank Schema
PARTIAL
Problem
Failures are not first-class knowledge. When a principle informs an action that fails, the failure context is lost.
Solution
Rob's errors.lance collection exists in GHOSTNET. Adapt to NODE_SCHEMA: error_type (prediction_failure | action_failure | ingestion_error | synthesis_error), related_principle_id, context, outcome, severity. Embeds on the error description for similarity search ("have we seen this kind of failure before?").
Reconcile
Does the error bank use NODE_SCHEMA (same as principles) or a separate ERROR_SCHEMA? Separate is cleaner but breaks codex portability. Recommend NODE_SCHEMA with node_type='error' and extra fields in tags/mechanism.
SPEC-006
Edge Threshold Calibration
PARTIAL
Problem
0.85 cosine threshold validated on Q's TAO (marketing/psychology domain). May not generalize to health, finance, or technical domains where semantic similarity patterns differ.
Solution
Current: fixed 0.85 across all collections. Proposed: per-domain threshold calibrated during hardening. Health principles may need 0.82 (more varied language). Finance may need 0.88 (more precise language).
Research
Test edge quality at 0.80, 0.85, 0.90 on Rob's security/trauma domain data and Will's political/financial domain data. Compare precision (are the edges real?) and recall (are we missing valid edges?).
Layer 2
Agent Core
The reasoning engine that sits on top of the knowledge bank. Must work with any local model, maintain identity across sessions, and be configurable per client.
SPEC-007
Agent Loop Architecture
PARTIAL
Problem
Q's agent_loop is tightly coupled to VOHU MANAH's specific tools and context assembly. Rob's daemon is 25K lines and tightly coupled to GHOSTNET's holonic structure. Neither is portable as-is.
Solution
Extract a clean agent loop: message → context assembly (SPINE + RAM + retrieved knowledge) → LLM call → tool execution → response → state update. Tools registered via config, not hardcoded. Context assembly is pluggable per domain module.
Reconcile
Q's tool-use loop (core/tools.py, 1,150 lines) works with Anthropic's tool-use API. Rob's works with Ollama function calling. The base template must support both — abstract the tool-call interface so the agent loop doesn't care which LLM provider is behind it.
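An illustrative skeleton of the extracted loop. The LLM call, context assembler, and tool registry are injected, so the loop is indifferent to whether Anthropic tool-use or Ollama function calling sits behind it. All names and the response shape are hypothetical.

```python
def agent_loop(message, assemble_context, llm_call, tools, update_state):
    """message -> context assembly -> LLM call -> tool execution -> response
    -> state update. `tools` is a config-registered dict, not hardcoded."""
    context = assemble_context(message)        # SPINE + RAM + retrieved knowledge
    response = llm_call(context + [message], tools=list(tools))
    while response.get("tool_call"):           # provider-agnostic tool-call shape
        name, args = response["tool_call"]
        result = tools[name](**args)           # execute a registered tool
        response = llm_call(context + [message, {"tool_result": result}],
                            tools=list(tools))
    update_state(message, response)            # RAM / activity-log update
    return response["text"]
```

The point of the sketch: only the `llm_call` adapter knows the provider's wire format; the loop never does.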
Owner
Q (architecture) ROB (Ollama compatibility)
SPEC-008
SPINE / RAM / Beliefs Hierarchy
PARTIAL
Problem
Three identity layers exist (Q's SPINE+RAM, Rob's beliefs.json). How do they interact? What's the priority when they conflict?
Solution
SPINE (immutable identity, manual edit only) > Beliefs (core axioms, deliberate review to change) > RAM (rolling state, auto-updated). SPINE overrides beliefs. Beliefs override RAM. Conflicts resolved by hierarchy.
Reconcile
Format: SPINE stays as markdown (human-readable, easy to edit). Beliefs as JSON (machine-readable, schema-enforced). RAM as markdown (free-form, auto-truncated at ~80 lines). The agent context assembler loads all three in priority order.
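A sketch of the priority-ordered assembly described above. File names match the Reconcile note; the return shape and the directory layout are assumptions.

```python
import json
import pathlib

def assemble_identity(memory_dir):
    """Load the three identity layers in priority order:
    SPINE overrides beliefs, beliefs override RAM."""
    memory = pathlib.Path(memory_dir)
    spine = (memory / "SPINE.md").read_text()          # immutable, manual edit only
    beliefs = json.loads((memory / "beliefs.json").read_text())
    ram_lines = (memory / "RAM.md").read_text().splitlines()[:80]  # auto-truncated
    return [
        {"role": "system", "layer": "spine",   "content": spine},
        {"role": "system", "layer": "beliefs", "content": json.dumps(beliefs)},
        {"role": "system", "layer": "ram",     "content": "\n".join(ram_lines)},
    ]
```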
SPEC-009
Model Abstraction Layer
OPEN
Problem
The base template must work with any local model (Ollama, llama.cpp, vLLM) and optionally cloud models (Anthropic, OpenAI). Q's llm.py is Anthropic + Ollama specific. Clients may use different providers.
Proposed
Unified interface: call(messages, model_key, tools=None) → response. Provider selected by config.yaml. Model registry maps model_key → provider + model_id + context_window + cost. Cascade logic optional per deployment.
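A sketch of the proposed interface. The registry entry and provider functions are hypothetical stand-ins; in the real system both would be populated from config.yaml.

```python
MODEL_REGISTRY = {
    # model_key -> provider + model_id + context_window + cost (illustrative)
    "local-fast": {"provider": "ollama", "model_id": "qwen2.5:7b",
                   "context_window": 32768, "cost_per_mtok": 0.0},
}
PROVIDERS = {}  # provider name -> callable(messages, model_id, tools)

def register_provider(name, fn):
    PROVIDERS[name] = fn

def call(messages, model_key, tools=None):
    """Unified entry point: the caller never names a provider directly."""
    entry = MODEL_REGISTRY[model_key]
    return PROVIDERS[entry["provider"]](messages, entry["model_id"], tools)
```

Swapping LiteLLM in later would mean replacing the `PROVIDERS` dict with a single adapter, leaving callers untouched.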
Research
OpenAI-compatible API (which Ollama supports) as the universal interface? Or custom abstraction? LiteLLM as a dependency vs. rolling our own? LiteLLM adds a dependency but supports 100+ providers out of the box.
Owner
Q (architecture) ROB (local model testing)
SPEC-010
Inter-Agent Message Bus
OPEN
Problem
Will identified: neither system has agent-to-agent communication. VOHU agents talk via shared DB. GHOSTNET swarm talks via supervisor. No direct bus.
Proposed
Lightweight message bus: named channels per domain, priority levels, structured messages {from, to, channel, priority, payload, timestamp, requires_response}. Implementation: SQLite table (simple, no external deps) or Redis (fast, adds dependency).
Reconcile
Recommend SQLite for MVP. No external dependency. Polling-based (agent checks its inbox on each loop iteration). Upgrade to Redis/NATS later if latency matters. For the base model sold to clients, simplicity > performance.
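A minimal sketch of the recommended SQLite bus: one table, polling-based delivery, highest priority first. Field names follow the proposed message structure (`from` is stored as `sender` because it is a Python keyword); the schema itself is an assumption.

```python
import json
import sqlite3
import time

bus = sqlite3.connect(":memory:")
bus.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        sender TEXT, recipient TEXT, channel TEXT,
        priority INTEGER, payload TEXT,
        timestamp REAL, requires_response INTEGER,
        delivered INTEGER DEFAULT 0)""")

def send(sender, recipient, channel, payload, priority=5, requires_response=False):
    bus.execute(
        "INSERT INTO messages (sender, recipient, channel, priority, payload,"
        " timestamp, requires_response) VALUES (?,?,?,?,?,?,?)",
        (sender, recipient, channel, priority, json.dumps(payload),
         time.time(), int(requires_response)))

def poll_inbox(agent_id):
    """Called once per agent-loop iteration: drain undelivered messages,
    highest priority first."""
    rows = bus.execute(
        "SELECT id, sender, channel, payload FROM messages"
        " WHERE recipient = ? AND delivered = 0 ORDER BY priority DESC, id",
        (agent_id,)).fetchall()
    bus.executemany("UPDATE messages SET delivered = 1 WHERE id = ?",
                    [(r[0],) for r in rows])
    return [{"from": r[1], "channel": r[2], "payload": json.loads(r[3])}
            for r in rows]
```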
Owner
WILL (spec) Q (implementation)
SPEC-011
Foundational Pact (Machine-Readable Laws)
PARTIAL
Problem
The immutable laws (from founding exercises) must be loaded into every agent's context. Not as a suggestion — as a hard constraint that the system cannot violate.
Solution
Rob's foundational_pact.txt loaded as system-level context before SPINE. Format: structured YAML with law_id, text, enforcement_type (hard_block | soft_warning | audit_log). Hard blocks prevent the action. Soft warnings flag but allow. Audit logs record for review.
Reconcile
The three founding laws are clear (for mankind, sovereign data, no weaponization). But how do you enforce them in an LLM-based system? LLMs don't have hard constraints — they have probabilistic compliance. The pact must be: (1) in system prompt (probabilistic), (2) in output validator (deterministic — scan responses for violations), (3) in tool permissions (structural — certain tools simply don't exist).
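A sketch of enforcement layers (2) and (3): a deterministic output validator plus structural tool permissions. The law entries mirror the proposed YAML fields, but the pattern matching is a toy placeholder for whatever violation detector the spec lands on.

```python
PACT = [
    # law_id / text / enforcement_type follow the proposed format;
    # `pattern` is an illustrative stand-in for a real violation detector.
    {"law_id": "L2", "text": "sovereign data",
     "enforcement_type": "hard_block", "pattern": "upload_knowledge_bank"},
]

def validate_output(text):
    """Layer (2), deterministic: hard blocks raise, soft warnings are returned."""
    warnings = []
    for law in PACT:
        if law["pattern"] in text:
            if law["enforcement_type"] == "hard_block":
                raise PermissionError(f"pact violation: {law['law_id']}")
            warnings.append(law["law_id"])
    return warnings

def build_tool_registry(all_tools, pact=PACT):
    """Layer (3), structural: hard-blocked tools simply do not exist."""
    banned = {law["pattern"] for law in pact
              if law["enforcement_type"] == "hard_block"}
    return {name: fn for name, fn in all_tools.items() if name not in banned}
```

Layer (1), the system prompt, stays probabilistic by nature; (2) and (3) are where compliance becomes checkable.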
Owner
ROB (format) ALL (content)
Layer 3
Ingestion & Knowledge Processing
How raw input becomes structured knowledge. The pipeline must handle text, audio, and (v3) images — all producing NODE_SCHEMA output.
SPEC-012
Ingestion Pipeline Portability
PARTIAL
Problem
Q's pipeline (knowledge/ingest/) is 12 files tightly integrated with VOHU MANAH's specific collections and model assignments. Not portable as-is.
Solution
Extract core pipeline: source → chunk → extract → embed → store. Config-driven: extraction model, chunk size, target collection all in config.yaml. No hardcoded references to specific collections or model keys.
SPEC-013
Multi-Modal Input Pipeline
RESEARCH
Problem
Will identified: neither system treats images, screenshots, or UI data as first-class inputs. Text and audio only.
Proposed
Image → vision model (local: LLaVA, MiniCPM-V; cloud: Claude vision) → text description → standard text pipeline. Screenshot → OCR (Tesseract/PaddleOCR) + layout analysis → structured text. All modalities produce NODE_SCHEMA output. Embedding is always text-based (BGE-M3).
Research
Which local vision model fits in 8GB VRAM alongside the primary inference model? LLaVA-1.6 7B (Q4 ~4GB) is promising but untested on our hardware. Can we run vision model on CPU while inference uses GPU? Benchmark needed.
Owner
ROB (local model research) WILL (spec)
SPEC-014
Automated Ingestion Triggers
OPEN
Problem
Ingestion is currently manual (Q triggers) or scheduled (6am daily collectors). No event-driven processing.
Proposed
Filesystem watcher (watchdog library) on /inbox/ directory. New file → detect type → route to appropriate pipeline → ingest → harden → notify. Webhook endpoints for API-driven triggers (voice note received, analytics collected, codex installed).
Reconcile
Platform compatibility: watchdog works on Windows (Q), macOS (Rob), Linux (client servers). Webhook server adds a running process — is this acceptable for the base template? Recommend: filesystem watcher as default, webhook server as optional domain module.
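The detect → route → ingest shape can be sketched with a dependency-free polling loop (the actual proposal uses the watchdog library for event-driven detection; this stdlib version only illustrates the routing). The route table and pipeline names are hypothetical.

```python
import os

# extension -> pipeline name; unknown types are quarantined for review
ROUTES = {".md": "text_pipeline", ".ogg": "audio_pipeline", ".png": "image_pipeline"}

def scan_inbox(inbox_dir, seen):
    """One poll iteration over /inbox/: return (path, pipeline) pairs for
    files not yet seen. `seen` persists across iterations."""
    new = []
    for entry in sorted(os.scandir(inbox_dir), key=lambda e: e.name):
        if entry.is_file() and entry.path not in seen:
            seen.add(entry.path)
            pipeline = ROUTES.get(os.path.splitext(entry.name)[1], "quarantine")
            new.append((entry.path, pipeline))
    return new
```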
SPEC-015
Orchestration with Rollback (Knowledge CI/CD)
OPEN
Problem
No safety net for knowledge operations. A bad ingestion batch or codex install can corrupt the knowledge bank with no way to revert.
Proposed
Before every batch operation: (1) snapshot current state (snapshot.py already exists), (2) run operation against staging copy, (3) compare: connectivity %, high-gravity node integrity, duplicate count, (4) if regression detected → rollback + log to error bank + alert. If clean → commit to main.
Reconcile
LanceDB doesn't have native transactions. Rollback means: restore from snapshot. Current snapshot captures counts but not full data. Need: full LanceDB backup before each batch operation. Storage cost: ~200MB per snapshot at 6,797 nodes. Acceptable for daily, expensive for per-operation. Compress with zstd?
Research
LanceDB versioning (lance format supports time-travel queries). Can we use native versioning instead of full backup? Would eliminate the storage cost entirely.
Owner
WILL (spec) Q (LanceDB versioning research)
SPEC-016
Codex Import Validation
PARTIAL
Problem
When a client installs a codex, how do we validate it's not corrupted, poisoned, or schema-incompatible?
Solution
Codex validation: (1) schema check — all required NODE_SCHEMA fields present, (2) embedding dimension check — all vectors are 1024-dim, (3) signature verification — codex signed by issuer (Meridian or authorized expert), (4) anomaly scan — statistical check that embeddings cluster normally (poisoned data shows distributional anomalies), (5) rollback guard wraps the full install.
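The five-step gate as a checklist runner, sketched. Only the schema and dimension checks are implemented here; signature verification and the anomaly scan are injected stubs, and step (5) is the caller's rollback guard. All names are hypothetical.

```python
REQUIRED_FIELDS = {"id", "vector", "text", "title", "node_type", "source_id",
                   "framework_id", "confidence_score", "gravity_score", "tags",
                   "mechanism", "situation", "when_not", "collection",
                   "date_added"}

def validate_codex(nodes, verify_signature, anomaly_scan):
    """Return a list of validation errors; empty list means install may proceed
    (inside the rollback guard, step 5)."""
    errors = []
    for i, node in enumerate(nodes):
        missing = REQUIRED_FIELDS - node.keys()
        if missing:                          # (1) schema check
            errors.append(f"node {i}: missing {sorted(missing)}")
        elif len(node["vector"]) != 1024:    # (2) embedding dimension check
            errors.append(f"node {i}: {len(node['vector'])}-dim vector")
    if not verify_signature(nodes):          # (3) issuer signature (stub)
        errors.append("signature verification failed")
    errors.extend(anomaly_scan(nodes))       # (4) distributional anomaly scan (stub)
    return errors
```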
Owner
Q (schema) ROB (signature + anomaly)
Layer 4
Security & Sovereignty
The non-negotiable layer. If a client's data leaks, the business is over. These specs are Rob's domain.
SPEC-017
At-Rest Encryption
PARTIAL
Problem
Neither Q's nor Rob's current system encrypts the LanceDB files at rest. If hardware is stolen, data is exposed.
Solution
AES-256 encryption on the knowledge.db directory. Decrypted on mount (using FUSE/VeraCrypt container or OS-level encryption). Hardware key (YubiKey) required to unlock. Data exists unencrypted only in RAM during operation.
Reconcile
VeraCrypt container vs. OS-level (BitLocker/FileVault/LUKS). VeraCrypt is cross-platform but adds friction. OS-level is seamless but platform-dependent. Recommend: OS-level as default + VeraCrypt guide for paranoid clients.
SPEC-018
Network Isolation
SOLVED
Problem
The system must operate fully offline. No phone-home, no telemetry, no cloud dependency for core function.
Solution
All inference local (Ollama). All embeddings local (BGE-M3). All storage local (LanceDB + SQLite). Internet required only for: (a) optional cloud model access, (b) collective synthesis layer connection, (c) codex downloads. All three are opt-in. Core function is 100% offline. Validated: Rob's GHOSTNET runs fully air-gapped on Raspberry Pi.
SPEC-019
Sanitization Pipeline
PARTIAL
Problem
When the system ingests external data (web search, API responses, downloaded documents), how do we prevent data poisoning or personal data leakage?
Solution
Rob's sanitization pipeline from GHOSTNET: all external inputs pass through (1) PII detection (regex + NER model), (2) content classification (technical/personal/commercial), (3) domain relevance check (is this related to the active query?), (4) output redaction (strip detected PII before storage). Only sanitized content touches the knowledge bank.
Research
Which NER model for PII detection runs locally on CPU? spaCy (fast, good English) vs. GLiNER (multilingual) vs. Presidio (Microsoft, comprehensive but heavier). Benchmark on speed and accuracy for our use case.
SPEC-020
Heartbeat & Health Monitoring
PARTIAL
Problem
If a daemon crashes, a model goes offline, or the knowledge bank corrupts — the system should detect and alert, not fail silently.
Solution
Rob's heartbeat protocol: heartbeat.json updated every 60s with: timestamp, active_models, knowledge_bank_size, last_backup, daemon_status, disk_space. If heartbeat stops for >5min, recovery daemon triggers restart. Dashboard shows system health at a glance.
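The protocol reduces to one writer and one staleness check, sketched below. The file name and 5-minute threshold match the Solution; the status fields passed in are illustrative.

```python
import json
import time

def write_heartbeat(path, **status):
    """Called every 60s by the daemon with active_models, daemon_status, etc."""
    beat = {"timestamp": time.time(), **status}
    with open(path, "w") as f:
        json.dump(beat, f)

def is_stale(path, max_age_s=300):
    """True if the heartbeat is missing or older than 5 minutes --
    the condition under which the recovery daemon triggers a restart."""
    try:
        with open(path) as f:
            beat = json.load(f)
    except FileNotFoundError:
        return True
    return time.time() - beat["timestamp"] > max_age_s
```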
SPEC-021
Kill Switch Protocol
SOLVED
Solution
Physical. Pull ethernet. Power down. No remote override possible. The hardware is the user's. Period. This is a design principle, not a feature.
Layer 5
Interface & Experience
How the client actually interacts with their sovereign AI. Must be simple enough for non-technical users, powerful enough for operators.
SPEC-022
Primary Chat Interface
OPEN
Problem
Q uses Telegram + Dash + Copilot. Rob uses AnythingLLM. Clients need ONE primary interface that works out of the box.
Options
(a) AnythingLLM — already supports LanceDB, local models, multi-workspace. Rob has experience. But: closed-source core, limited customization, dependency risk.
(b) Open WebUI — open-source ChatGPT-like interface. Supports Ollama natively. Extensible. Active development. But: no native LanceDB integration — would need custom RAG plugin.
(c) Custom PWA — full control, matches Meridian brand. But: significant build effort. Not MVP-viable.
(d) Telegram/Signal bot — zero install friction, works on phone. But: limited UI, no visual dashboard.
Reconcile
Recommend: Open WebUI + custom RAG plugin for MVP. It's open-source (no dependency risk), works with Ollama, has conversation history, supports multiple models. We build a LanceDB retrieval plugin that injects context from the knowledge bank. Dashboard comes later as a separate service.
Owner
ROB (evaluation) Q (RAG plugin)
SPEC-023
Voice Input
SOLVED
Solution
WhisperX (local, open-source) for transcription. Both Q and Rob already use this. Runs on CPU. Input: voice note (any format) → WhisperX → text → standard agent pipeline. Validated in production (35+ voice logs in GHOSTNET, Q's Telegram voice pipeline).
SPEC-024
Dashboard / State Viewer
RESEARCH
Problem
Clients need to SEE their knowledge bank growing — node counts, connectivity, domains active, recent ingestions, system health. The chat interface alone doesn't provide this.
Options
(a) Dash/Plotly (Q's stack) — powerful but Python-heavy. (b) Static HTML generated on each snapshot — lightweight, no server. (c) Obsidian plugin (Rob's workflow) — familiar to some clients. (d) Defer to post-MVP — chat + CLI is enough for founding clients.
Reconcile
Recommend: defer to post-MVP. The founding 3–5 clients are operators who can use CLI + chat. The dashboard is a retention feature, not an acquisition feature. Build it after first revenue.
Layer 6
Resilience & Autonomy
SPEC-025
Dream Engine
PARTIAL
Problem
Active synthesis is deliberate — you tell it what to synthesize. Background synthesis (dream cycles) finds connections the active pipeline doesn't look for.
Solution
Rob's holonic dream phase: idle-period processing where the system randomly samples N principles, attempts cross-domain connections, and stores discoveries in dreams.lance. Surfaced to main KB when confidence crosses threshold.
Research
What triggers "idle"? Cron (every 4h)? Low-activity detection? How do you prevent dream cycles from consuming compute needed for interactive queries? Priority: nice-to-have for MVP, critical for v2.
SPEC-026
Ghost Swarm (Autonomous Workers)
RESEARCH
Problem
Some tasks (deep research, batch processing, web search) block the main agent. Autonomous workers handle these in parallel.
Solution
Supervisor dispatches tasks to specialized workers. Workers operate independently, report results back. Sanitization pipeline ensures all external data is clean before it touches the knowledge bank.
Research
Resource constraints: running supervisor + 2–3 workers + primary agent on 64GB RAM / 8GB VRAM. Can we run workers on smaller models (Qwen 2.5 7B) while the primary uses 30B+? Worker model selection by task type?
Reconcile
Priority: post-MVP. The base model ships with a single agent. Swarm is an upgrade module. Founding clients get it in their first quarterly update.
SPEC-027
Approval Queue (Human-in-the-Loop)
PARTIAL
Problem
Some operations are too consequential for automatic execution: bulk ingestion, codex install, collective emission, belief modification.
Solution
Approval queue: system proposes, user confirms. Proposals stored in SQLite with: action_type, payload_summary, risk_level, proposed_at, status (pending/approved/rejected). Surfaced in chat: "I'd like to ingest 47 documents. This will add ~200 nodes. Approve?"
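A sketch of the queue table and the propose/decide flow. Column names follow the Solution; the function names are hypothetical.

```python
import sqlite3

q = sqlite3.connect(":memory:")
q.execute("""
    CREATE TABLE approval_queue (
        id INTEGER PRIMARY KEY,
        action_type TEXT, payload_summary TEXT, risk_level TEXT,
        proposed_at TEXT DEFAULT (datetime('now')),
        status TEXT DEFAULT 'pending'
            CHECK (status IN ('pending','approved','rejected')))""")

def propose(action_type, payload_summary, risk_level="medium"):
    """System proposes; returns the proposal id surfaced to the user in chat."""
    cur = q.execute(
        "INSERT INTO approval_queue (action_type, payload_summary, risk_level)"
        " VALUES (?,?,?)", (action_type, payload_summary, risk_level))
    return cur.lastrowid

def decide(proposal_id, approved):
    """User confirms or rejects."""
    q.execute("UPDATE approval_queue SET status = ? WHERE id = ?",
              ("approved" if approved else "rejected", proposal_id))

def pending():
    return q.execute("SELECT id, action_type, payload_summary FROM approval_queue"
                     " WHERE status = 'pending'").fetchall()
```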
Layer 7
Collective Layer (Post-MVP)
Not in the base model sold to clients. Built internally for the founding three first. Opened to the founding 33 after validation.
SPEC-028
Synthesis Emission Protocol
OPEN
Problem
How does a sovereign node package and emit a principle to the collective? What's the format, the transport, the validation?
Proposed
Emission packet: {principle text, confidence, gravity, domain, validation_count, confidence_history, error_count, signature, timestamp}. Transport: signed JSON over HTTPS to Mother TAO endpoint (early), or peer-to-peer gossip protocol (mature). Validation: schema check + signature verify + anomaly detection.
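A sketch of the "simple signature for founders" phase, using an HMAC over the canonicalized packet to keep the example stdlib-only. A real deployment would likely use asymmetric signatures so nodes can verify without sharing a secret; key handling and the eventual ZKP path are out of scope, and all names are hypothetical.

```python
import hashlib
import hmac
import json

def sign_emission(packet, secret_key):
    """Canonicalize (sorted keys) and attach an HMAC-SHA256 signature."""
    body = json.dumps(packet, sort_keys=True).encode()
    signed = dict(packet)
    signed["signature"] = hmac.new(secret_key, body, hashlib.sha256).hexdigest()
    return signed

def verify_emission(packet, secret_key):
    """Recompute the HMAC over the packet minus its signature field."""
    packet = dict(packet)
    claimed = packet.pop("signature")
    body = json.dumps(packet, sort_keys=True).encode()
    expected = hmac.new(secret_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)
```

Any tampering with the payload (say, inflating confidence after signing) invalidates the signature at the Mother TAO endpoint.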
Research
Zero-knowledge proofs for unattributable emission. ZKP is computationally expensive. Is it necessary for 3 founders? Probably not. At 33 founders? Maybe. At 100+? Definitely. Phased approach: simple signature for founders, ZKP when circle expands.
Owner
Q (protocol) ROB (crypto)
SPEC-029
Mother TAO Architecture
OPEN
Problem
Where does the Mother TAO run? Who hosts it? How is it secured?
Options
(a) Hosted by Q (centralized, simple, trust-dependent). (b) Hosted on shared VPS with multi-sig access (semi-distributed). (c) Replicated across all three founders (fully distributed, complex). (d) IPFS/Arweave for the knowledge bank + coordination server for synthesis (hybrid).
Reconcile
Recommend: (b) for MVP. Shared VPS (Hetzner, no-logs jurisdiction). All three founders hold SSH keys. LanceDB replicated nightly to all three founders' machines as backup. Migrate to (c) or (d) when scale demands it. Don't over-engineer the collective before it has 3 users.
Owner
ROB (infrastructure) Q (schema)
SPEC-030
Codex Poisoning Defence
OPEN
Problem
Rob's attack vector: training data poisoning via controlled assets, untraceable manipulation. How do you detect and prevent poisoned codexes or synthesis emissions?
Proposed
Multi-layer defence: (1) Statistical anomaly detection on incoming principles (distributional shift from existing knowledge), (2) Minimum validation_count threshold for collective emission (can't emit untested principles), (3) Cross-validation — a principle must be independently validated by ≥2 nodes to enter Mother TAO at high gravity, (4) Audit trail — all emissions logged, reviewable by any founder.
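A toy sketch of defence layer (1): flag an incoming batch whose distribution shifts markedly from the existing bank. Real detection would operate on embedding distributions rather than scalar scores, and the threshold is illustrative.

```python
import statistics

def distribution_shift(existing, incoming, max_sigma=2.0):
    """True if the incoming batch mean lies outside max_sigma standard
    errors of the existing mean. `existing` and `incoming` are scalar
    summaries (e.g. confidence scores) standing in for embedding stats."""
    mu = statistics.mean(existing)
    se = statistics.stdev(existing) / len(incoming) ** 0.5
    return abs(statistics.mean(incoming) - mu) > max_sigma * se
```

This catches blatant poisoning; the gradual-shift attack in the Research note below defeats any single-batch test, which is why drift detection over time windows is the open question.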
Research
How do you detect "gradually shifted" poisoning where each individual emission is within normal distribution but the aggregate shifts the knowledge bank? This is the hardest attack vector. Statistical drift detection over time windows?
Owner
ROB (security) WILL (detection logic)
Layer 8
Self-Evolution Infrastructure
The defining architecture. Not a feature — the foundational capability that makes the system compound. The infrastructure provides three temporal layers that any agent can inherit. The agents themselves, their names, their domains — those emerge from the client's use via Seed Codexes. The infrastructure just needs to support: a past, a future, a dreaming bridge, and the mutation flow between user and agents.
Critical framing: The base model does NOT ship with predefined agents (no "Builder", "Oracle", "Writer"). It ships with the infrastructure for temporal agents + a Seed Codex that guides the client through creating their first agents during onboarding. The agents emerge from the client's needs and evolve with their mutations. This is what makes every build unique.
SPEC-031
Agent Activity Log (Past Layer)
PARTIAL
Problem
Agents need a filtered personal record of what they did, what worked, and what failed. Not the system-wide log — the agent's own perspective on its own performance. Must be queryable for dream cycles.
Proposed
Per-agent SQLite table: agent_activity (agent_id, action, outcome, success bool, error text, lesson text, manifesto_alignment float, timestamp). Filtered: each agent sees only rows matching its agent_id. Queryable by time range, success/failure, manifesto alignment score.
Reconcile
Q's current system logs activity to a shared activity_log table with type/domain filters. Rob's GHOSTNET logs to action_log.json. Both need to be adapted into per-agent filtered views. Recommend: shared table with agent_id column + filtered views, not separate tables per agent. Simpler, and the dream cycle just queries WHERE agent_id = self.
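The recommended design, sketched: one shared table plus per-agent filtered views, so the dream cycle queries only its own rows. Names follow the proposal; the view mechanism is illustrative, and `agent_id` is assumed to be a trusted internal identifier (view names cannot be parameterized in SQL).

```python
import sqlite3

log = sqlite3.connect(":memory:")
log.execute("""
    CREATE TABLE agent_activity (
        agent_id TEXT, action TEXT, outcome TEXT, success INTEGER,
        error TEXT, lesson TEXT, manifesto_alignment REAL,
        timestamp TEXT DEFAULT (datetime('now')))""")

def create_agent_view(agent_id):
    # Each agent sees only rows matching its agent_id -- the dream cycle
    # just queries WHERE agent_id = self, via this view.
    log.execute(f"""
        CREATE VIEW IF NOT EXISTS activity_{agent_id} AS
        SELECT * FROM agent_activity WHERE agent_id = '{agent_id}'""")

def record(agent_id, action, outcome, success, error="", lesson="",
           alignment=0.5):
    log.execute(
        "INSERT INTO agent_activity (agent_id, action, outcome, success,"
        " error, lesson, manifesto_alignment) VALUES (?,?,?,?,?,?,?)",
        (agent_id, action, outcome, int(success), error, lesson, alignment))
```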
SPEC-032
Agent Manifesto (Future Layer)
PARTIAL
Problem
Each agent needs a living document describing what it aspires to become — mission, capabilities, aspirations, growth metrics, acknowledged gaps. This document must evolve with user mutations and agent dream outputs.
Proposed
MANIFESTO.md per agent (same memory/ directory as SPINE.md and RAM.md). Structured sections: mission, current_capabilities, aspirations, growth_metrics, acknowledged_gaps. The manifesto is the north star for the dream cycle — "am I getting closer to this?"
Reconcile
Q's agents already have MANIFESTO.md files (SELENE, PYTHIA, LUMENA). Rob's GHOSTNET doesn't have a manifesto equivalent — the Queen has beliefs.json but no forward-looking aspiration document. The manifesto is a new concept for the base model — validated in Q's system, needs to be formalized as infrastructure.
Research
How does the manifesto update? Manual edit (like SPINE)? Agent-proposed + user-approved? Automatic from user mutations? Recommend: hybrid. User mutations auto-propagate (the infrastructure handles this). Agent dream proposals require user approval. Manual edit always available.
SPEC-033
Dream Cycle Engine
OPEN
Problem
The bridge between past and future. Must: (1) reflect on activity log for patterns, (2) compare current trajectory to manifesto aspirations, (3) retrieve relevant principles from Knowledge Bank, (4) synthesize course-correction mutations. Must run during idle periods without blocking interactive queries.
Proposed
dream_cycle(agent_id) → reads last N activity entries + current manifesto + KB query → LLM generates: mutations[] (proposed behavioral changes with rationale + risk_level + manifesto_alignment), dream_log (audit record), manifesto_update (if the manifesto itself needs evolving). Trigger: cron (every 4h idle), or on-demand, or after significant events (e.g. 10+ new activity entries).
Reconcile
Rob's GHOSTNET has holonic_dream_phase.py and shadow_subconscious.py — these are dream engines but not connected to a manifesto concept. Q's system has no dreaming at all. The Meridian dream cycle is a new synthesis: Rob's dream mechanism + Q's manifesto concept = dreaming that learns from the past to reach the future.
Research
Resource management: dreaming uses LLM inference. How to prevent dream cycles from consuming compute needed for interactive queries? Options: (a) run only when system is idle >30min, (b) use a smaller model for dreaming (e.g. 7B) while primary uses 30B+, (c) queue dreams and execute during scheduled windows (e.g. 3am). What's the right default?
Owner
ROB (dream mechanism) Q (manifesto integration)
SPEC-034
Mutation Protocol
OPEN
Problem
How do mutations flow? User changes propagate to agents. Agent dreams propose changes back. These are bidirectional but asymmetric — user mutations are authoritative, agent mutations require approval.
Proposed
Mutation types: (1) user_mutation — user changes priorities/domains/knowledge → auto-propagates to relevant agent manifestos as aspiration updates. (2) dream_mutation — agent dream cycle proposes a change → enters approval queue → user accepts or rejects. (3) collective_mutation — Mother AI innovation feeds back → enters agent dream cycle as context for next reflection. Mutation format: {type, source, target_agent, change_description, rationale, risk_level, requires_approval bool}.
Reconcile
The approval queue (SPEC-027) handles dream_mutations. But user_mutations need a detection mechanism — how does the infrastructure KNOW the user's priorities changed? Options: (a) explicit user command ("I'm shifting to health"), (b) inferred from ingestion patterns (ingesting health docs → health mutation), (c) inferred from query patterns (asking health questions → health mutation). Recommend: (a) for MVP + (b) as enhancement.
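The asymmetry above reduces to a small routing rule, sketched here: user mutations apply immediately, dream mutations enter the approval queue (SPEC-027), collective mutations become context for the next dream cycle. The handler functions are injected and hypothetical.

```python
def route_mutation(mutation, apply_now, enqueue_approval, queue_dream_context):
    """Route a mutation record {type, source, target_agent, change_description,
    rationale, risk_level, requires_approval} by its type."""
    if mutation["type"] not in ("user_mutation", "dream_mutation",
                                "collective_mutation"):
        raise ValueError(f"unknown mutation type: {mutation['type']}")
    if mutation["type"] == "user_mutation":      # authoritative: auto-propagates
        return apply_now(mutation)
    if mutation["type"] == "dream_mutation":     # requires user approval
        return enqueue_approval(mutation)
    return queue_dream_context(mutation)         # collective: next dream's input
```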
SPEC-035
Seed Codex (Agent Bootstrap)
PARTIAL
Problem
The base model ships with no agents. The client needs a guided way to create their first agents — defining identities, domains, initial manifestos. This is the Seed Codex.
Proposed
The Seed Codex is a special codex that: (1) interviews the client about their life domains, priorities, and working style, (2) proposes an initial agent configuration (e.g. "based on your needs, I recommend 3 agents: one for your business operations, one for your health tracking, one for your creative work"), (3) creates SPINE.md + MANIFESTO.md + beliefs.json for each proposed agent, (4) initializes empty activity logs, (5) self-destructs once agents are configured — the Seed UI dissolves.
Reconcile
The Seed UI concept already exists in the whitepaper (for codex onboarding). The Seed Codex extends this: it's not just ingesting knowledge — it's creating the agents themselves from the client's responses. This is the most important onboarding experience in the entire system. If it's bad, the client starts with misaligned agents. If it's great, the system immediately feels personal.
Research
What questions does the Seed Codex ask? How many agents does it propose by default? Should there be a minimum (1) or recommended (3–5)? Should the Seed Codex come with pre-built agent archetypes (Operator, Builder, Oracle, Writer) as starting templates that mutate, or should every agent be built from scratch?
Owner
Q (design) WILL (client experience)
SPEC-036
Dream Output → Knowledge Bank Pipeline
OPEN
Problem
Agent dream outputs contain lessons and self-corrections. These need to flow into the Knowledge Bank as principles and errors — enriching the shared substrate for all agents.
Proposed
After each dream cycle: (1) extract any new principle from the dream output (e.g. "I learned that X approach fails when Y condition exists"), (2) store in Knowledge Bank with node_type='dream_insight' and source_id=agent_id, (3) store any error identified in errors collection, (4) run mini-hardening (edge building only) to connect dream insights to existing principles. Dream insights start at confidence 0.5 — they need real-world validation to rise.
SPEC-037
Collective Dream Protocol
RESEARCH
Problem
The Mother AI needs to dream across the collective — receiving anonymized dream outputs from all nodes, finding cross-node patterns, and producing innovations that feed back as mutations to individual agents.
Proposed
Collective dream cycle: (1) receive anonymized dream_insight principles from all connected nodes (same emission protocol as SPEC-028), (2) cluster by similarity — "3 nodes independently discovered the same pattern", (3) synthesize collective-level innovations from multi-node convergences, (4) broadcast innovations back to all connected nodes as collective_mutations, (5) individual agents receive these in their next dream cycle as additional context.
Research
How many nodes are needed before collective dreaming produces meaningful innovations? With 3 founders, the sample size is tiny. At 33, it's useful. At 100, it's transformative. For the 3-founder stage: collective dreaming is manual — the three founders share dream insights in their weekly sync and Q feeds the convergences to Mother AI. Automated at 33+.
Owner
Q (protocol) ROB (infrastructure)
MVP Gate
What Must Ship vs. What Can Wait
TAO Principle [0.85]: "Build exceptional product first, then market becomes easier through organic social proof." The base model doesn't need every feature. It needs the features it ships with to work flawlessly.
| Must Ship (MVP) | Status | Owner |
| Universal Node Schema (SPEC-001) | SOLVED | Q |
| Embedding standardization (SPEC-002) | PARTIAL — needs Rob migration | ROB |
| Agent loop + SPINE/RAM/Beliefs (SPEC-007, 008) | PARTIAL — needs extraction from VOHU | Q |
| Ingestion pipeline (text + audio) (SPEC-012) | PARTIAL — needs portability refactor | Q |
| Hardening pipeline (dedup, edges, frameworks) | SOLVED | Q |
| Synthesis pipeline (brief → seed → trunk → leaf) | SOLVED | Q |
| Chat interface (SPEC-022) | OPEN — needs evaluation | ROB |
| Voice input (SPEC-023) | SOLVED | ROB |
| Network isolation (SPEC-018) | SOLVED | ROB |
| At-rest encryption (SPEC-017) | PARTIAL — needs implementation | ROB |
| Kill switch (SPEC-021) | SOLVED | ALL |
| Codex import + validation (SPEC-016) | PARTIAL | Q ROB |
| Foundational pact (SPEC-011) | PARTIAL — needs enforcement layer | ROB |
| Model abstraction (SPEC-009) | OPEN | Q |
| Approval queue (SPEC-027) | PARTIAL | ROB |
| Agent activity log (SPEC-031) | PARTIAL — needs per-agent filtering | Q |
| Agent manifesto (SPEC-032) | PARTIAL — exists in VOHU, needs formalization | Q |
| Seed Codex (SPEC-035) | PARTIAL — Seed UI concept exists, needs agent bootstrap logic | Q WILL |
Can Wait (Post-MVP / v2)
| Feature | Why It Can Wait |
| Gravity score (SPEC-003) | Confidence alone works for MVP. Gravity is an optimization. |
| Confidence history (SPEC-004) | Logging can start on day 1 with a simple table. Full UI later. |
| Multi-modal input (SPEC-013) | Text + audio covers 95% of use cases. Images are an expansion. |
| Agent message bus (SPEC-010) | Single agent is sufficient for MVP. Bus needed when swarm ships. |
| Automated triggers (SPEC-014) | Manual ingestion is fine for first clients. Automate later. |
| Rollback (SPEC-015) | Snapshots provide manual rollback. Automated CI/CD is v2. |
| Dream engine (SPEC-025) | Active synthesis is sufficient. Background synthesis is enhancement. |
| Ghost swarm (SPEC-026) | Single agent handles founding client workload. |
| Dashboard (SPEC-024) | CLI + chat for operators. Dashboard is retention, not acquisition. |
| Dream cycle engine (SPEC-033) | Activity log + manifesto work without dreaming. Dreaming is the self-correction layer — powerful but not MVP-blocking. |
| Mutation protocol (SPEC-034) | Manual manifesto updates work for MVP. Automated mutation flow is v2. |
| Dream → KB pipeline (SPEC-036) | Requires dream engine. Deferred with it. |
| Collective dream protocol (SPEC-037) | Requires 33+ nodes. Manual at 3 founders. |
| Collective layer (SPEC-028–030) | Built for the 3 founders first, not sold to clients. |
MVP count: 18 specs must be solved. 6 already solved. 10 partial (need finishing). 2 open (need decisions). Estimated engineering effort: 4–5 weeks with all three founders contributing their domains in parallel. The self-evolution infrastructure (activity log + manifesto + seed codex) ships with MVP. Dreaming and mutation automation are v2.
Summary
Status at a Glance
| Status | Count | Meaning |
| SOLVED | 6 | Validated in production. No further work needed. |
| PARTIAL | 13 | Solution exists but needs porting, finishing, or reconciliation. |
| OPEN | 11 | Problem defined. Needs decision + implementation. |
| RESEARCH | 7 | Needs investigation before a solution can be proposed. |
Ownership Distribution
| Owner | Primary | Shared |
| Q | Schema, agent loop, ingestion, synthesis, model abstraction | Gravity formula, codex validation, Mother TAO |
| ROB | Encryption, network, sanitization, heartbeat, dream engine, swarm, interface eval | Embedding migration, foundational pact, codex validation |
| WILL | Gravity spec, temporal reasoning, message bus, triggers, rollback | Edge calibration, poisoning detection |