
Architecture

Technical depth for CTOs and evaluating engineers

Myco splits deterministic computation from LLM advisory work. Facts, graph edges, IDs, and timestamps are computed in code and persisted in Postgres. LLMs suggest. Deterministic systems decide what becomes durable memory.

1) System Flow

Client / Agent

↓ MCP tool call

MCP Server (tool validation + policy + RLS session context)

↓ deterministic writes + reads

Postgres 16 + pgvector (source of truth)

↑ advisory suggestions only

LLM Advisory Worker (embeddings, NER, relation proposals)

2) Deterministic vs Advisory Split

  • Deterministic: chunking, hashing, dedupe keys, graph link persistence, timestamps, audit, replay behavior.
  • Advisory: embeddings, NER, relation proposals, confidence.

Advisory output never bypasses deterministic validation and policy gates.
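As a rough illustration of the split, the sketch below derives a dedupe key from chunk content with a plain hash and then applies a deterministic gate to an advisory proposal. The function names, the fixed-size chunking strategy, the key layout, and the `chunk_key` field are all assumptions made for this example; they are not Myco's actual implementation.

```python
import hashlib
import unicodedata


def chunk_text(text: str, size: int = 200) -> list[str]:
    """Split text into fixed-size character windows; no model is involved."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def dedupe_key(workspace_id: str, chunk: str) -> str:
    """Hash normalized chunk content so identical input always yields the same key."""
    # Collapse whitespace and normalize Unicode before hashing (assumed policy).
    normalized = unicodedata.normalize("NFC", " ".join(chunk.split()))
    payload = f"{workspace_id}\x00{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def accept_advisory(proposal: dict, known_chunk_keys: set[str]) -> bool:
    """Deterministic gate: keep a proposal only if it is well formed
    and cites a chunk that was actually stored."""
    confidence = proposal.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        return False
    return proposal.get("chunk_key") in known_chunk_keys
```

The point of the sketch is that nothing in the key derivation or the gate depends on model output: the same input produces the same key on every run, and a proposal that cites an unknown chunk is rejected regardless of its stated confidence.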

3) Schema Surface

Core tables for memory semantics and provenance. Query and graph tooling read these directly under workspace scope.

  • hyobjects
  • chunks
  • relations
  • canon_relations
  • evidence

4) Multi-Tenancy and RLS

Workspace isolation is enforced by Postgres Row-Level Security. Every query is scoped by session context (`app.workspace_id`, `app.principal_role`) before data access.
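A minimal sketch of how session context might be set before any query runs. `set_config` and `current_setting` are standard Postgres functions; everything else here is an assumption for illustration: the policy name, the `workspace_id` column and its UUID type, the role names, and the `%s` placeholder style (which matches psycopg-style drivers).

```python
import uuid

# Hypothetical policy DDL; the table column and policy name are assumptions.
RLS_POLICY_SQL = """
ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
CREATE POLICY workspace_isolation ON chunks
    USING (workspace_id = current_setting('app.workspace_id')::uuid);
"""

ALLOWED_ROLES = {"reader", "writer", "admin"}  # assumed role names


def session_context_statements(workspace_id: str, principal_role: str):
    """Return (sql, params) pairs to execute at the start of a transaction."""
    ws = str(uuid.UUID(workspace_id))  # raises ValueError on malformed input
    if principal_role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {principal_role!r}")
    # is_local=true scopes the setting to the current transaction, so a
    # pooled connection does not leak context into the next request.
    stmt = "SELECT set_config(%s, %s, true)"
    return [
        (stmt, ("app.workspace_id", ws)),
        (stmt, ("app.principal_role", principal_role)),
    ]
```

Passing the values as bound parameters rather than interpolating them into the SQL string keeps the session context itself from becoming an injection path.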

5) Idempotency Contract

Every write path must carry the deterministic boundary fields from THE-409B: `idempotency_key`, `trace_id`, and a captured `raw_payload`. Together these enable replay safety, duplicate suppression, and post-incident audit.
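The contract can be sketched with an in-memory stand-in for a table that has a unique constraint on `idempotency_key`. The `WriteLog` class and its method are hypothetical; only the three field names come from the text above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WriteLog:
    """In-memory stand-in for a table keyed uniquely on idempotency_key."""
    rows: dict = field(default_factory=dict)

    def write(self, idempotency_key: str, trace_id: str, raw_payload: dict):
        """Return (row, created). A replayed key returns the original row."""
        existing = self.rows.get(idempotency_key)
        if existing is not None:
            return existing, False  # duplicate suppressed, nothing rewritten
        row = {
            "idempotency_key": idempotency_key,
            "trace_id": trace_id,
            # Store the payload verbatim so an audit can replay it later.
            "raw_payload": json.dumps(raw_payload, sort_keys=True),
        }
        self.rows[idempotency_key] = row
        return row, True
```

Replaying the same request is then a no-op: the second call returns the first row, including the original `trace_id`, and `created` is `False`.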

6) MCP Tool Reference (9 tools)

| Tool | Params (shape) | Return (shape) |
| --- | --- | --- |
| `brain.context_pack` | `{ query, workspace_id?, top_k?, include? }` | `{ chunks[], entities[], people[], notes[], score }` |
| `brain.search` | `{ query, filters?, top_k?, workspace_id? }` | `{ results[], scores[], source_refs[] }` |
| `brain.why` | `{ object_id \| fact_id, depth? }` | `{ provenance_chain[], evidence[], sources[] }` |
| `brain.neighbors` | `{ node_id, relation_types?, depth?, limit? }` | `{ nodes[], edges[], traversal_meta }` |
| `brain.ingest` | `{ text? \| url? \| file_base64?, source, metadata? }` | `{ ingest_id, hyobject_id, chunk_count, status }` |
| `brain.propose_fact` | `{ subject, predicate, object, confidence, evidence_refs[] }` | `{ proposal_id, status, confidence, review_queue }` |
| `brain.annotate` | `{ session_id, kind, content, tags? }` | `{ note_id, recorded_at, visibility }` |
| `brain.save_memory` | `{ text, tags?, workspace_id?, agent_id? }` | `{ memory_id, chunk_ids[], embedding_model, status }` |
| `brain.recall_memory` | `{ query, agent_id?, workspace_id?, top_k? }` | `{ memories[], scores[], source_links[] }` |
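For orientation, here is a sketch of what a call to one of these tools could look like on the wire. MCP tool invocations are JSON-RPC 2.0 requests with method `tools/call`; the helper name, the request-ID scheme, and the argument values below are illustrative assumptions, and only the `brain.search` parameter names come from the reference above.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request IDs (assumed scheme)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    envelope = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(envelope)


# Hypothetical query text; top_k is one of the optional params in the table.
request = tool_call_request(
    "brain.search",
    {"query": "vendor contract renewal dates", "top_k": 5},
)
```

In practice an MCP client library would build this envelope itself; the sketch only shows where the tool name and the params shape end up in the request.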

7) Confidence Thresholds

Advisory confidence drives promotion gates. High-confidence suggestions can auto-promote by policy; medium-confidence routes into review queues; low-confidence is retained as non-authoritative context.
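The three-way routing can be sketched as a small pure function. The threshold values and route names here are placeholders chosen for the example; the actual cutoffs are set by policy and are not stated in this document.

```python
AUTO_PROMOTE_AT = 0.90  # assumed cutoff, not a published value
REVIEW_AT = 0.60        # assumed cutoff, not a published value


def route(confidence: float) -> str:
    """Map an advisory confidence score to a promotion route."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= AUTO_PROMOTE_AT:
        return "auto_promote"
    if confidence >= REVIEW_AT:
        return "review_queue"
    return "context_only"  # retained, but non-authoritative
```

Keeping the router free of side effects means the same score always produces the same route, which makes the gate straightforward to replay and audit.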

A link to the precision/recall curve will be added once the benchmark is published.

8) Taxonomy Evolution

New categories and relation patterns are proposed through `schema_proposals`. A review-and-approve flow updates the active taxonomy; LLM output never triggers runtime DDL.
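A minimal in-memory sketch of the propose-then-review flow. The `Taxonomy` class, its fields, and the status values are assumptions for illustration; only the idea of a `schema_proposals` queue gated by human approval comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    active: set = field(default_factory=set)       # relation types in use
    proposals: dict = field(default_factory=dict)  # stand-in for schema_proposals

    def propose(self, relation_type: str, proposed_by: str) -> int:
        """Record a proposal; the active taxonomy is not touched."""
        proposal_id = len(self.proposals) + 1
        self.proposals[proposal_id] = {
            "relation_type": relation_type,
            "proposed_by": proposed_by,
            "status": "pending",
        }
        return proposal_id

    def review(self, proposal_id: int, approve: bool, reviewer: str) -> None:
        """Only an explicit review decision can change the active taxonomy."""
        proposal = self.proposals[proposal_id]
        if proposal["status"] != "pending":
            raise ValueError("proposal already reviewed")
        proposal["status"] = "approved" if approve else "rejected"
        proposal["reviewer"] = reviewer
        if approve:
            self.active.add(proposal["relation_type"])
```

An LLM worker in this sketch can only call `propose`; the set of active relation types changes solely inside `review`.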

9) Reliability Layer

Dead-letter queues isolate failed ingest/advisory jobs. Reconciler loops retry or quarantine failures with diagnostics. Replay is safe because writes are idempotent and provenance-linked.
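One pass of a reconciler loop might look like the sketch below. The job dictionary layout, the `MAX_ATTEMPTS` value, and the function name are assumptions for this example, not the production design.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget


def reconcile(dead_letters: deque, handler, max_attempts: int = MAX_ATTEMPTS) -> list:
    """Retry each dead-lettered job once; return the jobs that were quarantined."""
    quarantined = []
    # Iterate over a fixed count so requeued jobs wait for the next pass.
    for _ in range(len(dead_letters)):
        job = dead_letters.popleft()
        try:
            handler(job["payload"])  # safe to repeat if writes are idempotent
        except Exception as exc:
            job["attempts"] += 1
            job["last_error"] = repr(exc)  # diagnostics for later inspection
            if job["attempts"] >= max_attempts:
                quarantined.append(job)
            else:
                dead_letters.append(job)
    return quarantined
```

Jobs that succeed simply leave the queue. Retrying is only safe here because the handler is assumed to be idempotent, which is the property the write path is described as guaranteeing.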

10) Security and Portability

The architecture avoids lock-in: customer-owned Postgres, open SQL migrations, and portable deterministic records. Data ownership and extraction paths remain under workspace control.