v1.2.0 (2026-06-12)

Compounding confidence, full dynamic schema, per-object privacy, in-repo benchmark

With this release, everything on the original launch roadmap has shipped.

Compounding confidence — the full engine

A fact’s confidence now rises with independent corroboration and falls on contradiction — and contradicted facts are superseded, never silently overwritten.

Verified by npm run test:compounding, which exercises the full lifecycle end-to-end against a live database, plus 17 unit tests. Demonstrated live with llama3.2:3b: two documents asserting different employers for the same person produce:

Rhea Calloway -[works for]-> Halcyon Labs        conf=0.600  [SUPERSEDED]
Rhea Calloway -[works for]-> Driftwood Analytics conf=1.000  [ACTIVE]
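
To make the lifecycle concrete, here is a minimal sketch of one way such an update could work. It is not the engine's actual formula: the Fact shape, the noisy-OR corroboration rule, the fixed contradiction penalty, and the assumption that a predicate such as "works for" is single-valued are all hypothetical choices made for illustration.

```typescript
// Illustrative only: the types, update rules, and default penalty below are
// assumptions, not the engine's actual schema or formula.
type FactStatus = "ACTIVE" | "SUPERSEDED";

interface Fact {
  subject: string;
  predicate: string;
  object: string;
  confidence: number; // kept within [0, 1]
  status: FactStatus;
  sources: string[]; // ids of the documents that asserted this fact
}

// Assumed corroboration rule (noisy-OR): each new independent document closes
// part of the remaining gap to 1. A repeat of a known document changes nothing.
function corroborate(fact: Fact, docId: string, sourceConfidence: number): Fact {
  if (fact.sources.indexOf(docId) !== -1) return fact;
  const confidence = 1 - (1 - fact.confidence) * (1 - sourceConfidence);
  return { ...fact, confidence, sources: [...fact.sources, docId] };
}

// Assumed contradiction rule: scale confidence down and mark the fact
// SUPERSEDED. The record is kept, so history is never overwritten.
function supersede(fact: Fact, penalty: number): Fact {
  return { ...fact, confidence: fact.confidence * penalty, status: "SUPERSEDED" };
}

// Records one assertion from one document. Assumes the predicate is
// single-valued (one active object per subject and predicate).
function recordAssertion(
  store: Fact[],
  subject: string,
  predicate: string,
  object: string,
  docId: string,
  sourceConfidence: number,
  penalty: number = 0.5,
): Fact[] {
  const fresh: Fact = {
    subject,
    predicate,
    object,
    confidence: sourceConfidence,
    status: "ACTIVE",
    sources: [docId],
  };
  const active = store.filter(
    (f) => f.status === "ACTIVE" && f.subject === subject && f.predicate === predicate,
  );
  // No active fact yet: store the new one.
  if (active.length === 0) return [...store, fresh];
  // Same object asserted again: corroborate the matching fact.
  if (active.some((f) => f.object === object)) {
    return store.map((f) =>
      active.indexOf(f) !== -1 && f.object === object
        ? corroborate(f, docId, sourceConfidence)
        : f,
    );
  }
  // Different object: supersede the old fact and append the new one.
  return [...store.map((f) => (active.indexOf(f) !== -1 ? supersede(f, penalty) : f)), fresh];
}
```

Under these assumed rules, two documents asserting Halcyon Labs followed by one asserting Driftwood Analytics leave two rows: the older fact marked SUPERSEDED with reduced confidence, and the newer one ACTIVE. The 0.600 and 1.000 values in the output above come from the real engine, and this sketch does not attempt to reproduce them.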

Full dynamic schema — gated auto-promotion

The propose-and-surface loop from v1.1.0 now completes: proposals corroborated by enough distinct documents at high confidence auto-promote into the live catalogs, and the promoted type is immediately usable by the next extraction batch. Strictly opt-in (BRAIN_SCHEMA_AUTO_PROMOTE=1, defaults: 3 independent documents at ≥0.8 confidence), with a full audit trail on the proposal row — and strict curation mode always wins. Verified by npm run test:schema-promotion.
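
The gate described above can be sketched as a small predicate. This is an illustration under stated assumptions, not the shipped implementation: the type and function names are hypothetical, the confidence threshold is assumed to apply to each document's evidence individually, and strict curation is modeled as a plain boolean. Only the flag name and the two default values are taken from this release note.

```typescript
// Illustrative only: type and function names are assumptions. The flag name
// and the two defaults (3 documents, 0.8 confidence) come from the release notes.
interface ProposalEvidence {
  documentId: string;
  confidence: number;
}

interface PromotionGate {
  enabled: boolean; // opt-in switch
  strictCuration: boolean; // when true, nothing auto-promotes
  minDocuments: number;
  minConfidence: number;
}

// Builds the gate from an environment-like map, using the documented defaults.
function gateFromEnv(
  env: Record<string, string | undefined>,
  strictCuration: boolean,
): PromotionGate {
  return {
    enabled: env["BRAIN_SCHEMA_AUTO_PROMOTE"] === "1",
    strictCuration,
    minDocuments: 3,
    minConfidence: 0.8,
  };
}

// True when enough distinct documents back the proposal at or above the
// confidence threshold, the feature is enabled, and strict curation is off.
function shouldAutoPromote(evidence: ProposalEvidence[], gate: PromotionGate): boolean {
  if (!gate.enabled || gate.strictCuration) return false;
  const qualifying: Record<string, boolean> = {};
  evidence.forEach((e) => {
    // Keying by document id means repeated mentions in one document count once.
    if (e.confidence >= gate.minConfidence) qualifying[e.documentId] = true;
  });
  return Object.keys(qualifying).length >= gate.minDocuments;
}
```

Counting distinct document ids, not raw mentions, is what keeps a single verbose document from promoting a type on its own.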

Per-object privacy

Documents marked private are readable only by the agent that created them (plus service-role callers) across all six read tools — brain_search, brain_context_pack, brain_recall_memory, brain_why, brain_neighbors, and brain_get_related. Private rows with no recorded creator stay hidden from non-service callers, conservative by design. Workspace, org, and public documents behave exactly as before. Verified by a two-agent visibility-matrix check, npm run test:sharing.
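
As a rough illustration of the rule above, the private-document check might look like the following. The types, field names, and function names are hypothetical, and the sketch covers only the private rule: workspace, org, and public documents pass through it unchanged, since their existing checks are outside this note's scope.

```typescript
// Illustrative only: these types and names are assumptions, not the real schema.
type DocVisibility = "private" | "workspace" | "org" | "public";

interface StoredDocument {
  id: string;
  visibility: DocVisibility;
  createdBy: string | null; // null when no creator was recorded
}

interface Caller {
  agentId: string;
  isServiceRole: boolean;
}

// Returns false only when the private rule blocks the read. Non-private
// documents pass through here; their existing scope checks are not modeled.
function privateRuleAllows(doc: StoredDocument, caller: Caller): boolean {
  if (doc.visibility !== "private") return true;
  if (caller.isServiceRole) return true;
  if (doc.createdBy === null) return false; // conservative: unknown creator stays hidden
  return doc.createdBy === caller.agentId;
}

// A read tool would apply the same filter to every row it is about to return.
function filterReadable(docs: StoredDocument[], caller: Caller): StoredDocument[] {
  return docs.filter((d) => privateRuleAllows(d, caller));
}
```

Applying one shared predicate in every read path is one way to keep all six tools consistent. Whether the real implementation enforces the rule in application code or in the database is not stated here.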

LongMemEval benchmark harness — in the repo

The headline number is reproducible by anyone, not asserted: 73.6% end-to-end QA accuracy on the complete 500-question LongMemEval oracle subset (no sampling) with 100% evidence-retrieval recall — reader gpt-4o-mini, judge gpt-4o. The harness ships in-repo at evals/longmemeval — self-contained, per-question workspace isolation, with methodology in its README. Run it yourself.

Run the checks yourself

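The checks named above are npm run test:compounding, npm run test:schema-promotion, and npm run test:sharing; the benchmark harness lives at evals/longmemeval.

Full details are in the repository changelog. All changes are additive, with no breaking changes to any brain_* tool contract.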

v1.2.0 Changelog — Myco Brain