Open Source AI Memory System

HA5H Crystal Clear Memory for AI Agents

Memories crystallize into position based on what they are. Six indexed dimensions. Zero API calls. Fully offline.

SimHash + FTS5 + sentence embeddings. Single SQLite file. Open source.


CRYSTAL MEMORY* SIX INDEXED FACETS* ZERO API CALLS* SINGLE SQLITE FILE* FULLY OFFLINE* OPEN SOURCE MIT*

How It Works

Write. Recall. Resume.

01

Crystallize

Feed a memory into the crystal. HA5H computes a 64-bit fingerprint, extracts entities, compresses it into an inclusion, and indexes it across six facets, in 3.8 ms. No LLM call. No API key. Your data stays local.
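The fingerprinting step can be sketched in a few lines. This is not ha5h's actual implementation (its tokenizer and hash function aren't shown here); it's a minimal SimHash assuming whitespace tokens and blake2b token hashes:

```python
import hashlib

def simhash64(text: str) -> int:
    """Minimal 64-bit SimHash sketch: hash each token, then let every
    token vote +1/-1 on each bit position; the sign of the vote sum
    becomes that bit of the fingerprint."""
    votes = [0] * 64
    for token in text.lower().split():
        h = int.from_bytes(
            hashlib.blake2b(token.encode(), digest_size=8).digest(), "big"
        )
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    fp = 0
    for bit in range(64):
        if votes[bit] > 0:
            fp |= 1 << bit
    return fp

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Because each token votes on every bit, texts that share most of their tokens tend to land within a few bits of each other, which is what makes band lookup on the fingerprint useful at recall time.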

02

Recall

Query the crystal and it lights up everything relevant. FTS5 keyword search and SimHash band lookup run in parallel, then hybrid scoring ranks the results: 9 ms at 5,000 memories. Every result is the original, unmodified text.
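The hybrid step can be sketched as a weighted blend of the two candidate sets. The weights (0.6 keyword / 0.4 fingerprint) and the score shapes are illustrative assumptions, not ha5h's published formula:

```python
def hybrid_rank(fts_hits, simhash_hits, k=10, w_kw=0.6, w_sim=0.4):
    """fts_hits: {memory_id: keyword score normalized to [0, 1]}.
    simhash_hits: {memory_id: Hamming distance, 0..64}.
    Blend both signals into one score; a memory found by only one
    path simply gets zero from the other."""
    scores = {}
    for mid, s in fts_hits.items():
        scores[mid] = scores.get(mid, 0.0) + w_kw * s
    for mid, dist in simhash_hits.items():
        # Convert distance to similarity: 0 bits apart -> 1.0
        scores[mid] = scores.get(mid, 0.0) + w_sim * (1 - dist / 64)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A memory that both paths agree on outranks one found by keywords or fingerprint alone, which is the point of running them in parallel.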

03

Wake up

One call generates ~139 tokens of startup context. Identity line plus your top critical memories. Paste into any agent's system prompt. The crystal remembers so the agent doesn't have to.
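A wake-up builder along these lines is easy to picture. The L0/L1 line formats below are modeled on the Quick Start output further down; the `top_n` cutoff and the `(salience, text)` tuples are assumptions for illustration:

```python
def wake_up(memories, top_n=5):
    """memories: list of (salience, text) for active memories.
    Emit an identity line plus the top-N memories by salience,
    ready to paste into an agent's system prompt."""
    ranked = sorted(memories, key=lambda m: m[0], reverse=True)
    lines = [f"[L0:IDENTITY] HA5H crystal | {len(ranked)} active memories"]
    for salience, text in ranked[:top_n]:
        lines.append(f"[L1:CRITICAL] {'★' * salience} {text}")
    return "\n".join(lines)
```

The budget stays small because only the identity line and the few highest-salience memories are emitted; everything else waits for an on-demand facet query.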

The Metaphor

A crystal grows itself.

Every memory you store gets a fingerprint: a 64-bit SimHash computed from the text itself. Similar memories produce similar fingerprints automatically. No one decides where anything goes. No taxonomy. No filing.

The crystal has six faces you can look through. Rotate it: see memories by content similarity. Rotate again: see them by when they were true. Again: by who was involved, by importance, by origin, by meaning. Same crystal, six views. Each facet is independently indexed, so any query finds its answer through whichever face catches the light first.

As memories accumulate, the crystal grows. Similar memories cluster in the lattice. Contradictions are detected and the older fact is retired. Growth rings mark temporal epochs: sprint boundaries, project phases, context compression events. The structure emerges from the data, never imposed on it.

The 5 in HA5H refers to five-fold quasicrystalline symmetry, the "impossible" structure Dan Shechtman discovered in 1982. The first five facets use SimHash fingerprinting and keyword indexing. The sixth facet adds semantic embeddings: 384-dimensional sentence vectors that catch what keywords can't. "Auth provider" finds "Clerk" because the model understands they refer to the same thing. Install with pip install ha5h[embeddings] or leave it off; the other five facets work on their own.

L0: Identity + manifest (~50 tokens)
L1: Top-5 salience inclusions (~90 tokens)
L2: Facet query results (on demand)

The Architecture

Six Indexed Facets

Every memory exists in six-dimensional facet space. Each facet is independently queryable.

FACET 01

Content

Semantic fingerprint of what was said

64-bit SimHash captures structural similarity without rewriting a single word. FTS5 handles keyword matches with prefix expansion. Together they find candidates in O(log n).
SimHash 64-bit + FTS5
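The keyword half of this facet can be sketched with Python's stdlib sqlite3 (it needs an SQLite build with FTS5 compiled in, which most CPython distributions ship). The schema below is illustrative, not ha5h's actual one:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# FTS5 virtual table; the prefix= option builds extra indexes so
# prefix-expansion queries like 'auth*' stay fast.
con.execute("CREATE VIRTUAL TABLE mem_fts USING fts5(body, prefix='2 3')")
con.execute("INSERT INTO mem_fts VALUES ('Decided to use Clerk for auth')")
con.execute("INSERT INTO mem_fts VALUES ('Sprint review moved to Friday')")

# MATCH with a trailing * expands the prefix; ORDER BY rank returns
# best matches first.
rows = con.execute(
    "SELECT body FROM mem_fts WHERE mem_fts MATCH 'auth*' ORDER BY rank"
).fetchall()
```

FTS5 gives the exact-word and prefix matches; the SimHash fingerprint covers near-duplicate text that shares structure but not exact keywords.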
FACET 02

Temporal

When it was true (validity windows)

Every memory has valid_from and optional valid_to. Invalidated memories stay in the crystal but are hidden from default queries. Growth rings mark epoch boundaries like geological strata.
B-tree range index
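A validity-window query can be sketched directly in SQL. The table and column names mirror the valid_from/valid_to fields described above, but the schema itself is an assumption:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE memories (
        id         INTEGER PRIMARY KEY,
        body       TEXT NOT NULL,
        valid_from TEXT NOT NULL,   -- ISO-8601 date
        valid_to   TEXT             -- NULL means "still true"
    )
""")
# B-tree index backing temporal range queries
con.execute("CREATE INDEX idx_validity ON memories(valid_from, valid_to)")
con.executemany(
    "INSERT INTO memories (body, valid_from, valid_to) VALUES (?, ?, ?)",
    [
        ("Using Firebase for auth", "2024-01-01", "2024-06-01"),  # retired
        ("Using Clerk for auth", "2024-06-01", None),             # current
    ],
)
# Default queries hide invalidated memories without deleting them
rows = con.execute("SELECT body FROM memories WHERE valid_to IS NULL").fetchall()
```

The retired Firebase memory stays in the table, so a history query over valid_from/valid_to can still surface it; it just drops out of the default view.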
FACET 03

Relational

Entity graph connecting memories

People, tools, projects, decisions extracted automatically. Lattice edges connect related memories. Walk the graph to discover connections. Contradictions are detected and the older fact is auto-retired.
Entity table + lazy lattice
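A lattice walk is essentially a bounded breadth-first traversal over the edge set. A sketch, assuming an in-memory adjacency map rather than ha5h's actual storage:

```python
from collections import deque

def lattice_walk(edges, start, max_hops=2):
    """edges: {memory_id: [connected memory_ids]}. Walk outward from
    `start` breadth-first, up to max_hops, returning (memory_id, hops)
    pairs in discovery order."""
    seen = {start}
    frontier = deque([(start, 0)])
    reached = []
    while frontier:
        node, hops = frontier.popleft()
        if hops:                      # don't report the start node itself
            reached.append((node, hops))
        if hops < max_hops:
            for nxt in edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, hops + 1))
    return reached
```

Bounding the hop count keeps the walk cheap even as the lattice grows: two hops is usually enough to surface "this decision touched that person who owns that project."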
FACET 04

Salience

Importance weight (1–5 stars)

Critical decisions float to the top. Wake-up context is built from the highest-salience memories first. Salience is auto-detected from content signals like "decided," "critical," and "must."
Numeric index
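Signal-based salience detection might look like this. The signal words come from the description above ("decided," "critical," "must"); the weights and base score are invented for illustration:

```python
# Hypothetical signal weights; ha5h's actual list and scoring differ.
SIGNALS = {"decided": 2, "critical": 2, "must": 1, "never": 1}

def auto_salience(text: str, base: int = 2) -> int:
    """Start from a neutral base score and bump it for each decision
    signal found in the text, clamped to the 1-5 star range."""
    lowered = text.lower()
    score = base + sum(w for sig, w in SIGNALS.items() if sig in lowered)
    return max(1, min(5, score))
```

A plain observation stays at the base score, while "critical" decisions climb toward five stars and therefore surface first in wake-up context.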
FACET 05

Context

Origin: session, project, trigger

Know where every memory came from. Filter by project, session, or capture method: manual entry, auto-save hook, conversation import, or mining.
Tag-based index
FACET 06 New

Semantic

Meaning-level similarity via embeddings

SimHash catches structural similarity. FTS5 catches keywords. Neither catches paraphrases. "Auth provider" won't match "Clerk" without understanding meaning. The sixth facet adds 384-dim sentence embeddings for true semantic recall. Runs on CPU, ~5ms per memory. Optional: install with pip install ha5h[embeddings]. Degrades gracefully when not installed.
sentence-transformers/all-MiniLM-L6-v2 + cosine similarity
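Semantic recall reduces to cosine similarity between sentence vectors. The sketch below uses toy 4-dimensional vectors as stand-ins for real 384-dimensional MiniLM embeddings, so the similarity pattern is contrived for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy stand-ins for real 384-dim sentence embeddings:
query  = [0.9, 0.1, 0.0, 0.2]   # "auth provider"
clerk  = [0.8, 0.2, 0.1, 0.3]   # "Clerk" -- close in meaning
budget = [0.0, 0.1, 0.9, 0.1]   # "quarterly budget" -- unrelated
```

In the real system the model, not hand-placed coordinates, puts "auth provider" and "Clerk" near each other; the ranking step is just this dot product over normalized vectors.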

Early Benchmarks

Performance

3.8 ms write*
9 ms recall*
~139 tokens wake-up

Zero API calls. Zero LLM rewrites. Fully offline. Single SQLite file.

*Speed numbers from local testing at 5,000 memories. Reproducible benchmark script included in the repo. Accuracy benchmarks (LongMemEval) are in progress. We'll publish those numbers when they're real, not before.

Install

Quick Start

$ pip install ha5h
$ ha5h init .
  Crystal initialized at .ha5h

$ ha5h crystallize "Decided to use Clerk for auth" -s 5
  Crystallized [a1b2c3d4] ★★★★★

$ ha5h recall "auth decision"
  Found 1 memory

$ ha5h wake-up
  [L0:IDENTITY] HA5H crystal | 1 active memories
  [L1:CRITICAL] ★★★★★ Decided to use Clerk for auth

Claude Code

MCP Integration

claude mcp add ha5h -- python -m ha5h.mcp.server
ha5h_crystallize    Store a new memory
ha5h_recall         Search across all 6 facets
ha5h_invalidate     Mark a memory as no longer valid
ha5h_lattice_walk   Traverse memory connections
ha5h_wake_up        Generate startup context
ha5h_stats          Crystal statistics