hiperbrain

hiperbrain documentation

A brain that computes with 10,000-dimensional vectors

Almost everything labelled "AI" today is one of two things: a giant neural network that is powerful but an opaque black box, or old-school symbolic logic that is transparent but brittle. hiperbrain is built on a long-overlooked third path - Hyperdimensional Computing (HDC), also called Vector Symbolic Architectures. It is transparent like logic, robust like a neural net, and it does something neither of them does gracefully: it thinks in pure algebra.

Every concept becomes a vector of 10,000 numbers. Learning a fact is one addition. Answering a question - even an analogy it was never explicitly taught - is a multiplication. Destroy a third of the numbers and it still remembers. And it all runs on a normal CPU, right in this browser.

The last decade of AI has been a story of two extremes. On one side, deep neural networks - astonishingly capable, but their knowledge is buried in billions of floating-point weights that no human can read. On the other, symbolic systems - rules and logic you can inspect line by line, but rigid and brittle the moment reality does not match the rule book.

Hyperdimensional Computing is the rarely-travelled middle road. It keeps the inspectability of symbols - every concept is a named vector you can print out - while gaining the noise-tolerance and graceful degradation of neural tissue. Crucially, it replaces training with algebra: to know something, you add a vector; to ask something, you multiply. There is no loss function, no backpropagation, no epochs.

This page is the long-form explanation of how that works, section by section. Every claim here maps to code you can read in packages/core/src/. Use the menu on the left to jump around, and expand only the parts you care about.

The single most counter-intuitive idea in HDC is that bigger is simpler. We represent each concept with a vector of D = 10,000 dimensions. Why so many? Because of the blessing of dimensionality.

In very high-dimensional space, two vectors picked at random are almost always near-orthogonal - their cosine similarity clusters tightly around zero. With 10,000 bipolar dimensions, the chance that two random concepts look alike by accident is astronomically small (the similarity of independent random vectors has a standard deviation of about 1/√D ≈ 0.01). In practice this means there is effectively unlimited room for distinct concepts that never collide.

  • Capacity: millions of distinct symbols can coexist before any two become confusable.
  • Robustness: meaning is spread across all dimensions, so losing some of them barely moves the vector.
  • Stability: small amounts of noise stay small - the geometry does not amplify errors the way low-dimensional spaces do.

Randomness is not a bug here - it is the entire source of capacity. We do not hand-design vectors; we let high-dimensional geometry do the work.
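The near-orthogonality claim is easy to check numerically. The following is a minimal sketch, not the @hiperbrain/core implementation: the helper names (makeRng, randomHypervector, cosine) are made up for illustration, and the seeded generator (mulberry32) is used only so that the run is repeatable.

```typescript
const D = 10_000;

// Small seeded PRNG (mulberry32) so the experiment is repeatable.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// A random bipolar hypervector: every component is -1 or +1.
function randomHypervector(rng: () => number, d: number = D): Int8Array {
  const v = new Int8Array(d);
  for (let i = 0; i < d; i++) v[i] = rng() < 0.5 ? -1 : 1;
  return v;
}

// For ±1 vectors both norms are sqrt(d), so cosine is simply dot / d.
function cosine(a: Int8Array, b: Int8Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot / a.length;
}

// Measure the similarity of 200 independent random pairs.
const rng = makeRng(42);
const sims: number[] = [];
for (let k = 0; k < 200; k++) {
  sims.push(cosine(randomHypervector(rng), randomHypervector(rng)));
}
const mean = sims.reduce((s, x) => s + x, 0) / sims.length;
const std = Math.sqrt(sims.reduce((s, x) => s + (x - mean) ** 2, 0) / sims.length);

// mean should sit near 0 and std near 1 / sqrt(D) = 0.01
console.log({ mean, std, expectedStd: 1 / Math.sqrt(D) });
```

A similarity of 0.05 between two unrelated concepts would already be a five-sigma event, which is why accidental collisions are so rare in practice.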

An atomic concept - a word, a symbol, a feature - is represented by a hypervector whose components are each either -1 or +1. This is the bipolar representation, and it is chosen deliberately:

  • Element-wise multiplication of two ±1 vectors is again a ±1 vector - the algebra stays closed and cheap.
  • A bipolar vector is its own inverse under multiplication: x ⊗ x is the all-ones vector, the identity for binding, so (a ⊗ x) ⊗ x = a. This is what makes binding reversible.
  • Similarity is just a normalised dot product, computable with integer math - no floating point required for the core.

A single 10,000-dimensional bipolar vector packs into about 1.25 KB if stored as bits. The whole "mind" is a handful of these vectors plus a dictionary mapping names to them - small enough to ship to every browser.
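The 1.25 KB figure is one bit per component: 10,000 / 8 = 1,250 bytes. As a rough sketch of what such packing could look like (packBits and unpackBits are hypothetical names, and nothing here is claimed about how @hiperbrain/core stores vectors internally):

```typescript
// Pack a bipolar vector into bits: +1 becomes 1, -1 becomes 0.
function packBits(v: Int8Array): Uint8Array {
  const bytes = new Uint8Array(Math.ceil(v.length / 8));
  for (let i = 0; i < v.length; i++) {
    if (v[i] === 1) bytes[i >> 3] |= 1 << (i & 7);
  }
  return bytes;
}

// Inverse of packBits; `d` is the original number of components.
function unpackBits(bytes: Uint8Array, d: number): Int8Array {
  const v = new Int8Array(d);
  for (let i = 0; i < d; i++) {
    v[i] = (bytes[i >> 3] >> (i & 7)) & 1 ? 1 : -1;
  }
  return v;
}

// An arbitrary ±1 pattern standing in for a concept vector.
const sample = new Int8Array(10_000).map((_, i) => (i % 3 === 0 ? 1 : -1));
const packed = packBits(sample);
const restored = unpackBits(packed, sample.length);

console.log(packed.length); // 1250 bytes, i.e. 1.25 KB
console.log(restored.every((x, i) => x === sample[i])); // true
```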

The whole system rests on three operations. That is the entire instruction set of this brain:

  • bind (a ⊗ b) - element-wise multiply. The result is dissimilar to both inputs and is used to associate things, like a role with a value: capital ⊗ Tokyo. Binding is its own inverse, so you can "unbind" later to ask a question.
  • bundle ([a + b + c]) - an element-wise majority vote. It superimposes many vectors into one that stays similar to each input. This is how sets, records and memories are formed - many ideas living inside a single vector at once.
  • permute (ρ a) - a cyclic shift of the components. The shifted vector is dissimilar to the original, yet the shift loses no information and is undone by shifting back, which lets us encode order, position and time.
Operation   Output similar to input?      Reversible?
bind        No - becomes dissimilar       Yes - bind again to undo
bundle      Yes - stays similar to all    No - lossy superposition
permute     No - becomes dissimilar       Yes - shift back

Bind, bundle, permute. From these three primitives - and nothing else - you can build records, sequences, graphs, and the reasoning below.
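To make the table concrete, here is a minimal sketch of the three operations on Int8Array vectors, followed by a check of each row. It is an illustration under stated assumptions rather than the package's code: the function shapes are invented for this example, and breaking majority-vote ties toward +1 is a simplification (ties only arise with an even number of inputs).

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(1);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));

// For ±1 vectors both norms are sqrt(D), so cosine is dot / D.
const cosine = (a: Hv, b: Hv): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0) / a.length;

// bind: element-wise multiply; binding again with the same vector undoes it.
const bind = (a: Hv, b: Hv): Hv => a.map((x, i) => x * b[i]);

// bundle: element-wise majority vote (sign of the column sum).
// Ties are broken toward +1, a simplification made for this sketch.
function bundle(vectors: Hv[]): Hv {
  return vectors[0].map((_, i) => {
    let sum = 0;
    for (let k = 0; k < vectors.length; k++) sum += vectors[k][i];
    return sum >= 0 ? 1 : -1;
  });
}

// permute: cyclic shift by `shift` positions; a negative shift undoes it.
function permute(a: Hv, shift: number = 1): Hv {
  const d = a.length;
  return a.map((_, i) => a[(((i - shift) % d) + d) % d]);
}

const a = randomHv();
const b = randomHv();
const c = randomHv();

console.log(cosine(bind(a, b), a));              // near 0: binding hides its inputs
console.log(cosine(bind(bind(a, b), b), a));     // 1: binding again undoes it
console.log(cosine(bundle([a, b, c]), a));       // about 0.5: the bundle resembles each member
console.log(cosine(permute(a), a));              // near 0: shifting scrambles alignment
console.log(cosine(permute(permute(a), -1), a)); // 1: shifting back restores it
```

The 0.5 for a three-way bundle follows from the majority vote: a component of the bundle disagrees with a only when both b and c disagree with it, which happens a quarter of the time.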

Combining many vectors produces a result that is only approximately equal to the clean originals. An item memory (also called cleanup memory) stores every known atomic vector and, given a noisy query, returns the closest one by cosine similarity.

This is the brain's moment of recognition: it takes a smeared, in-between vector that came out of some computation and snaps it back to the nearest real concept. Without cleanup, HDC would drift into noise after a few operations; with it, you can chain operations and still land on crisp, discrete answers.

query  = noisy vector (~70% correct)
cleanup(query) = argmax over all known
                 symbols of cosine(query, symbol)
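In TypeScript the same idea could look roughly like this. The names (remember, cleanup) and the flat-array storage are assumptions made for the sketch; the noisy query is produced by flipping each component with probability 0.3, so about 70% of components still match.

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(2);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));
const cosine = (a: Hv, b: Hv): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0) / a.length;

// Item memory: every known atomic vector, stored next to its label.
const labels: string[] = [];
const vectors: Hv[] = [];
function remember(label: string): Hv {
  const v = randomHv();
  labels.push(label);
  vectors.push(v);
  return v;
}

// cleanup: return the stored symbol with the highest cosine similarity.
function cleanup(query: Hv): { label: string; score: number } {
  let best = 0;
  let bestScore = -Infinity;
  for (let k = 0; k < vectors.length; k++) {
    const score = cosine(query, vectors[k]);
    if (score > bestScore) {
      best = k;
      bestScore = score;
    }
  }
  return { label: labels[best], score: bestScore };
}

const france = remember("France");
["Japan", "Mexico", "USA", "Paris", "Tokyo"].forEach((l) => remember(l));

// Noisy query: flip each component of France with probability 0.3.
const noisy = france.map((x) => (rng() < 0.3 ? -x : x));
const result = cleanup(noisy);
console.log(result.label, result.score); // "France", with a score close to 0.4
```

A linear scan over all symbols is the simplest possible lookup; it is enough for a few thousand symbols, and anything faster would be an optimisation on top of the same argmax.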

The three operations compose into rich data structures - all living inside single fixed-size vectors:

  • Records (key-value): bind each field to its value and bundle them. [ name ⊗ Ada + born ⊗ 1815 ]. To read a field, unbind by its key and clean up: cleanup(record ⊗ name) -> Ada.
  • Sequences (order): permute each element by its position before bundling. [ ρ⁰a + ρ¹b + ρ²c ]. Permutation makes position matter, so abc and cba become different vectors.
  • Sets: just bundle the members. Order-independent, membership testable by similarity.
  • Graphs: bind node pairs to edge-type vectors and bundle them into one vector that represents an entire labelled graph.

Everything - a fact, a record, an ordered story, a knowledge graph - ends up as one vector of the same length. That uniformity is what makes the algebra so composable.
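The record and sequence encodings above can be sketched in a few lines. This is an illustrative reconstruction, not the package's Brain facade: sym, cleanup and the helper shapes are assumptions, and the third record field (field ⊗ Mathematics) is a made-up addition so that the majority vote has an odd number of inputs and never ties.

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(3);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));
const cosine = (a: Hv, b: Hv): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0) / a.length;
const bind = (a: Hv, b: Hv): Hv => a.map((x, i) => x * b[i]);
const bundle = (vs: Hv[]): Hv =>
  vs[0].map((_, i) => (vs.reduce((sum, v) => sum + v[i], 0) >= 0 ? 1 : -1));
const permute = (a: Hv, shift: number): Hv =>
  a.map((_, i) => a[(((i - shift) % a.length) + a.length) % a.length]);

// Item memory: look up a symbol's vector, creating a random one on first use.
const items = new Map<string, Hv>();
function sym(label: string): Hv {
  let v = items.get(label);
  if (!v) {
    v = randomHv();
    items.set(label, v);
  }
  return v;
}
function cleanup(query: Hv): string {
  let best = "";
  let bestScore = -Infinity;
  items.forEach((v, label) => {
    const score = cosine(query, v);
    if (score > bestScore) {
      best = label;
      bestScore = score;
    }
  });
  return best;
}

// Record: bind each role to its filler, then bundle the pairs.
const ada = bundle([
  bind(sym("name"), sym("Ada")),
  bind(sym("born"), sym("1815")),
  bind(sym("field"), sym("Mathematics")), // hypothetical third field
]);
console.log(cleanup(bind(ada, sym("name")))); // "Ada"
console.log(cleanup(bind(ada, sym("born")))); // "1815"

// Sequence: permute each element by its position before bundling.
const abc = bundle([sym("a"), permute(sym("b"), 1), permute(sym("c"), 2)]);
const cba = bundle([sym("c"), permute(sym("b"), 1), permute(sym("a"), 2)]);
console.log(cleanup(permute(abc, -1))); // "b": undo one shift to read position 1
console.log(cosine(abc, cba)); // about 0.25: only the middle element lines up
```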

Here is where it stops feeling like a database and starts feeling like a mind. Encode each country as a record: a bundle of bound role/value pairs.

USA    = [ capital ⊗ Washington
         + currency ⊗ Dollar
         + language ⊗ English ]

Mexico = [ capital ⊗ MexicoCity
         + currency ⊗ Peso
         + language ⊗ Spanish ]

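Now bind the two whole records together. Expanding the product, each pair of terms with the same role reduces to a pair of fillers, (r ⊗ a) ⊗ (r ⊗ b) = a ⊗ b, because r ⊗ r is the all-ones vector; pairs with different roles stay near-random noise. What remains is approximately [ Washington ⊗ MexicoCity + Dollar ⊗ Peso + English ⊗ Spanish ], a single vector that swaps each country's fillers for the other's. Apply it to Dollar and clean up the result: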

cleanup( Dollar ⊗ (USA ⊗ Mexico) )  ->  Peso

Nobody told the brain that "Dollar relates to Peso." It was never taught that pairing. The answer emerges from the algebra itself. This is live on the home page - just type USA is to Dollar as Mexico is to ? and watch it reason.
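The whole analogy fits in a short script. What follows is a sketch under assumptions (random vectors from a seeded generator, majority-vote bundling, invented helper names such as sym and encodeRecord); it is meant to show the algebra, not to mirror how KnowledgeBrain.analogy is implemented.

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(4);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));
const cosine = (a: Hv, b: Hv): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0) / a.length;
const bind = (a: Hv, b: Hv): Hv => a.map((x, i) => x * b[i]);
const bundle = (vs: Hv[]): Hv =>
  vs[0].map((_, i) => (vs.reduce((sum, v) => sum + v[i], 0) >= 0 ? 1 : -1));

// Item memory: look up a symbol's vector, creating a random one on first use.
const items = new Map<string, Hv>();
function sym(label: string): Hv {
  let v = items.get(label);
  if (!v) {
    v = randomHv();
    items.set(label, v);
  }
  return v;
}
function cleanup(query: Hv): string {
  let best = "";
  let bestScore = -Infinity;
  items.forEach((v, label) => {
    const score = cosine(query, v);
    if (score > bestScore) {
      best = label;
      bestScore = score;
    }
  });
  return best;
}

// A record is a bundle of role ⊗ filler pairs.
const encodeRecord = (pairs: [string, string][]): Hv =>
  bundle(pairs.map(([role, filler]) => bind(sym(role), sym(filler))));

const usa = encodeRecord([
  ["capital", "Washington"],
  ["currency", "Dollar"],
  ["language", "English"],
]);
const mexico = encodeRecord([
  ["capital", "MexicoCity"],
  ["currency", "Peso"],
  ["language", "Spanish"],
]);

// Binding the records cancels the shared roles, leaving roughly
// Washington⊗MexicoCity + Dollar⊗Peso + English⊗Spanish plus noise.
const mapping = bind(usa, mexico);

console.log(cleanup(bind(sym("Dollar"), mapping))); // "Peso"
console.log(cleanup(bind(sym("Peso"), mapping)));   // "Dollar"
```

Because binding is its own inverse, the same mapping vector works in both directions, which is why Peso maps back to Dollar without any extra work.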

Representations are distributed and holographic: every concept is smeared across all 10,000 dimensions, so no single component is essential - exactly like a hologram, where each fragment still contains the whole image, just blurrier. Damage a large chunk of the vector and the memory still resolves correctly.

This is the graceful degradation seen in biological neural tissue, and it is nothing like a normal computer, where flipping a few bits corrupts the data entirely. Don't take our word for it - pick a concept and destroy up to half of its bits; the brain still recognises it:

Example readout (30% of bits flipped):
  recovered concept: France
  similarity to original: 0.400
  cleanup score: 0.400 (next best: Japan, 0.023)

Even with a third of its 10,000 dimensions destroyed, the vector still points unmistakably at the right concept. No single bit matters - meaning is smeared across all of them.
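The 0.400 score is not a coincidence: flipping 3,000 of 10,000 components leaves 7,000 that agree and 3,000 that disagree, so the similarity is (7,000 - 3,000) / 10,000 = 0.4. A minimal sketch that reproduces this, assuming random concept vectors and a hypothetical flipBits helper:

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(5);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));
const cosine = (a: Hv, b: Hv): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0) / a.length;

// Flip exactly `count` distinct components (partial Fisher-Yates shuffle).
function flipBits(v: Hv, count: number): Hv {
  const out = v.slice();
  const idx = Array.from({ length: v.length }, (_, i) => i);
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(rng() * (idx.length - i));
    [idx[i], idx[j]] = [idx[j], idx[i]];
    out[idx[i]] = -out[idx[i]];
  }
  return out;
}

const concepts = ["France", "Japan", "Mexico", "USA"];
const vectors = concepts.map(() => randomHv());
const corrupted = flipBits(vectors[0], 3000); // 30% of France's components

const scores = vectors.map((v) => cosine(corrupted, v));
console.log(scores[0]); // 0.4
console.log(concepts[scores.indexOf(Math.max(...scores))]); // "France"
```

The unrelated concepts stay near zero, so a gap of roughly 0.4 against 0.01 separates the right answer from the runner-up.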

There is no training loop, no gradient descent, no model to download. "Learning" a fact is literally one bundling step - a single addition into a memory vector. The brain knows it the instant you teach it.

This is closer to how a person remembers a new name on hearing it once than to how a neural network is trained over millions of examples. It also means there is no catastrophic forgetting in the usual sense: adding a new fact never overwrites an unrelated one, it just superimposes another faint layer onto the relevant memory.
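One way to picture one-shot learning is a memory vector that simply accumulates bound pairs. The sketch below is an assumption-laden illustration: it keeps the memory as an unthresholded integer sum (the text above describes bundling as a majority vote), handles a single relation, and uses invented names (learn, ask, sym).

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(6);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));

// General cosine, since the memory holds integer sums rather than ±1 values.
function cosine(a: ArrayLike<number>, b: ArrayLike<number>): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / Math.sqrt(na * nb);
}

const items = new Map<string, Hv>();
function sym(label: string): Hv {
  let v = items.get(label);
  if (!v) {
    v = randomHv();
    items.set(label, v);
  }
  return v;
}

// One memory vector for the "capital" relation, kept as integer sums.
const capitalMemory = new Int32Array(D);

// Learning is a single pass of additions: memory += subject ⊗ object.
function learn(subject: string, object: string): void {
  const s = sym(subject);
  const o = sym(object);
  for (let i = 0; i < D; i++) capitalMemory[i] += s[i] * o[i];
}

// Asking unbinds by the subject and returns the closest known symbol.
function ask(subject: string): { label: string; score: number } {
  const s = sym(subject);
  const probe = capitalMemory.map((m, i) => m * s[i]);
  let label = "";
  let score = -Infinity;
  items.forEach((v, candidate) => {
    const c = cosine(probe, v);
    if (c > score) {
      label = candidate;
      score = c;
    }
  });
  return { label, score };
}

learn("Japan", "Tokyo");
console.log(ask("Japan").label); // "Tokyo", known immediately after one step
learn("France", "Paris");
learn("Mexico", "MexicoCity");
console.log(ask("Japan").label);  // still "Tokyo": later facts did not overwrite it
console.log(ask("France").label); // "Paris"
```

Each new fact lowers the score of the older ones a little (three facts give roughly 0.58 instead of 1), which is the "faint layer" effect described above.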

Because bundling is commutative and order-independent, contributions from thousands of people fold into the same memory without any coordination. Every fact a visitor teaches is stored as a tiny text triple (subject, relation, object) and shared with everyone.

Each browser rebuilds the brain locally from those triples and streams new ones live as other people teach them. The vector math is identical for all - only the knowledge is shared, and it grows every minute. The brain you interact with is, quite literally, the sum of everyone who came before you.
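The order-independence is easy to demonstrate. In the sketch below two "browsers" receive the same triples in opposite order and end up with identical memories. Several details are assumptions made for illustration: deriving each name's vector from a string hash (hashName, vectorFor), keeping one integer memory per relation, and the rebuild function itself; the text only states that vectors are deterministic, not how they are derived.

```typescript
type Hv = Int8Array;
type Triple = { subject: string; relation: string; object: string };
const D = 10_000;

// Small seeded PRNG (mulberry32).
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// FNV-1a-style string hash, used here only to turn a name into a seed.
function hashName(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Deterministic vector for a name: same name, same vector, on every device.
function vectorFor(label: string): Hv {
  const rng = makeRng(hashName(label));
  return new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));
}

// Rebuild one integer memory per relation from a list of text triples.
function rebuild(triples: Triple[]): Map<string, Int32Array> {
  const memories = new Map<string, Int32Array>();
  for (let k = 0; k < triples.length; k++) {
    const { subject, relation, object } = triples[k];
    const s = vectorFor(subject);
    const o = vectorFor(object);
    let mem = memories.get(relation);
    if (!mem) {
      mem = new Int32Array(D);
      memories.set(relation, mem);
    }
    for (let i = 0; i < D; i++) mem[i] += s[i] * o[i];
  }
  return memories;
}

const facts: Triple[] = [
  { subject: "France", relation: "capital", object: "Paris" },
  { subject: "Japan", relation: "capital", object: "Tokyo" },
  { subject: "France", relation: "currency", object: "Euro" },
  { subject: "Japan", relation: "currency", object: "Yen" },
];

const browserA = rebuild(facts);
const browserB = rebuild([...facts].reverse());

const same = ["capital", "currency"].every((r) => {
  const x = browserA.get(r)!;
  const y = browserB.get(r)!;
  return x.every((value, i) => value === y[i]);
});
console.log(same); // true: arrival order does not change the memory
```

Because only text triples travel over the network, the vectors never need to be serialised or kept in sync; every client can regenerate them from names.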

Each relation keeps one bundled memory vector, and a bundle can only hold so much before recall gets fuzzy. So as a relation fills up, answers degrade gradually rather than failing all at once - the same capacity-limited, lossy behaviour as biological associative memory.

To keep recall sharp, hiperbrain buckets knowledge by relation: all capital-of facts share one vector, all currency-of facts another, and so on. Each memory therefore only ever competes with facts of the same kind, which keeps similarity scores clean and answers confident far longer than a single global bundle would allow.
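A small experiment shows why bucketing helps. If a memory is an integer sum of N bound pairs (an assumption of this sketch, as is the recallScore helper), the score of the correct answer falls roughly as 1/√N while the noise floor stays near 1/√D = 0.01, so fewer facts per bundle means a wider margin.

```typescript
type Hv = Int8Array;
const D = 10_000;

// Small seeded PRNG (mulberry32) so every run gives the same vectors.
function makeRng(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rng = makeRng(7);
const randomHv = (): Hv => new Int8Array(D).map(() => (rng() < 0.5 ? -1 : 1));

// Cosine score of the correct answer when `n` facts share one memory.
function recallScore(n: number): number {
  const memory = new Int32Array(D);
  const firstSubject = randomHv();
  const firstObject = randomHv();
  for (let k = 0; k < n; k++) {
    const s = k === 0 ? firstSubject : randomHv();
    const o = k === 0 ? firstObject : randomHv();
    for (let i = 0; i < D; i++) memory[i] += s[i] * o[i];
  }
  // Unbind the first subject and compare the result with its true object.
  let dot = 0;
  let norm = 0;
  for (let i = 0; i < D; i++) {
    const probe = memory[i] * firstSubject[i];
    dot += probe * firstObject[i];
    norm += probe * probe;
  }
  return dot / Math.sqrt(norm * D);
}

// Measured score next to the 1/sqrt(n) prediction.
[1, 9, 100].forEach((n) => {
  console.log(n, recallScore(n).toFixed(3), (1 / Math.sqrt(n)).toFixed(3));
});
```

With 100 facts in one bundle the correct answer still scores about ten times the noise floor; splitting the same facts across several relations keeps each bundle smaller and the margin correspondingly larger.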

The architecture is deliberately split so the thinking happens on your device and the server only ever stores plain text:

  • Client (your browser): all hypervector math - bind, bundle, permute, cleanup, analogy - runs here in TypeScript. The brain is rebuilt from the fact list every time it changes.
  • Server (Supabase / Postgres): stores the shared facts as text triples and broadcasts new ones over a realtime channel. It never stores or computes vectors.
  • Live sync: when anyone teaches a fact, it is written to Postgres and pushed to every connected browser, which folds it into its local brain instantly.
  • Moderation: every write passes server-side input validation, a content blocklist, per-IP rate limiting, duplicate detection and a global capacity cap before it is accepted.

Reads are cached briefly and paginated, so the shared brain can grow well beyond a single database page without the client ever noticing.

You drive the brain with one input box and a tiny, predictable grammar. There are three things you can say:

  • Teach - state a fact: Tokyo is the capital of Japan or capital of Japan is Tokyo.
  • Ask - query a relation: capital of Japan or what is the capital of Japan.
  • Analogy - reason across records: Japan is to Tokyo as France is to ?.

The parser is intentionally simple and transparent - it lives in lib/parse-command.ts - so you always know exactly how your words become vectors. No hidden interpretation, no language model guessing your intent.
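To give a feel for how small such a grammar can be, here is a hypothetical parser for the three sentence shapes listed above. It is not the code in lib/parse-command.ts; the ParsedCommand type, the regular expressions and the order in which they are tried are all assumptions of this sketch.

```typescript
type ParsedCommand =
  | { kind: "teach"; subject: string; relation: string; object: string }
  | { kind: "ask"; subject: string; relation: string }
  | { kind: "analogy"; a: string; b: string; c: string }
  | { kind: "unknown"; input: string };

function parseCommand(raw: string): ParsedCommand {
  // Normalise whitespace so the patterns can assume single spaces.
  const input = raw.trim().replace(/\s+/g, " ");
  let m: RegExpMatchArray | null;

  // Analogy: "Japan is to Tokyo as France is to ?"
  if ((m = input.match(/^(.+?) is to (.+?) as (.+?) is to \?$/i))) {
    return { kind: "analogy", a: m[1], b: m[2], c: m[3] };
  }
  // Ask: "what is the capital of Japan" (checked before the teach forms,
  // otherwise "what" would be read as the object of a fact)
  if ((m = input.match(/^what is (?:the )?(.+?) of (.+?)\??$/i))) {
    return { kind: "ask", subject: m[2], relation: m[1] };
  }
  // Teach: "capital of Japan is Tokyo"
  if ((m = input.match(/^(?:the )?(.+?) of (.+?) is (.+)$/i))) {
    return { kind: "teach", subject: m[2], relation: m[1], object: m[3] };
  }
  // Teach: "Tokyo is the capital of Japan"
  if ((m = input.match(/^(.+?) is (?:the )?(.+?) of (.+)$/i))) {
    return { kind: "teach", subject: m[3], relation: m[2], object: m[1] };
  }
  // Ask: "capital of Japan"
  if ((m = input.match(/^(?:the )?(.+?) of (.+)$/i))) {
    return { kind: "ask", subject: m[2], relation: m[1] };
  }
  return { kind: "unknown", input };
}

console.log(parseCommand("Tokyo is the capital of Japan"));
console.log(parseCommand("what is the capital of Japan"));
console.log(parseCommand("Japan is to Tokyo as France is to ?"));
```

Both teach forms produce the same (subject, relation, object) triple, so the rest of the pipeline never needs to know which phrasing was typed.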

HDC has decades of serious research behind it, but it has lived in academic papers and embedded-hardware labs - not as something the public can touch. Compare the paradigm to mainstream "AI":

Large language models
  • Black box - billions of opaque weights
  • Training costs millions; updates are slow
  • Runs on GPU clusters, per-call API cost
  • Hallucinates; reasoning is implicit
  • Knowledge frozen at training time

hiperbrain (HDC)
  • Transparent - read the whole engine yourself
  • Learns a new fact in one step, instantly
  • Runs on a plain CPU, in your browser, free
  • Reasoning is explicit, inspectable algebra
  • Knowledge grows live as people teach it

This is not magic and not an LLM. It does not understand language: it does not know that "capital" and "main city" mean the same thing, and it cannot chain steps on its own ("the capital of the country whose currency is the Yen").

It knows what it has been taught, and reasons over that with vectors. Recall is probabilistic, so a heavily loaded relation can occasionally return a near-miss. That honesty is the point - everything it does, you can see, measure and verify.

The foundations come from Pentti Kanerva's work on Hyperdimensional Computing and Sparse Distributed Memory, Tony Plate's Holographic Reduced Representations, and later energy-efficient HDC classifiers from Rahimi, Rabaey and others.

The implementation here lives in packages/core/src/ - published as @hiperbrain/core on npm - and is small enough to read end to end, with no hidden weights and no surprises.


Is this a large language model?

No. There is no neural network and no training. Concepts are random ±1 vectors, and answers come from explicit vector algebra you can inspect.

Does it use my GPU or call an API?

Neither. All reasoning is vector math over small integers (only the final similarity score is a fraction), and it runs on your CPU in the browser. The server only stores and syncs text facts.

What happens to facts I teach?

They are stored as a plain (subject, relation, object) triple, moderated, and shared with everyone so the collective brain grows.

Why does it sometimes get an answer slightly wrong?

Recall is similarity-based. When a relation holds a lot of facts, the bundled memory gets crowded and the nearest match can be a near-miss - graceful degradation, by design.

Can I read the source?

Yes - the entire engine is a few small files in packages/core/src/, published as @hiperbrain/core on npm, and the project is open on GitHub.

Hypervector
A vector with thousands of dimensions (here 10,000), each component -1 or +1, used to represent one concept.
Bind (⊗)
Element-wise multiplication; associates two vectors into a new, dissimilar one. Its own inverse.
Bundle (+)
Element-wise majority vote; superimposes vectors into one that stays similar to all of them.
Permute (ρ)
A cyclic shift of components; encodes order and position.
Item / cleanup memory
A dictionary of known vectors that snaps a noisy query back to the nearest real concept.
Cosine similarity
A normalised dot product measuring how aligned two vectors are, from -1 to +1.
Blessing of dimensionality
The fact that random high-dimensional vectors are almost always near-orthogonal, giving enormous capacity.
Holographic representation
Information spread across all dimensions, so any fragment still contains (a blurrier version of) the whole.

The engine that powers this whole site is published as a tiny, dependency-free npm package - @hiperbrain/core. It is the exact same code that runs in the page you are reading: nothing is held back. Drop it into your own project and you get one-shot learning, analogy and fault-tolerant recall in a few kilobytes of math that runs in the browser, Node, Deno, Bun or at the edge.

npm install @hiperbrain/core

Build your own collective memory in a few lines:

import { KnowledgeBrain } from "@hiperbrain/core";

const brain = new KnowledgeBrain();
brain.learn({ subject: "France", relation: "capital",  object: "Paris" });
brain.learn({ subject: "France", relation: "currency", object: "Euro" });
brain.learn({ subject: "Japan",  relation: "currency", object: "Yen" });

brain.ask("France", "capital");          // -> [{ name: "Paris", ... }]
brain.analogy("Yen", "Japan", "France"); // -> [{ name: "Euro", ... }]

Or reach for the raw primitives - bind, bundle, permute, cosineSimilarity - and the Brain facade for records, text classification and sequence memory. Everything is deterministic: the same input always yields the same vector, so results are reproducible and testable.
