CRBRL — Documentation · Compression-native vector database

Getting started

Introduction

CRBRL is a compression-native, disk-persistent vector database for AI memory. Embeddings are stored compressed from the moment they arrive — roughly 8× fewer bytes per vector — and queries run directly against the compressed data, so there is no decompress step on the hot path.

A 1,536-dimension float32 vector is 6,144 bytes at full precision (1,536 × 4 bytes). CRBRL stores it in about 768 bytes. At 100 million vectors of 1,536 dimensions, that is roughly 614 GB versus 77 GB on disk, and the same ratio carries through to every copy and backup.

Retrieval results stay close to full precision: about 0.98 cosine fidelity. There is no model to retrain and no separate index to rebuild; compression applies from the first record you load. CRBRL is built around peer-reviewed mathematics: TurboQuant, arXiv:2504.19874.

Why it matters

Storage is typically 55–80% of a vector-database bill. CRBRL reduces the data itself rather than moving it to slower tiers, so the largest line in the bill comes down without trading retrieval quality.
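
As a rough worked example of the claim above, the sketch below estimates what happens to the total bill when only the storage line shrinks. It assumes storage cost scales linearly with bytes stored and that every other cost is unchanged; both are simplifying assumptions, and the function name is illustrative.

```python
def bill_after_compression(storage_share: float, reduction: float = 8.0) -> float:
    """Return the new bill as a fraction of the old one.

    storage_share: fraction of the current bill spent on storage (0 to 1).
    reduction: factor by which stored bytes shrink (8.0 per the figures above).
    Assumes storage cost is proportional to bytes and other costs do not change.
    """
    if not 0.0 <= storage_share <= 1.0:
        raise ValueError("storage_share must be between 0 and 1")
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    return (1.0 - storage_share) + storage_share / reduction


# The two ends of the 55-80% range quoted above.
for share in (0.55, 0.80):
    remaining = bill_after_compression(share)
    print(f"storage share {share:.0%}: bill falls to {remaining:.1%} of original")
```

Real pricing is rarely this linear, so treat the output as an order-of-magnitude check rather than a forecast.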

What you get

Compressed-domain search — semantic, full-text, and hybrid.
Two selectable codecs: TurboQuant and RaBitQ.
Hot / warm / cold tiering across memory and disk.
A Chroma-compatible API and a Postgres extension, crbrl-pg.
Provider-neutral embeddings — works with 9 embedding models.
Auth, audit logs, RBAC, multi-tenancy, and snapshots.

Install

CRBRL runs in two forms. Run it standalone as a server, or add the crbrl-pg extension to an existing PostgreSQL database when you want vectors to live alongside relational data.

Standalone

The standalone server exposes a Chroma-compatible HTTP API, so existing Chroma clients can connect with minimal changes.

shell · illustrative

# Run the CRBRL server (illustrative)
crbrl serve --data-dir ./crbrl-data --port 8443

# Or point an existing Chroma client at the CRBRL endpoint
export CRBRL_URL=https://localhost:8443
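
A client usually wants the endpoint as separate host, port, and TLS values instead of one URL. The sketch below splits `CRBRL_URL` using only the Python standard library. The helper name `endpoint_parts` is hypothetical, and whether a particular Chroma client version connects to CRBRL unchanged is something to verify in your own environment.

```python
import os
from urllib.parse import urlparse


def endpoint_parts(url: str) -> dict:
    """Split an endpoint URL into host, port, and a TLS flag."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"not an http(s) endpoint: {url!r}")
    use_tls = parsed.scheme == "https"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if use_tls else 80)
    return {"host": parsed.hostname, "port": port, "ssl": use_tls}


parts = endpoint_parts(os.environ.get("CRBRL_URL", "https://localhost:8443"))
print(parts)

# With the chromadb package installed, these values map onto its HTTP client:
#   import chromadb
#   client = chromadb.HttpClient(host=parts["host"], port=parts["port"], ssl=parts["ssl"])
```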

Postgres extension (crbrl-pg)

Install the extension into a database and enable it once. Vectors are then stored compressed inside Postgres, and search runs in the compressed domain.

sql · illustrative

-- enable the extension (illustrative)
CREATE EXTENSION IF NOT EXISTS crbrl_pg;

Quickstart

Create a collection, add a few vectors, then query. The snippet below is generic pseudocode — names and arguments are illustrative, not a fixed API contract.

python · pseudocode · illustrative

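# Illustrative client usage; names are generic
import crbrl  # illustrative package name, matching the calls below

# Placeholder 1,536-dim embeddings; in practice these come from your embedding model
vec_a, vec_b, query_vec = [0.1] * 1536, [0.2] * 1536, [0.1] * 1536

client = crbrl.connect(url="https://localhost:8443")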

# 1. create a collection, choosing a codec
col = client.create_collection(
    name="docs",
    dim=1536,
    codec="turboquant",   # or "rabitq"
)

# 2. add vectors (stored compressed on arrival)
col.add(
    ids=["a", "b"],
    embeddings=[vec_a, vec_b],
    metadatas=[{"src": "guide"}, {"src": "faq"}],
)

# 3. query — search runs in the compressed domain
hits = col.query(embedding=query_vec, top_k=5)

No separate index build, no warm-up pass, and no decompress step before the search. The same call works on the first record and on the hundred-millionth.

Core concepts

Compression-native architecture

Most systems treat compression as an add-on: data is stored at full precision, and a separate layer compresses it for storage, then decompresses it before every search. That decompress step is why bolt-on compression was rarely adopted where it mattered.

CRBRL is compression-native. Vectors are encoded as they are written and held compressed end to end — on disk, in every copy, and in the working set. The query path is designed around the compressed representation rather than around full-precision floats, so the footprint stays small without a per-query unpack.
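
To make "encoded as they are written" concrete, here is a minimal sketch of a write path that keeps only the encoded form. It uses a naive 4-bit uniform scalar quantizer as a stand-in; it is not TurboQuant or RaBitQ, and `encode_4bit` and `CompressedStore` are hypothetical names, not CRBRL APIs.

```python
import random


def encode_4bit(vector):
    """Quantize each value to one of 16 levels and pack two codes per byte.

    Returns (packed, lo, step), where each value is approximated by
    lo + step * code. This is a naive uniform quantizer, not TurboQuant.
    """
    if len(vector) % 2:
        raise ValueError("this sketch expects an even number of dimensions")
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / 15 or 1.0  # fall back to 1.0 for a constant vector
    codes = [min(15, max(0, round((x - lo) / step))) for x in vector]
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, lo, step


class CompressedStore:
    """Toy write path: vectors are encoded inside add() and only codes are kept."""

    def __init__(self):
        self._rows = {}

    def add(self, vec_id, vector):
        # The float input is not retained; only the packed codes and scale are.
        self._rows[vec_id] = encode_4bit(vector)

    def stored_bytes(self, vec_id):
        packed, _lo, _step = self._rows[vec_id]
        return len(packed)


rng = random.Random(0)
store = CompressedStore()
store.add("a", [rng.uniform(-1.0, 1.0) for _ in range(1536)])
print(store.stored_bytes("a"), "bytes of codes for 1,536 dimensions")
```

At 4 bits per dimension, 1,536 dimensions pack into 768 bytes of codes, plus two floats of scale metadata in this sketch. That the byte count happens to line up with the figure quoted in this documentation says nothing about how CRBRL's codecs actually lay out their data.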

Footprint, exactly

1,536-dim float32 = 6,144 B at full precision → 768 B with CRBRL. At 100M vectors, 614 GB → 77 GB.
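
The figures above can be reproduced with a few lines of arithmetic. The sketch uses decimal gigabytes (1 GB = 10^9 bytes). The bits-per-dimension value is derived here on the assumption that the 768 bytes hold only per-dimension codes; the documentation does not state the layout.

```python
DIM = 1536
FLOAT32_BYTES = 4
COMPRESSED_BYTES = 768        # per-vector figure quoted above
N_VECTORS = 100_000_000


def to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return n_bytes / 1e9


full_bytes = DIM * FLOAT32_BYTES               # bytes per full-precision vector
ratio = full_bytes / COMPRESSED_BYTES          # footprint reduction factor
bits_per_dim = COMPRESSED_BYTES * 8 / DIM      # assumes no per-vector overhead

print(f"{full_bytes} B -> {COMPRESSED_BYTES} B per vector ({ratio:.0f}x)")
print(f"{bits_per_dim:.0f} bits per dimension")
print(f"{to_gb(N_VECTORS * full_bytes):.1f} GB -> {to_gb(N_VECTORS * COMPRESSED_BYTES):.1f} GB")
```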

Compressed-domain search

Compressed-domain search means the similarity computation operates on the compressed vectors directly. There is no step that restores full-precision floats before scoring candidates, so the cost saving on storage does not turn into a cost on every query.

This is what separates CRBRL from compress-then-decompress approaches. The data stays in its compact form through the read path, and retrieval quality holds at about 0.98 cosine fidelity against full precision.
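
One way to see how scoring can skip the restore step: with a uniform scalar quantizer, each stored value is approximately `lo + step * code`, so the dot product with a query can be computed from the integer codes directly as `lo * sum(q) + step * sum(q_i * code_i)`. The sketch below shows that identity. It is a simplified stand-in, not the scoring used by TurboQuant or RaBitQ, and all function names are hypothetical.

```python
def quantize(vector, levels=16):
    """Uniform scalar quantizer: returns (codes, lo, step)."""
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / (levels - 1) or 1.0
    codes = [min(levels - 1, max(0, round((x - lo) / step))) for x in vector]
    return codes, lo, step


def score_compressed(query, codes, lo, step):
    """Dot product of a float query with a quantized vector, using codes only.

    No float copy of the stored vector is built at any point.
    """
    return lo * sum(query) + step * sum(q * c for q, c in zip(query, codes))


def top_k(query, store, k):
    """Rank stored ids by compressed-domain score, highest first."""
    scored = [(score_compressed(query, *row), vec_id) for vec_id, row in store.items()]
    scored.sort(reverse=True)
    return [vec_id for _, vec_id in scored[:k]]


store = {
    "a": quantize([1.0, 0.0, 0.0, 0.0]),
    "b": quantize([0.0, 1.0, 0.0, 0.0]),
    "c": quantize([0.9, 0.1, 0.0, 0.0]),
}
print(top_k([1.0, 0.0, 0.0, 0.0], store, k=2))
```

The score here is an inner product; for unit-length vectors it coincides with cosine similarity.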

Codecs

The codec layer is selectable. CRBRL ships two codecs, chosen per collection:

TurboQuant — the peer-reviewed method CRBRL is built around (arXiv:2504.19874). Quantization with no training step.
RaBitQ — an alternative binary quantization codec, also selectable at collection creation.

Both target the same goal — a compact representation that supports compressed-domain search. You pick the codec when you create a collection; see Choosing a codec.

Tiering

CRBRL places data across hot, warm, and cold tiers. Hot data stays close to the query path for the fastest reads; warm and cold data sit on progressively cheaper storage while remaining searchable.

Because vectors are already compressed, each tier holds far more per byte than a full-precision store would. Tiering complements compression here rather than replacing it — see Tiering policy in Operations.

Fidelity

Fidelity is measured as cosine similarity between results returned from compressed vectors and results from full-precision vectors. CRBRL holds at about 0.98 cosine fidelity, so in practice the ranked results stay close to what a full-precision store returns.

This is the trade that makes the 8× footprint reduction usable: the saving is on bytes, not on answer quality.
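
If you want to measure something similar on your own data, one plausible reading of the metric (assumed here, since the exact procedure is not spelled out) is the cosine similarity between each original vector and its reconstruction from the compressed form. The sketch below does that with a naive 4-bit uniform quantizer as a stand-in codec; the number it prints describes the toy quantizer only, not TurboQuant or RaBitQ.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def roundtrip_4bit(vector):
    """Quantize to 16 uniform levels and map back to floats (stand-in codec)."""
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / 15 or 1.0
    return [lo + step * round((x - lo) / step) for x in vector]


# Synthetic Gaussian vectors; replace with a sample of your real embeddings.
rng = random.Random(0)
vectors = [[rng.gauss(0.0, 1.0) for _ in range(1536)] for _ in range(20)]
fidelity = sum(cosine(v, roundtrip_4bit(v)) for v in vectors) / len(vectors)
print(f"mean cosine fidelity of the stand-in codec: {fidelity:.4f}")
```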

Guides

Migrating from Chroma

CRBRL exposes a Chroma-compatible API. In many cases, migration amounts to repointing the client at the CRBRL endpoint and re-adding your collections; existing query and add calls keep their shape.

Point your Chroma client at the CRBRL URL.
Recreate collections, choosing a codec (TurboQuant or RaBitQ).
Re-add embeddings — they are stored compressed on arrival.
Verify retrieval; results hold at ≈0.98 cosine fidelity.

python · pseudocode · illustrative

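# Illustrative: same client shape, new endpoint
# ids, embeddings, and metas below are the records exported from your existing collection
import crbrl  # illustrative package name

client = crbrl.connect(url="https://localhost:8443")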
col = client.create_collection(name="docs", dim=1536, codec="turboquant")
col.add(ids=ids, embeddings=embeddings, metadatas=metas)
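
For collections too large to re-add in a single call, the copy can be done in pages. The sketch below is duck-typed: it assumes the source exposes a `get(limit=, offset=, include=)` call returning `ids`, `embeddings`, and `metadatas` (the shape Chroma collections use) and that the target accepts the `add` call shown above. `migrate` and `InMemoryCollection` are hypothetical helpers; the in-memory class exists only so the example runs offline.

```python
def migrate(source, target, batch_size=1000):
    """Copy all records from source to target in pages; return how many were copied."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    copied = 0
    while True:
        page = source.get(
            limit=batch_size,
            offset=copied,
            include=["embeddings", "metadatas"],
        )
        ids = page["ids"]
        if len(ids) == 0:
            return copied
        target.add(ids=ids, embeddings=page["embeddings"], metadatas=page["metadatas"])
        copied += len(ids)


class InMemoryCollection:
    """Offline stand-in exposing the same get/add shape, so the sketch can run."""

    def __init__(self):
        self.ids, self.embeddings, self.metadatas = [], [], []

    def add(self, ids, embeddings, metadatas):
        self.ids.extend(ids)
        self.embeddings.extend(embeddings)
        self.metadatas.extend(metadatas)

    def get(self, limit, offset, include=None):
        end = offset + limit
        return {
            "ids": self.ids[offset:end],
            "embeddings": self.embeddings[offset:end],
            "metadatas": self.metadatas[offset:end],
        }


old = InMemoryCollection()
old.add(
    ids=[f"doc-{i}" for i in range(5)],
    embeddings=[[float(i), 0.0] for i in range(5)],
    metadatas=[{"src": "guide"} for _ in range(5)],
)
new = InMemoryCollection()
print(migrate(old, new, batch_size=2), "records copied")
```

Offset paging assumes the source is not modified during the copy; pause writes or migrate from a snapshot if that matters for your data.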

Using the Postgres extension

With crbrl-pg, vectors live inside PostgreSQL alongside your relational data, compressed, and searchable in the compressed domain. This suits teams that already run Postgres and want one system of record.

sql · pseudocode · illustrative

-- Illustrative crbrl-pg usage
CREATE EXTENSION IF NOT EXISTS crbrl_pg;

CREATE TABLE docs (
  id   text PRIMARY KEY,
  body text,
  emb  crbrl_vector(1536)   -- stored compressed
);

-- nearest-neighbour query in the compressed domain
SELECT id FROM docs ORDER BY emb <-> :query LIMIT 5;

Choosing a codec

Both shipped codecs are selectable per collection and target the same outcome — a compact representation that supports compressed-domain search at high fidelity. The choice is set at collection creation.

Codec	Notes
turboquant	Peer-reviewed method CRBRL is built around (arXiv:2504.19874). No training step.
rabitq	Alternative binary quantization codec, selectable at collection creation.

Neither codec requires training. If you are unsure, start with the default for your workload and compare fidelity on your own queries.
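
A simple way to compare codecs on your own queries is to run the same queries against a full-precision baseline and against each codec, then measure how much of the baseline top-k each codec recovers. The sketch below computes that overlap. The function names are hypothetical and the ids are made up purely to show the call shape; they are not measurements.

```python
from statistics import mean


def overlap_at_k(reference, candidate, k):
    """Share of the reference top-k ids that also appear in the candidate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    ref = set(reference[:k])
    if not ref:
        raise ValueError("reference ranking is empty")
    return len(ref & set(candidate[:k])) / len(ref)


def mean_overlap(reference_runs, candidate_runs, k=5):
    """Average overlap@k across queries; both arguments are lists of ranked id lists."""
    if len(reference_runs) != len(candidate_runs):
        raise ValueError("need one candidate ranking per reference ranking")
    return mean(overlap_at_k(r, c, k) for r, c in zip(reference_runs, candidate_runs))


# Made-up ids, one ranked list per query.
baseline = [["a", "b", "c"], ["d", "e", "f"]]
from_codec = [["a", "c", "x"], ["d", "e", "f"]]
print(f"mean overlap@3: {mean_overlap(baseline, from_codec, k=3):.3f}")
```

Run it once per codec against the same baseline and query set so the numbers are comparable.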

Reference

Search modes

CRBRL supports three retrieval modes, all running against compressed data:

Mode	Description
semantic	Nearest-neighbour search over embeddings — meaning-based retrieval.
full-text	Keyword / lexical search over text fields.
hybrid	Combines semantic and full-text signals in one query.
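
This page does not say how hybrid mode combines the two signals. Reciprocal rank fusion (RRF) is one widely used way to merge ranked lists and is shown here as a stand-in, not as CRBRL's method. Each document scores the sum of `1 / (k + rank)` over the lists it appears in, with `k = 60` as the conventional constant.

```python
def reciprocal_rank_fusion(rankings, k=60, top_n=None):
    """Merge several ranked id lists into one using reciprocal rank fusion.

    rankings: iterable of ranked id lists, best first.
    k: smoothing constant; larger values flatten the effect of rank.
    top_n: optional cap on the number of ids returned.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    fused = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return fused if top_n is None else fused[:top_n]


semantic_hits = ["d1", "d2", "d3"]    # e.g. from a semantic query
full_text_hits = ["d3", "d1", "d4"]   # e.g. from a full-text query
print(reciprocal_rank_fusion([semantic_hits, full_text_hits]))
```

RRF uses only ranks, so it needs no score normalization between the semantic and lexical sides.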

Embedding models

CRBRL is provider-neutral. It works with 9 embedding models across providers — you bring the embeddings, and CRBRL stores and searches them compressed. There is no lock-in to a single embedding provider.

Because compression applies to whatever vectors you load, switching embedding models is a matter of re-adding vectors to a collection, not re-architecting the store.

Configuration overview

Key settings are illustrative and grouped by concern. Exact keys depend on your deployment form (standalone vs. crbrl-pg).

Setting	Purpose
codec	Per-collection codec: turboquant or rabitq.
dim	Vector dimensionality, e.g. 1,536.
tiering	Hot / warm / cold placement policy.
search_mode	semantic, full-text, or hybrid.
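
If you keep these settings in application code, validating them early catches typos before they reach the server. The sketch below mirrors the table above with a small dataclass. `CollectionSettings` is a hypothetical name, not a CRBRL schema, and the `tiering` field is left opaque because the policy format is not specified here.

```python
from dataclasses import dataclass
from typing import Optional

CODECS = ("turboquant", "rabitq")
SEARCH_MODES = ("semantic", "full-text", "hybrid")


@dataclass(frozen=True)
class CollectionSettings:
    """Illustrative settings bundle mirroring the table above."""

    codec: str
    dim: int
    search_mode: str
    tiering: Optional[dict] = None  # policy format not specified here, kept opaque

    def __post_init__(self):
        if self.codec not in CODECS:
            raise ValueError(f"codec must be one of {CODECS}, got {self.codec!r}")
        if not isinstance(self.dim, int) or self.dim <= 0:
            raise ValueError("dim must be a positive integer")
        if self.search_mode not in SEARCH_MODES:
            raise ValueError(
                f"search_mode must be one of {SEARCH_MODES}, got {self.search_mode!r}"
            )


settings = CollectionSettings(codec="turboquant", dim=1536, search_mode="hybrid")
print(settings)
```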

Operations

Tiering policy

A tiering policy decides which data sits in the hot, warm, or cold tier. Frequently queried vectors stay hot for the fastest reads; colder data moves to cheaper storage while remaining searchable.

Since vectors are compressed, each tier holds far more per byte — the policy controls latency and cost placement, while compression keeps the absolute footprint small across all tiers.
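
A placement rule of the kind described above can be as simple as thresholds on query rate. The sketch below is a minimal illustration: the class name, field names, and threshold values are all hypothetical, and CRBRL's actual policy options are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    """Toy placement rule based on how often data is queried."""

    hot_min_queries_per_day: float = 10.0    # hypothetical threshold
    warm_min_queries_per_day: float = 0.1    # hypothetical threshold

    def tier_for(self, queries: int, window_days: int) -> str:
        """Return 'hot', 'warm', or 'cold' for data seen `queries` times in the window."""
        if window_days <= 0:
            raise ValueError("window_days must be positive")
        if queries < 0:
            raise ValueError("queries cannot be negative")
        rate = queries / window_days
        if rate >= self.hot_min_queries_per_day:
            return "hot"
        if rate >= self.warm_min_queries_per_day:
            return "warm"
        return "cold"


policy = TierPolicy()
for queries in (700, 7, 0):
    print(queries, "queries in 7 days ->", policy.tier_for(queries, window_days=7))
```

Tune the thresholds against your own latency targets and storage prices; the defaults here are placeholders.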

Observability & snapshots

CRBRL supports snapshots for point-in-time copies of a collection, used for backup and restore. Because the underlying data is already 8× smaller, snapshots and the copies around them carry the same footprint reduction — every copy is compressed, not just the primary.

Pair snapshots with your existing monitoring to track collection size, query volume, and tier distribution over time.

Governance

CRBRL includes the controls production deployments expect:

Auth — authentication on access to collections and the API.
Audit logs — a record of access and changes.
RBAC — role-based access control over operations.
Multi-tenancy — isolation between tenants in a shared deployment.
Snapshots — point-in-time copies for backup and restore.

Keep reading

Where to next

The method

Technology

How compression-native storage and compressed-domain search work, and the peer-reviewed mathematics underneath.

Read the technology →

Where it fits

Market & use cases

Who runs into the storage wall, and the workloads where an 8× footprint reduction changes what ships.

See use cases →
