Documentation · v-current

CRBRL documentation

CRBRL is a compression-native, disk-persistent vector database. Store the same AI memory in roughly 8× fewer bytes per vector, search it while it stays compressed, and keep effectively identical results (about 0.98 cosine fidelity).

Getting started

Introduction

CRBRL is a compression-native, disk-persistent vector database for AI memory. Embeddings are stored compressed from the moment they arrive — roughly 8× fewer bytes per vector — and queries run directly against the compressed data, so there is no decompress step on the hot path.

A 1,536-dimension float32 vector takes 6,144 bytes at full precision (1,536 × 4 bytes). CRBRL stores it in about 768 bytes. At 100 million such vectors, that is about 614 GB versus 77 GB of raw vector data, and the saving repeats in every copy and backup.

The result stays effectively identical: about 0.98 cosine fidelity against full precision. There is no model to retrain and no separate index to rebuild; compression applies from the first record you load. CRBRL is built around peer-reviewed mathematics: TurboQuant, arXiv:2504.19874.

Why it matters

Storage is typically 55–80% of a vector-database bill. CRBRL reduces the data itself rather than moving it to slower tiers, so the largest line in the bill comes down without trading retrieval quality.

What you get

  • Compressed-domain search — semantic, full-text, and hybrid.
  • Two selectable codecs: TurboQuant and RaBitQ.
  • Hot / warm / cold tiering across memory and disk.
  • A Chroma-compatible API and a Postgres extension, crbrl-pg.
  • Provider-neutral embeddings — works with 9 embedding models.
  • Auth, audit logs, RBAC, multi-tenancy, and snapshots.

Install

CRBRL runs in two forms. Run it standalone as a server, or add the crbrl-pg extension to an existing PostgreSQL database when you want vectors to live alongside relational data.

Standalone

The standalone server exposes a Chroma-compatible HTTP API, so existing Chroma clients can connect with minimal changes.

shell (illustrative)
# Run the CRBRL server (illustrative)
crbrl serve --data-dir ./crbrl-data --port 8443

# Or point an existing Chroma client at the CRBRL endpoint
export CRBRL_URL=https://localhost:8443

Postgres extension (crbrl-pg)

Install the extension into a database and enable it once. Vectors are then stored compressed inside Postgres, and search runs in the compressed domain.

sql (illustrative)
-- enable the extension (illustrative)
CREATE EXTENSION IF NOT EXISTS crbrl_pg;

Quickstart

Create a collection, add a few vectors, then query. The snippet below is generic pseudocode — names and arguments are illustrative, not a fixed API contract.

python · pseudocode (illustrative)
# Illustrative client usage; module and method names are generic
import crbrl  # hypothetical client module, named for illustration only

client = crbrl.connect(url="https://localhost:8443")

# 1. create a collection, choosing a codec
col = client.create_collection(
    name="docs",
    dim=1536,
    codec="turboquant",   # or "rabitq"
)

# 2. add vectors (stored compressed on arrival)
col.add(
    ids=["a", "b"],
    embeddings=[vec_a, vec_b],
    metadatas=[{"src": "guide"}, {"src": "faq"}],
)

# 3. query — search runs in the compressed domain
hits = col.query(embedding=query_vec, top_k=5)

No separate index build, no warm-up pass, and no decompress step before the search. The same call works on the first record and on the hundred-millionth.

Core concepts

Compression-native architecture

Most systems treat compression as an add-on: data is stored at full precision, and a separate layer compresses it for storage, then decompresses it before every search. That per-query decompress step adds work on the hot path, which is why bolt-on compression was rarely adopted for latency-sensitive search.

CRBRL is compression-native. Vectors are encoded as they are written and held compressed end to end — on disk, in every copy, and in the working set. The query path is designed around the compressed representation rather than around full-precision floats, so the footprint stays small without a per-query unpack.
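
To make "search against the compressed representation" concrete, here is a minimal sketch that uses a generic 4-bit scalar quantizer as a stand-in. It is not TurboQuant or RaBitQ, and the names `encode`, `score`, and `search` are assumptions made for this example, not CRBRL APIs. Vectors are packed two codes per byte on write, and the query is scored by reading those codes directly, without rebuilding a float vector first.

```python
LEVELS = 16  # 4 bits per dimension, so two codes fit in one byte

# Value represented by each 4-bit code, spread evenly over [-1, 1].
CENTERS = [(c + 0.5) / LEVELS * 2.0 - 1.0 for c in range(LEVELS)]


def encode(vec):
    """Quantize a vector to 4-bit codes; returns (scale, packed_bytes)."""
    if len(vec) % 2:
        raise ValueError("toy codec expects an even number of dimensions")
    scale = max(abs(x) for x in vec) or 1.0
    codes = [min(LEVELS - 1, int((x / scale + 1.0) / 2.0 * LEVELS)) for x in vec]
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return scale, packed


def score(query, record):
    """Approximate dot product, computed straight from the packed codes."""
    scale, packed = record
    total = 0.0
    for j, byte in enumerate(packed):
        total += query[2 * j] * CENTERS[byte >> 4]
        total += query[2 * j + 1] * CENTERS[byte & 0x0F]
    return scale * total


def search(query, store, top_k=5):
    """Rank record ids by approximate score, highest first."""
    ranked = sorted(store, key=lambda rid: score(query, store[rid]), reverse=True)
    return ranked[:top_k]


store = {
    "a": encode([1.0, 0.0, 0.0, 0.0]),
    "b": encode([0.0, 1.0, 0.0, 0.0]),
    "c": encode([-1.0, 0.0, 0.0, 0.0]),
}
print(search([1.0, 0.0, 0.0, 0.0], store, top_k=2))
```

The stored record here is half a byte per dimension plus one scale value. A production codec would use a more careful quantizer and a vectorized scan, but the shape of the hot path is the same: read codes, accumulate a score, never unpack.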

Footprint, exactly

1,536-dim float32 = 1,536 × 4 B = 6,144 B at full precision → 768 B with CRBRL (8× smaller). At 100M vectors, 614.4 GB → 76.8 GB (decimal gigabytes, raw vector data only).
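
The arithmetic above can be reproduced with a few lines of Python. The helper name `footprint_bytes` is made up for this example, and the figure of 4 bits per dimension is simply what 768 bytes over 1,536 dimensions works out to; it is an inference from the numbers above, not a documented codec parameter. Ids, metadata, and any per-vector overhead are ignored.

```python
def footprint_bytes(dim, n_vectors, bits_per_dim):
    """Raw vector payload in bytes (no ids, metadata, or overhead)."""
    return dim * bits_per_dim * n_vectors // 8


DIM = 1536
N = 100_000_000

full = footprint_bytes(DIM, N, bits_per_dim=32)       # float32
compressed = footprint_bytes(DIM, N, bits_per_dim=4)  # 768 B per vector

print("full precision:", full, "bytes")
print("compressed:    ", compressed, "bytes")
print("ratio:         ", full // compressed)
```

Swapping in your own dimension and vector count gives a quick estimate of raw payload size before and after compression.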

Codecs

The codec layer is selectable. CRBRL ships two codecs, chosen per collection:

  • TurboQuant — the peer-reviewed method CRBRL is built around (arXiv:2504.19874). Quantization with no training step.
  • RaBitQ — an alternative binary quantization codec, also selectable at collection creation.

Both target the same goal — a compact representation that supports compressed-domain search. You pick the codec when you create a collection; see Choosing a codec.
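
For readers new to the term, the sketch below shows the simplest form of binary quantization: keep one sign bit per dimension and compare codes by Hamming distance. This is a generic illustration only. It is far simpler than RaBitQ, which adds a random rotation and correction terms, and none of the function names below come from CRBRL. The cosine estimate uses the standard sign-agreement relation, which is only a rough guide when no rotation is applied.

```python
import math


def to_bits(vec):
    """Pack the sign of each coordinate into an int, one bit per dimension."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where two codes disagree."""
    return bin(a ^ b).count("1")


def estimated_cosine(a, b, dim):
    """Rough cosine estimate from the fraction of disagreeing sign bits."""
    return math.cos(math.pi * hamming(a, b) / dim)


u = to_bits([0.4, -0.2, 0.7, 0.1])
v = to_bits([0.3, 0.5, 0.6, 0.2])
print(hamming(u, v), estimated_cosine(u, v, dim=4))
```

Like TurboQuant as described above, this kind of codec needs no training pass: the code for a vector depends only on that vector.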

Tiering

CRBRL places data across hot, warm, and cold tiers. Hot data stays close to the query path for the fastest reads; warm and cold data sit on progressively cheaper storage while remaining searchable.

Because vectors are already compressed, each tier holds far more per byte than a full-precision store would. Tiering complements compression here rather than replacing it — see Tiering policy in Operations.

Fidelity

Fidelity is measured as cosine similarity between results returned from compressed vectors and results from full-precision vectors. CRBRL holds at about 0.98 cosine fidelity; in practice, the ranked results closely match what a full-precision store returns.

This is the trade that makes the 8× footprint reduction usable: the saving is on bytes, not on answer quality.
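
If you want to check a fidelity figure on your own data, one plausible reading is the mean cosine similarity between each original vector and its reconstruction from the compressed form. The documentation does not spell out the exact procedure behind the 0.98 figure, so treat the sketch below as an assumption; `cosine` and `mean_cosine_fidelity` are names invented for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_cosine_fidelity(originals, reconstructions):
    """Average cosine between each original and its reconstruction."""
    pairs = list(zip(originals, reconstructions))
    if not pairs:
        raise ValueError("need at least one vector pair")
    return sum(cosine(o, r) for o, r in pairs) / len(pairs)


originals = [[1.0, 0.0], [0.6, 0.8]]
reconstructions = [[0.97, 0.05], [0.62, 0.78]]  # made-up values for the demo
print(mean_cosine_fidelity(originals, reconstructions))
```

A value near 1.0 means the compressed vectors point in almost the same direction as the originals, which is what keeps nearest-neighbour rankings stable.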

Guides

Migrating from Chroma

CRBRL exposes a Chroma-compatible API. In many cases migration is repointing the client at the CRBRL endpoint and re-adding your collections; existing query and add calls keep their shape.

  1. Point your Chroma client at the CRBRL URL.
  2. Recreate collections, choosing a codec (TurboQuant or RaBitQ).
  3. Re-add embeddings — they are stored compressed on arrival.
  4. Verify retrieval; results hold at ≈0.98 cosine fidelity.

python · pseudocode (illustrative)
# Illustrative: same client shape, new endpoint
client = crbrl.connect(url="https://localhost:8443")
col = client.create_collection(name="docs", dim=1536, codec="turboquant")
col.add(ids=ids, embeddings=embeddings, metadatas=metas)
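
For step 4, one way to verify retrieval is to run the same queries against the old store and the new one and compare the top-k ids. The sketch below assumes you have already collected both sets of ranked ids as plain lists; `overlap_at_k` and `mean_overlap` are hypothetical helper names, not part of any client.

```python
def overlap_at_k(baseline_ids, candidate_ids, k):
    """Fraction of the baseline top-k that also appears in the candidate top-k."""
    baseline = set(baseline_ids[:k])
    if not baseline:
        return 1.0
    return len(baseline & set(candidate_ids[:k])) / len(baseline)


def mean_overlap(baseline_runs, candidate_runs, k):
    """Average top-k overlap across a set of queries."""
    if len(baseline_runs) != len(candidate_runs) or not baseline_runs:
        raise ValueError("need the same non-zero number of runs on both sides")
    scores = [overlap_at_k(b, c, k) for b, c in zip(baseline_runs, candidate_runs)]
    return sum(scores) / len(scores)


old_results = [["a", "b", "c"], ["d", "e", "f"]]
new_results = [["a", "c", "b"], ["d", "e", "g"]]
print(mean_overlap(old_results, new_results, k=3))
```

Overlap ignores order within the top k, so it tolerates small rank swaps between near-identical scores while still catching results that drop out entirely.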

Using the Postgres extension

With crbrl-pg, vectors live inside PostgreSQL alongside your relational data, compressed, and searchable in the compressed domain. This suits teams that already run Postgres and want one system of record.

sql · pseudocode (illustrative)
-- Illustrative crbrl-pg usage
CREATE EXTENSION IF NOT EXISTS crbrl_pg;

CREATE TABLE docs (
  id   text PRIMARY KEY,
  body text,
  emb  crbrl_vector(1536)   -- stored compressed
);

-- nearest-neighbour query in the compressed domain
SELECT id FROM docs ORDER BY emb <-> :query LIMIT 5;

Choosing a codec

Both shipped codecs are selectable per collection and target the same outcome — a compact representation that supports compressed-domain search at high fidelity. The choice is set at collection creation.

Codec        Notes
turboquant   Peer-reviewed method CRBRL is built around (arXiv:2504.19874). No training step.
rabitq       Alternative binary quantization codec, selectable at collection creation.

Neither codec requires training. If you are unsure, load a sample of your data under each codec and compare fidelity on your own queries.

Reference

Search modes

CRBRL supports three retrieval modes, all running against compressed data:

Mode         Description
semantic     Nearest-neighbour search over embeddings (meaning-based retrieval).
full-text    Keyword / lexical search over text fields.
hybrid       Combines semantic and full-text signals in one query.
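
The table does not say how hybrid mode combines its two signals. As an illustration of the general idea, the sketch below uses reciprocal rank fusion (RRF), a widely used technique for merging ranked lists; whether CRBRL uses RRF or something else is not stated here, and the function name and the constant `k=60` are assumptions for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists into one, best first.

    Each list contributes 1 / (k + rank) for every id it contains,
    so ids ranked well by more than one list rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_hits = ["a", "b", "c"]
full_text_hits = ["b", "d", "a"]
print(reciprocal_rank_fusion([semantic_hits, full_text_hits]))
```

RRF works on ranks alone, so it needs no score normalization between the embedding side and the keyword side.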

Embedding models

CRBRL is provider-neutral. It works with 9 embedding models across providers — you bring the embeddings, and CRBRL stores and searches them compressed. There is no lock-in to a single embedding provider.

Because compression applies to whatever vectors you load, switching embedding models is a matter of re-adding vectors to a collection, not re-architecting the store.

Configuration overview

The settings below are illustrative. Exact keys depend on your deployment form (standalone vs. crbrl-pg).

Setting       Purpose
codec         Per-collection codec: turboquant or rabitq.
dim           Vector dimensionality, e.g. 1,536.
tiering       Hot / warm / cold placement policy.
search_mode   semantic, full-text, or hybrid.
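
As a sketch of how these settings might be grouped and validated on the client side, the example below models them as a small dataclass. The class name, the defaults, and the choice to represent `tiering` as a single starting tier are all assumptions for illustration; real keys and value shapes depend on your deployment form.

```python
from dataclasses import dataclass

CODECS = {"turboquant", "rabitq"}
SEARCH_MODES = {"semantic", "full-text", "hybrid"}
TIERS = {"hot", "warm", "cold"}


@dataclass(frozen=True)
class CollectionConfig:
    """Hypothetical per-collection settings, validated on construction."""

    codec: str
    dim: int
    tiering: str = "hot"           # assumed: initial tier for new data
    search_mode: str = "semantic"  # assumed default

    def __post_init__(self):
        if self.codec not in CODECS:
            raise ValueError(f"unknown codec: {self.codec!r}")
        if self.dim <= 0:
            raise ValueError("dim must be a positive integer")
        if self.tiering not in TIERS:
            raise ValueError(f"unknown tier: {self.tiering!r}")
        if self.search_mode not in SEARCH_MODES:
            raise ValueError(f"unknown search mode: {self.search_mode!r}")


config = CollectionConfig(codec="turboquant", dim=1536, search_mode="hybrid")
print(config)
```

Validating up front keeps a typo such as an unknown codec name from surfacing later as a failed collection create.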

Operations

Tiering policy

A tiering policy decides which data sits in the hot, warm, or cold tier. Frequently queried vectors stay hot for the fastest reads; colder data moves to cheaper storage while remaining searchable.

Since vectors are compressed, each tier holds far more per byte — the policy controls latency and cost placement, while compression keeps the absolute footprint small across all tiers.
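
A minimal sketch of a recency-based policy follows, assuming the only input is how long ago each collection was last queried. The thresholds (one hour, seven days) and the function names are invented for this example; an actual policy may also weigh query frequency, size, or cost.

```python
HOUR = 3600
DAY = 24 * HOUR


def choose_tier(seconds_since_last_query, hot_window=HOUR, warm_window=7 * DAY):
    """Map time since last query to a tier name."""
    if seconds_since_last_query <= hot_window:
        return "hot"
    if seconds_since_last_query <= warm_window:
        return "warm"
    return "cold"


def plan_tiers(last_query_age, hot_window=HOUR, warm_window=7 * DAY):
    """Group collection ids by the tier the policy assigns them."""
    plan = {"hot": [], "warm": [], "cold": []}
    for collection_id, age in last_query_age.items():
        plan[choose_tier(age, hot_window, warm_window)].append(collection_id)
    return plan


ages = {"docs": 120, "tickets": 2 * DAY, "archive-2019": 90 * DAY}
print(plan_tiers(ages))
```

Widening `hot_window` trades storage cost for latency on data that is queried less often; all three tiers stay searchable either way.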

Observability & snapshots

CRBRL supports snapshots for point-in-time copies of a collection, used for backup and restore. Because the underlying data is already 8× smaller, snapshots and the copies around them carry the same footprint reduction — every copy is compressed, not just the primary.

Pair snapshots with your existing monitoring to track collection size, query volume, and tier distribution over time.

Governance

CRBRL includes the controls production deployments expect:

  • Auth — authentication on access to collections and the API.
  • Audit logs — a record of access and changes.
  • RBAC — role-based access control over operations.
  • Multi-tenancy — isolation between tenants in a shared deployment.
  • Snapshots — point-in-time copies for backup and restore.