Introduction
CRBRL is a compression-native, disk-persistent vector database for AI memory. Embeddings are stored compressed from the moment they arrive, at roughly one-eighth the bytes per vector, and queries run directly against the compressed data, so there is no decompress step on the hot path.
A 1,536-dimension float32 vector is 6,144 bytes at full precision. CRBRL stores it in about 768 bytes. At 100 million vectors of 1,536 dimensions, that is the difference between 614 GB and 77 GB on disk, across every copy and backup.
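The arithmetic behind these figures can be reproduced in a few lines. This is a back-of-envelope sketch that uses only the numbers quoted above; it assumes decimal gigabytes and ignores IDs, metadata, and any index overhead, none of which the text quantifies.

```python
# Back-of-envelope storage estimate using the figures quoted above.
DIM = 1536                 # dimensions per vector
FLOAT32_BYTES = 4          # bytes per float32 component
COMPRESSED_BYTES = 768     # per-vector size quoted for CRBRL
NUM_VECTORS = 100_000_000  # 100 million vectors

full_bytes = DIM * FLOAT32_BYTES            # bytes per vector at full precision
ratio = full_bytes / COMPRESSED_BYTES       # size reduction factor
bits_per_dim = COMPRESSED_BYTES * 8 / DIM   # bits spent per dimension

# Decimal gigabytes (1 GB = 1e9 bytes); vectors only, no metadata or IDs.
full_gb = full_bytes * NUM_VECTORS / 1e9
compressed_gb = COMPRESSED_BYTES * NUM_VECTORS / 1e9

print(f"{full_bytes} B -> {COMPRESSED_BYTES} B per vector "
      f"({ratio:.0f}x, {bits_per_dim:.0f} bits/dim)")
print(f"{full_gb:.1f} GB -> {compressed_gb:.1f} GB for {NUM_VECTORS:,} vectors")
```

Swapping in your own dimensionality and vector count gives a first estimate for a different workload, under the same assumptions.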
The result stays effectively identical: about 0.98 cosine fidelity against full precision. There is no model to retrain and no separate index to rebuild — compression applies to the first record you load. CRBRL is built around peer-reviewed mathematics: TurboQuant, arXiv:2504.19874.
Storage is typically 55–80% of a vector-database bill. CRBRL reduces the data itself rather than moving it to slower tiers, so the largest line in the bill comes down without trading retrieval quality.
What you get
- Compressed-domain search — semantic, full-text, and hybrid.
- Two selectable codecs: TurboQuant and RaBitQ.
- Hot / warm / cold tiering across memory and disk.
- A Chroma-compatible API and a Postgres extension, crbrl-pg.
- Provider-neutral embeddings — works with 9 embedding models.
- Auth, audit logs, RBAC, multi-tenancy, and snapshots.
Install
CRBRL runs in two forms. Run it standalone as a server, or add the crbrl-pg extension to an existing PostgreSQL database when you want vectors to live alongside relational data.
Standalone
The standalone server exposes a Chroma-compatible HTTP API, so existing Chroma clients can connect with minimal changes.
# Run the CRBRL server (illustrative)
crbrl serve --data-dir ./crbrl-data --port 8443
# Then point an existing Chroma client at the CRBRL endpoint
export CRBRL_URL=https://localhost:8443
Postgres extension (crbrl-pg)
Install the extension into a database and enable it once. Vectors are then stored compressed inside Postgres, and search runs in the compressed domain.
-- enable the extension (illustrative)
CREATE EXTENSION IF NOT EXISTS crbrl_pg;
Quickstart
Create a collection, add a few vectors, then query. The snippet below is generic pseudocode — names and arguments are illustrative, not a fixed API contract.
# Illustrative client usage; names are generic
import crbrl  # hypothetical client package, as in the rest of this snippet
# vec_a, vec_b, and query_vec stand for your own 1,536-dim embeddings (lists of floats)
client = crbrl.connect(url="https://localhost:8443")
# 1. create a collection, choosing a codec
col = client.create_collection(
    name="docs",
    dim=1536,
    codec="turboquant",  # or "rabitq"
)
# 2. add vectors (stored compressed on arrival)
col.add(
    ids=["a", "b"],
    embeddings=[vec_a, vec_b],
    metadatas=[{"src": "guide"}, {"src": "faq"}],
)
# 3. query: search runs in the compressed domain
hits = col.query(embedding=query_vec, top_k=5)
No separate index build, no warm-up pass, and no decompress step before the search. The same call works on the first record and on the hundred-millionth.
Compression-native architecture
Most systems treat compression as an add-on: data arrives at full precision, a separate layer compresses it for storage, and the same layer decompresses it again before every search. That decompress step is why bolt-on compression was rarely adopted where it mattered.
CRBRL is compression-native. Vectors are encoded as they are written and held compressed end to end — on disk, in every copy, and in the working set. The query path is designed around the compressed representation rather than around full-precision floats, so the footprint stays small without a per-query unpack.
1,536-dim float32 = 6,144 B at full precision → 768 B with CRBRL. At 100M vectors, 614 GB → 77 GB.
Compressed-domain search
Compressed-domain search means the similarity computation operates on the compressed vectors directly. There is no step that restores full-precision floats before scoring candidates, so the cost saving on storage does not turn into a cost on every query.
This is what separates CRBRL from compress-then-decompress approaches. The data stays in its compact form through the read path, and retrieval quality holds at about 0.98 cosine fidelity against full precision.
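To make the idea concrete, here is a minimal sketch of scoring against codes without rebuilding the stored vector. It is not TurboQuant or CRBRL's actual query path: it assumes a plain uniform 4-bit scalar quantizer with a fixed range, and the names `encode` and `score_compressed` are hypothetical. The point is only that an inner product can be computed from integer codes through the identity q · x̂ = lo · Σq + step · Σ(qᵢ · codeᵢ).

```python
import random

LEVELS = 16  # 4-bit codes; an assumption for this sketch, not a CRBRL setting

def encode(vec, lo, hi):
    """Map each component to an integer code in [0, LEVELS - 1]."""
    step = (hi - lo) / (LEVELS - 1)
    return [min(LEVELS - 1, max(0, round((x - lo) / step))) for x in vec]

def score_compressed(query, codes, lo, hi):
    """Inner product of a float query with a coded vector, computed from codes.

    Uses q . x_hat = lo * sum(q) + step * sum(q_i * code_i), so the stored
    vector is never expanded back into a list of floats.
    """
    step = (hi - lo) / (LEVELS - 1)
    return lo * sum(query) + step * sum(q * c for q, c in zip(query, codes))

rng = random.Random(0)
dim = 64
stored = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(100)]
coded = [encode(v, -1.0, 1.0) for v in stored]  # a real store would pack two codes per byte
query = [rng.uniform(-1, 1) for _ in range(dim)]

exact = [sum(q * x for q, x in zip(query, v)) for v in stored]
approx = [score_compressed(query, c, -1.0, 1.0) for c in coded]
max_err = max(abs(a - e) for a, e in zip(approx, exact))
print(f"largest score error across {len(stored)} vectors: {max_err:.3f}")
```

The per-component error of this toy quantizer is at most half a step, so the score error is bounded by `(step / 2) * sum(abs(q))`. Real codecs use more careful constructions to get tighter guarantees.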
Codecs
The codec layer is selectable. CRBRL ships two codecs, chosen per collection:
- TurboQuant — the peer-reviewed method CRBRL is built around (arXiv:2504.19874). Quantization with no training step.
- RaBitQ — an alternative binary quantization codec, also selectable at collection creation.
Both target the same goal — a compact representation that supports compressed-domain search. You pick the codec when you create a collection; see Choosing a codec.
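For readers new to the term, the simplest form of binary quantization keeps one sign bit per dimension and compares vectors by counting differing bits. The sketch below shows only that basic idea. It is not the RaBitQ algorithm and says nothing about how CRBRL configures either codec; `sign_code` and `hamming` are hypothetical helper names.

```python
import random

def sign_code(vec):
    """Pack one bit per dimension (1 if the component is >= 0) into an int."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x >= 0 else 0)
    return code

def hamming(a, b):
    """Number of dimensions where the two sign codes disagree."""
    return bin(a ^ b).count("1")

rng = random.Random(1)
dim = 256
base = [rng.gauss(0, 1) for _ in range(dim)]
near = [x + 0.2 * rng.gauss(0, 1) for x in base]  # small perturbation of base
far = [rng.gauss(0, 1) for _ in range(dim)]       # unrelated vector

d_near = hamming(sign_code(base), sign_code(near))
d_far = hamming(sign_code(base), sign_code(far))
print(d_near, d_far)  # the near vector should disagree on far fewer bits
```

Vectors that point in similar directions agree on most sign bits, which is why a cheap bit count can stand in for an angle comparison in this toy setting.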
Tiering
CRBRL places data across hot, warm, and cold tiers. Hot data stays close to the query path for the fastest reads; warm and cold data sit on progressively cheaper storage while remaining searchable.
Because vectors are already compressed, each tier holds far more per byte than a full-precision store would. Tiering complements compression here rather than replacing it — see Tiering policy in Operations.
Fidelity
Fidelity is measured as cosine similarity between results returned from compressed vectors and results from full-precision vectors. CRBRL holds at about 0.98 cosine fidelity; in practice, the ranked results closely match what a full-precision store returns.
This is the trade that makes the 8× footprint reduction usable: the saving is on bytes, not on answer quality.
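One way to get a feel for a cosine-fidelity number is to measure it on a toy quantizer. The sketch below assumes fidelity means the cosine similarity between a vector and its reconstruction after a round trip through a uniform 16-level scalar quantizer, which is one reading of the definition above. It is not TurboQuant, and whatever it prints is unrelated to the 0.98 figure reported for CRBRL.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def quantize_roundtrip(vec, levels=16):
    """Uniform scalar quantization to `levels` values, then reconstruction."""
    lo, hi = min(vec), max(vec)
    step = (hi - lo) / (levels - 1)
    return [lo + round((x - lo) / step) * step for x in vec]

rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(1536)] for _ in range(20)]
scores = [cosine(v, quantize_roundtrip(v)) for v in vectors]
mean_fidelity = sum(scores) / len(scores)
print(f"mean cosine fidelity over {len(scores)} vectors: {mean_fidelity:.4f}")
```

Running the same measurement over a sample of your own embeddings, with the real codec in place of `quantize_roundtrip`, is the more meaningful check.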
Migrating from Chroma
CRBRL exposes a Chroma-compatible API. In many cases migration is repointing the client at the CRBRL endpoint and re-adding your collections; existing query and add calls keep their shape.
- Point your Chroma client at the CRBRL URL.
- Recreate collections, choosing a codec (TurboQuant or RaBitQ).
- Re-add embeddings — they are stored compressed on arrival.
- Verify retrieval; results hold at ≈0.98 cosine fidelity.
# Illustrative: same client shape, new endpoint
client = crbrl.connect(url="https://localhost:8443")
col = client.create_collection(name="docs", dim=1536, codec="turboquant")
col.add(ids=ids, embeddings=embeddings, metadatas=metas)
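The verification step can be as simple as comparing the IDs each system returns for the same queries. The sketch below assumes you have already collected top-k ID lists from both deployments; the IDs, the query names, and the helper `overlap_at_k` are hypothetical. Note that top-k overlap is a different measure from cosine fidelity, so the two numbers are not directly comparable.

```python
def overlap_at_k(baseline_ids, candidate_ids, k):
    """Fraction of the baseline top-k that also appears in the candidate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    base = set(baseline_ids[:k])
    cand = set(candidate_ids[:k])
    if not base:
        return 1.0  # nothing to match against
    return len(base & cand) / len(base)

# Hypothetical top-5 results for two queries, before and after migration.
chroma_results = {"q1": ["a", "b", "c", "d", "e"], "q2": ["f", "g", "h", "i", "j"]}
crbrl_results = {"q1": ["a", "c", "b", "e", "x"], "q2": ["f", "g", "h", "i", "j"]}

per_query = {
    q: overlap_at_k(chroma_results[q], crbrl_results[q], k=5)
    for q in chroma_results
}
mean_overlap = sum(per_query.values()) / len(per_query)
print(per_query, mean_overlap)
```

A representative set of production queries gives a more useful signal than a handful of hand-picked ones.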
Using the Postgres extension
With crbrl-pg, vectors live inside PostgreSQL alongside your relational data, compressed, and searchable in the compressed domain. This suits teams that already run Postgres and want one system of record.
-- Illustrative crbrl-pg usage
CREATE EXTENSION IF NOT EXISTS crbrl_pg;
CREATE TABLE docs (
    id text PRIMARY KEY,
    body text,
    emb crbrl_vector(1536) -- stored compressed
);
-- nearest-neighbour query in the compressed domain
SELECT id FROM docs ORDER BY emb <-> :query LIMIT 5;
Choosing a codec
Both shipped codecs are selectable per collection and target the same outcome — a compact representation that supports compressed-domain search at high fidelity. The choice is set at collection creation.
| Codec | Notes |
|---|---|
| turboquant | Peer-reviewed method CRBRL is built around (arXiv:2504.19874). No training step. |
| rabitq | Alternative binary quantization codec, selectable at collection creation. |
Neither codec requires training. If you are unsure, start with the default for your workload and compare fidelity on your own queries.
Search modes
CRBRL supports three retrieval modes, all running against compressed data:
| Mode | Description |
|---|---|
| semantic | Nearest-neighbour search over embeddings — meaning-based retrieval. |
| full-text | Keyword / lexical search over text fields. |
| hybrid | Combines semantic and full-text signals in one query. |
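The table does not say how hybrid mode combines the two signals. One widely used technique for merging ranked lists is reciprocal rank fusion (RRF), sketched below as an illustration of the general idea. It is an assumption that anything like it applies to CRBRL, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(id) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic = ["doc3", "doc1", "doc7"]   # hypothetical nearest-neighbour ranking
full_text = ["doc1", "doc9", "doc3"]  # hypothetical keyword ranking

fused = reciprocal_rank_fusion([semantic, full_text])
print(fused)  # ['doc1', 'doc3', 'doc9', 'doc7']
```

Documents that rank well in both lists rise to the top, while a document found by only one mode still appears lower down. The constant `k=60` is the value commonly used with RRF.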
Embedding models
CRBRL is provider-neutral. It works with 9 embedding models across providers — you bring the embeddings, and CRBRL stores and searches them compressed. There is no lock-in to a single embedding provider.
Because compression applies to whatever vectors you load, switching embedding models is a matter of re-adding vectors to a collection, not re-architecting the store.
Configuration overview
The settings below are illustrative. Exact keys depend on your deployment form (standalone vs. crbrl-pg).
| Setting | Purpose |
|---|---|
| codec | Per-collection codec: turboquant or rabitq. |
| dim | Vector dimensionality, e.g. 1,536. |
| tiering | Hot / warm / cold placement policy. |
| search_mode | semantic, full-text, or hybrid. |
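If you keep these settings in application code, validating them before they reach the server catches typos early. The sketch below is one way to do that with a dataclass. The class name `CollectionSettings` and the free-form `tiering` string are assumptions, since the table lists the settings but not their exact value formats.

```python
from dataclasses import dataclass

CODECS = {"turboquant", "rabitq"}
SEARCH_MODES = {"semantic", "full-text", "hybrid"}

@dataclass(frozen=True)
class CollectionSettings:
    codec: str
    dim: int
    tiering: str      # hypothetical policy name; real value format is not specified
    search_mode: str

    def __post_init__(self):
        # Reject values outside the options listed in the table above.
        if self.codec not in CODECS:
            raise ValueError(f"unknown codec: {self.codec!r}")
        if self.dim <= 0:
            raise ValueError("dim must be a positive integer")
        if self.search_mode not in SEARCH_MODES:
            raise ValueError(f"unknown search_mode: {self.search_mode!r}")

settings = CollectionSettings(
    codec="rabitq", dim=1536, tiering="default", search_mode="hybrid"
)
print(settings)
```

No defaults are set here on purpose, because the source does not state which codec or search mode is the default.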
Tiering policy
A tiering policy decides which data sits in the hot, warm, or cold tier. Frequently queried vectors stay hot for the fastest reads; colder data moves to cheaper storage while remaining searchable.
Since vectors are compressed, each tier holds far more per byte — the policy controls latency and cost placement, while compression keeps the absolute footprint small across all tiers.
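As a rough illustration of what such a policy might look like, the sketch below assigns a tier from how recently and how often a segment is queried. Every name and threshold in it is hypothetical. CRBRL's actual policy inputs and defaults are not described here.

```python
from dataclasses import dataclass

@dataclass
class SegmentStats:
    name: str
    seconds_since_last_query: float
    queries_last_day: int

def choose_tier(stats, hot_max_idle=3600, warm_max_idle=7 * 86400, hot_min_queries=100):
    """Return 'hot', 'warm', or 'cold' from recency and frequency of access."""
    recently_used = stats.seconds_since_last_query <= hot_max_idle
    heavily_used = stats.queries_last_day >= hot_min_queries
    if recently_used or heavily_used:
        return "hot"
    if stats.seconds_since_last_query <= warm_max_idle:
        return "warm"
    return "cold"

segments = [
    SegmentStats("recent", seconds_since_last_query=120, queries_last_day=3),
    SegmentStats("popular", seconds_since_last_query=2 * 3600, queries_last_day=500),
    SegmentStats("last-week", seconds_since_last_query=3 * 86400, queries_last_day=0),
    SegmentStats("archive", seconds_since_last_query=90 * 86400, queries_last_day=0),
]
placement = {s.name: choose_tier(s) for s in segments}
print(placement)
```

Tightening `hot_max_idle` shrinks the hot tier and lowers cost at the expense of latency for less recent data, which is the trade-off a tiering policy controls.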
Observability & snapshots
CRBRL supports snapshots for point-in-time copies of a collection, used for backup and restore. Because the underlying data is already 8× smaller, snapshots and the copies around them carry the same footprint reduction — every copy is compressed, not just the primary.
Pair snapshots with your existing monitoring to track collection size, query volume, and tier distribution over time.
Governance
CRBRL includes the controls production deployments expect:
- Auth — authentication on access to collections and the API.
- Audit logs — a record of access and changes.
- RBAC — role-based access control over operations.
- Multi-tenancy — isolation between tenants in a shared deployment.
- Snapshots — point-in-time copies for backup and restore.
Where to next
Technology
How compression-native storage and compressed-domain search work, and the peer-reviewed mathematics underneath.
Read the technology →
Market & use cases
Who runs into the storage wall, and the workloads where an 8× footprint reduction changes what ships.
See use cases →