The market & the fit
A plain-language guide to the market CRBRL serves: what a vector database is, how it differs from the databases that came before it, and where a compression-native one pays off across real industries.
From the history of storage to a working use case in your sector.
For two years most companies kept AI in trials: small experiments, limited data, a handful of users. That phase is ending. Teams are now putting AI into the products and workflows that customers and staff rely on every day, which turns every document, conversation, and record into searchable memory.
Mass adoption means real scale, and at scale the cost of holding that memory grows faster than anything else. It is the budget line that keeps climbing as a project succeeds, and the one that decides which projects keep running.
// the storage share is precisely where CRBRL acts: it reduces the data itself, then keeps search over it fast
Each generation of database answered the question its era was asking. Vector databases are the latest step, built for a new kind of question: "what is similar in meaning?"
Data is organised into tables of rows and columns with a fixed schema. Queries are exact: find the row where the customer ID equals this value, or sum the orders in this date range. Built for structured records and precise, repeatable answers — still the backbone of most business systems.
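As a small illustration of exact queries, here is a sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are made up for this example; any relational database would express the same two questions in much the same way.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, amount) VALUES (?, ?, ?)",
    [
        (42, "2024-01-05", 120.0),
        (42, "2024-02-10", 80.0),
        (7, "2024-01-20", 50.0),
    ],
)

# Exact match: every order where the customer ID equals one value.
rows = conn.execute(
    "SELECT id, amount FROM orders WHERE customer_id = ? ORDER BY id", (42,)
).fetchall()

# Exact aggregate: sum the orders that fall inside a date range.
(january_total,) = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE order_date BETWEEN ? AND ?",
    ("2024-01-01", "2024-01-31"),
).fetchone()

print(rows, january_total)
```

Either the row matches the condition or it does not; there is no notion of "close" in this kind of query.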
As web applications grew, teams needed to store flexible, semi-structured data and scale it horizontally. Document stores keep records as JSON-like documents; key-value stores map a key directly to a blob. The trade was a looser schema and simpler lookups in exchange for scale and developer speed.
To search large bodies of text, search engines build an inverted index: a map from each word to the documents that contain it. This makes full-text and keyword search fast. It matches the words you typed — but it does not understand that "car" and "automobile" mean the same thing.
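The idea can be sketched in a few lines of Python. This is a minimal illustration, not how a production search engine is built: it splits on whitespace only, and the function names and sample documents are invented for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def keyword_search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Sample documents, made up for illustration.
docs = {
    1: "red car for sale",
    2: "used automobile in good condition",
    3: "car repair manual",
}
index = build_inverted_index(docs)
car_hits = keyword_search(index, "car")
automobile_hits = keyword_search(index, "automobile")
print(car_hits, automobile_hits)
```

Searching for "car" never returns document 2, even though it is about the same thing. That gap is what the next generation of database set out to close.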
AI models turn text, images, and audio into embeddings: lists of numbers that place items with similar meaning near each other. A vector database stores those embeddings and searches by closeness, so it can find results that mean the same thing even when the words differ. This is the layer behind semantic search, recommendations, and AI memory.
An AI model reads a piece of content and produces an embedding: a long list of numbers (a high-dimensional vector) that captures its meaning. Items that mean similar things land close together in that space; items that mean different things land far apart. A common embedding size is 1,536 dimensions.
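"Close together" needs a measure. Cosine similarity is one widely used choice, shown here as a sketch. The three-number vectors are hand-picked toy values, not output from a real embedding model, and the choice of cosine over other distance measures is an assumption made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings", chosen by hand for illustration.
# Real embeddings come from a model and have far more dimensions.
car = [0.9, 0.1, 0.0]
automobile = [0.8, 0.2, 0.1]
banana = [0.0, 0.1, 0.9]

similar = cosine_similarity(car, automobile)
different = cosine_similarity(car, banana)
print(round(similar, 3), round(different, 3))
```

The first score comes out close to 1 and the second close to 0, which is the numeric form of "near" and "far" in meaning.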
To answer a question, the database embeds the query the same way and looks for the stored vectors nearest to it. This is nearest-neighbour search. At scale it is usually done as approximate-nearest-neighbour (ANN) search, which trades a small amount of exactness (it may occasionally miss a true nearest match) for a large gain in speed: close enough, far faster.
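For reference, this is what the exact version looks like: compare the query against every stored vector and keep the closest. It is a sketch with made-up two-dimensional data and hypothetical names. An ANN index exists to avoid this full scan, and its internals are not shown here.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(query, vectors, k=2):
    """Exact search: score every stored vector and return the k closest IDs."""
    return heapq.nsmallest(
        k, vectors, key=lambda item_id: euclidean(query, vectors[item_id])
    )


# Stored vectors keyed by ID; toy values for illustration.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.9, 1.2],
}
closest = nearest_neighbours([1.0, 1.1], vectors, k=2)
print(closest)
```

The work grows with the number of stored vectors, because each query touches all of them. That is the cost ANN methods are designed to cut.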
Each document, message, or image becomes a vector that encodes its meaning. This is the memory the AI will later search.
The database keeps every vector, organised so that similar ones can be found quickly without scanning everything.
A query becomes a vector too; the database returns the nearest matches by meaning — the foundation of semantic search and AI memory.
A relational database and a vector database answer different questions. They often sit side by side — one for the facts, one for the meaning.
A graph database stores entities as nodes and the relationships between them as edges — "this person works at that company," "this account paid that account." Its strength is traversal: following those edges to answer questions about paths, connections, and the shape of a network.
Where a vector database asks "what is similar?", a graph database asks "what is connected, and how?" The two are not competitors so much as complements — they are good at different questions.
Following chains of edges quickly: who is connected to whom, the shortest path between two entities, which records sit two or three hops away from a starting point.
Knowledge graphs, fraud rings detected by shared connections, recommendations driven by relationships, and network or dependency analysis — wherever the link itself is the answer.
// a graph database is the right tool when the relationship is explicit and known; a vector database is right when the relevant similarity is implicit in the content
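Traversal can be sketched as a breadth-first walk over an adjacency list. The account names and the helper name are invented for the example, and a real graph database would store and index edges very differently; the point is only to show what "two or three hops away" means.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each reachable node to its hop count."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


# Hypothetical "this account paid that account" edges.
graph = {
    "acct_a": ["acct_b"],
    "acct_b": ["acct_c", "acct_d"],
    "acct_c": ["acct_e"],
    "acct_d": [],
    "acct_e": [],
}
reachable = within_hops(graph, "acct_a", 2)
print(reachable)
```

Because breadth-first search visits nodes in order of distance, the hop count recorded for each node is also the length of the shortest path to it from the start.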
Many real questions need both kinds of retrieval. Vector search finds the passages that are relevant by meaning; a graph supplies the explicit relationships that connect them — which document cites which, which account is linked to which, which step depends on which.
Combining them grounds an answer in two ways at once: it is on-topic because it is semantically similar, and it is well-supported because the relationships hold up. That is the basis of hybrid retrieval — and CRBRL ships vector, graph, and hybrid search so teams can use the right one, or both.
Vector similarity surfaces the content that matches the question's meaning, even when the wording differs.
The graph adds explicit links between those results — citations, ownership, dependencies — so the answer is traceable.
Hybrid retrieval blends both, giving answers that are both on-topic and supported by real, checkable relationships.
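A rough sketch of the two steps working together follows. It is a generic illustration, not CRBRL's API: the function names, document IDs, vectors, and edge labels are all hypothetical, and a real system may blend the two signals in other ways.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def hybrid_search(query_vec, vectors, edges, k=1):
    """Vector step picks the k closest items; graph step attaches their links."""
    ranked = sorted(
        vectors, key=lambda item_id: cosine(query_vec, vectors[item_id]), reverse=True
    )
    return [
        {
            "id": item_id,
            "score": cosine(query_vec, vectors[item_id]),
            "linked": edges.get(item_id, []),
        }
        for item_id in ranked[:k]
    ]


# Toy embeddings and explicit relationships, made up for illustration.
vectors = {
    "doc_refund_policy": [0.9, 0.1],
    "doc_shipping": [0.1, 0.9],
    "doc_terms": [0.5, 0.5],
}
edges = {"doc_refund_policy": [("cites", "doc_terms")]}

results = hybrid_search([1.0, 0.0], vectors, edges, k=1)
print(results)
```

The top result is chosen by similarity, and the "cites" link that comes back with it is the traceable part: a reader can follow it to the supporting document.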
A codec is the method that compresses a vector down to a smaller form while keeping it searchable. CRBRL ships two, and the operator chooses per workload.
// why both: real workloads differ — having a selectable codec means operator choice and steadier results across the spread of data, rather than one fixed trade-off for everyone
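To make the idea of a codec concrete, here is a sketch of one well-known generic technique, 8-bit scalar quantization. This is not a description of either CRBRL codec, which this page does not detail; the function names are hypothetical and the example only shows the general shape of "smaller form, still usable".

```python
def quantize(vector):
    """Map each float to an integer code in 0..255 using the vector's own range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant vector
    codes = bytes(round((x - lo) / scale) for x in vector)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Rebuild approximate float values from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Toy vector for illustration.
vector = [0.12, -0.50, 0.33, 0.98]
codes, lo, scale = quantize(vector)
restored = dequantize(codes, lo, scale)
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(len(codes), max_error)
```

Under this scheme each value takes one byte instead of four, assuming the original was stored as 32-bit floats. A 1,536-dimension vector would go from 6,144 bytes to 1,536 bytes plus the stored offset and scale, at the price of a small rounding error per value. Different codecs make that size-versus-accuracy trade differently, which is the reason for offering a choice.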
The full feature set, paired with the plain benefit each one delivers.
Choose an industry to see a typical workload, the problem it runs into, and what changes with CRBRL.
Bring your data sizes and retention needs. We will walk through what compression-native storage does to your specific AI-memory bill, and where it fits in your stack.
Discuss your workload →
The codecs, the compressed-domain search, the Chroma-compatible API and crbrl-pg extension: the mechanics behind the figures on this page.
See the technology →