Billion-scale embedding retrieval at sub-millisecond latency.
Purpose-built vector database for production RAG and semantic search. Hybrid BM25 + dense retrieval, GPU-accelerated HNSW indexing, real-time upserts, and a query API designed to feel as simple as a SQL SELECT statement.
Index construction and nearest-neighbour search are GPU-accelerated using cuVS (RAPIDS). Build a 100M-vector HNSW index in minutes, not hours.
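HNSW's core idea, greedy best-first search over a navigable neighbour graph, fits in a short sketch. The single-layer, pure-Python version below is a conceptual illustration only (it is not CogniCloud's or cuVS's implementation, which add a layer hierarchy and GPU-parallel construction); the function names and parameters are ours.

```python
import heapq

def l2(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_knn_graph(vectors, k):
    # Brute-force k-NN graph: a toy stand-in for HNSW's layered construction.
    graph = []
    for i, v in enumerate(vectors):
        dists = sorted((l2(v, u), j) for j, u in enumerate(vectors) if j != i)
        graph.append([j for _, j in dists[:k]])
    return graph

def greedy_search(vectors, graph, query, entry=0, ef=4):
    # Best-first search: expand the closest unexplored candidate until no
    # remaining candidate can improve the current top-ef result set.
    visited = {entry}
    d0 = l2(query, vectors[entry])
    candidates = [(d0, entry)]   # min-heap ordered by distance to query
    results = [(-d0, entry)]     # max-heap holding the ef best seen so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break                # nothing left can beat the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = l2(query, vectors[nb])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    # Return indices sorted nearest-first.
    return [i for _, i in sorted((-d, i) for d, i in results)]

points = [[float(i), 0.0] for i in range(10)]
graph = build_knn_graph(points, k=3)
nearest = greedy_search(points, graph, [7.2, 0.0], entry=0, ef=4)
# index 7 ([7.0, 0.0]) is the true nearest neighbour and comes back first
```

The real index trades the brute-force graph build for hierarchical, GPU-parallel construction, which is where the minutes-not-hours speedup comes from.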
Combine sparse keyword matching with dense semantic similarity in a single query. Reciprocal rank fusion merges results with configurable blend weights.
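Reciprocal rank fusion itself is simple enough to sketch in a few lines. The weighted variant below is a conceptual illustration, not CogniCloud's implementation; the smoothing constant `k = 60` (from the original RRF paper) and the weight parameters are assumptions.

```python
def weighted_rrf(rankings, weights=None, k=60):
    """Merge ranked result lists with weighted reciprocal rank fusion.

    rankings: list of ranked lists of document ids (best first).
    weights:  per-ranker blend weights (defaults to equal weighting).
    k:        smoothing constant that damps the influence of top ranks.
    """
    weights = weights or [1.0] * len(rankings)
    scores = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Sparse (BM25) and dense result lists for the same query,
# blended 40/60 in favour of the dense ranker.
bm25 = ["d1", "d2", "d3"]
dense = ["d3", "d1", "d4"]
fused = weighted_rrf([bm25, dense], weights=[0.4, 0.6])
```

Documents ranked highly by both retrievers float to the top even when neither ranker alone put them first.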
Vectors are queryable within milliseconds of insertion — no batch ingestion jobs, no index rebuild downtime. Designed for live document pipelines.
Attach arbitrary JSON metadata to every vector. Pre-filter by metadata before the ANN search to dramatically reduce search space and improve relevance.
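Pre-filtering can be pictured as shrinking the candidate set before any distance is computed. The sketch below uses brute-force cosine similarity as a stand-in for the ANN index; the record layout and exact-match filter format are our assumptions, not the product's schema.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def filtered_search(records, query_vec, metadata_filter, top_k=3):
    """records: list of {"id", "vector", "metadata"} dicts.
    metadata_filter is applied BEFORE any similarity scoring, so only
    the surviving subset is ever compared against the query vector."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == val
               for key, val in metadata_filter.items())
    ]
    ranked = sorted(candidates,
                    key=lambda r: cosine(query_vec, r["vector"]),
                    reverse=True)
    return [r["id"] for r in ranked[:top_k]]

records = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"lang": "de"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"lang": "en"}},
]
hits = filtered_search(records, [1.0, 0.0], {"lang": "en"}, top_k=2)
# "b" is excluded by the filter despite being the second-closest vector
```

At billion-vector scale the same principle applies: a selective filter can cut the searched region by orders of magnitude before the ANN traversal starts.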
Namespace isolation allows one deployment to serve thousands of tenants with strict data separation. No cross-tenant data leakage at any layer.
Start with the serverless tier (pay per query) for development. Migrate to dedicated nodes for predictable latency and higher QPS guarantees in production.
| Capability | Specification |
| --- | --- |
| Index type | HNSW (GPU-accelerated via cuVS) |
| Scale | Billion-vector scale per namespace |
| Query latency | < 1 ms (p99, dedicated tier) |
| Hybrid search | BM25 + dense, RRF fusion |
| Real-time upserts | Yes, queryable in < 10 ms |
| Embedding dimensions | Up to 65,536 |
| Metadata filtering | JSON, pre-filter before ANN |
| Multi-tenancy | Namespace-level isolation |
Vector Store is currently planned, with an estimated launch in Q3 2026.
Pricing is not yet finalized; plans will be tailored to each team.
CogniCloud is in active development. Join the waitlist for early access and updates on our roadmap.
No spam. No pricing pitches. We reach out personally to discuss your use case.