Everything you need to build, fine-tune, and serve AI models on CogniCloud — from first deployment to production scale.
Getting Started
Install the SDK, create an API key, and deploy any open-source model. The API is OpenAI-compatible: if you've used the OpenAI SDK, you already know CogniCloud.
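Because the API follows the OpenAI wire format, you can also talk to it with a plain HTTPS request. Below is a minimal sketch of the request the SDK builds for a chat completion; the base URL is a placeholder for illustration, not a documented endpoint.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.cognicloud.example/v1"  # hypothetical endpoint
API_KEY = "ck_live_..."

# The same JSON body an OpenAI-compatible SDK sends under the hood.
payload = {
    "model": "meta-llama/Llama-3-70B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urlopen(req) would return an OpenAI-style chat completion response.
```

Any OpenAI-compatible client library can be pointed at the same endpoint by overriding its base URL.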
# pip install cognicloud
import cognicloud as cogni

cogni.api_key = "ck_live_..."

# Deploy any Hugging Face model
deployment = cogni.Inference.create(
    model="meta-llama/Llama-3-70B-Instruct",
    hardware="high-perf",
)

# OpenAI-compatible chat with streaming
response = cogni.chat.completions.create(
    model="meta-llama/Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    # The final streamed chunk carries no content, so guard against None
    print(chunk.choices[0].delta.content or "", end="")

What goes where
Our documentation is split by audience and task. Use this guide to find the right section for what you're trying to do.
First-time setup: API keys, SDK install, CLI config, and your first deployment. Start here if you're new.
Architecture, instance types, networking, storage, regions. Use this to understand how CogniCloud works.
GPU Compute, Inference Gateway, Neural Cache, Vector Store, Training Jobs, Global Edge. One section per product.
REST endpoints, authentication, Python/Node SDKs, CLI commands. For integrating CogniCloud into your stack.
Data residency, SOC 2, HIPAA, VPC peering, RBAC. For security reviews and enterprise procurement.
Step-by-step guides: fine-tune Llama, build RAG, migrate from OpenAI. Hands-on walkthroughs.
Documentation Structure
Full reference for every API, SDK, CLI command, and concept. All sections below are in progress; sign up and we'll email you as each one ships, starting with Getting Started and API Reference.
Documentation by product