Hyperspace Pods
A pod is the team primitive of Hyperspace: identity at <slug>.hyperspace.sh, plus an optional P2P GPU pool, an optional BYOK $ treasury, and an optional Hyperspace Go credit pool. Four layers, one slug. Slack-style visibility. One auth. One bill.
Free when local. A team of five paying for cloud AI burns $500–2,000/month on API calls. The same team's existing machines can serve competitive open models for free — the hardware is already on their desks. When a query genuinely needs a frontier model, the pod falls back to cloud at wholesale rates from a shared treasury.
No middleman. Prompts travel from your IDE to your pod members' hardware and back. No server in between reading your data. Pod state is replicated across your own machines using Raft consensus.
Automatic sharding. Tell the pod which model you want. It figures out how to split it across whatever hardware is online. No configuring layer ranges or calculating VRAM budgets.
Real NAT traversal. Friend behind a home router with a dynamic IP? Works. No VPN, no Tailscale, no port forwarding.
A pod is a private cluster of 2-10 trusted devices that pool their compute, models, and credits. Create a pod, invite your devices or friends, and collectively run models that no single device could handle alone.
Local (default) — Raft consensus across member devices. Works fully offline. State in ~/.hyperspace/pod.json. No cloud dependency.
Cloud — Create with pod create <name> --cloud. Adds a web dashboard, Drive (shared filesystem + vector search), marketplace listing, and a persistent coordinator on Hyperspace infrastructure.
Distributed Inference
Split 70B+ models across GPUs. Pipeline parallelism across the mesh.
Shared Providers
OpenRouter, Groq, Together, Fireworks, DeepInfra, xAI, Google, Mistral, Cohere, Anthropic, OpenAI, Vercel.
OpenAI-Compatible API
pk_* keys authenticate against /v1/chat/completions. Works with any OpenAI-compatible client.
Always-On Agent VM
Persistent daemon on cloud VM. 9 providers supported. BYOK model.
Drive
Shared filesystem + vector search. S3-compatible backends. PDF/Word extraction.
Custom Domains
<slug>.hyperspace.sh auto-provisioned. CNAME for your own domain.
Layers 2–4 can each be enabled independently on top of the identity base. A solo developer might use only the Hyperspace Go pool (layer 4); a community runs 1 + 4; a team running its own GPU cluster runs all four. The same pods row in Postgres carries metadata for every layer, members come from one membership list, and billing flows through two parallel pool RPCs that don't conflict.
1. Identity & access (required)
Slug, member roster, admin, Slack-style visibility, multi-use invitations with email-lock, max-uses, expiry, and revocation.
2. P2P GPU compute (optional)
Member machines join via libp2p. pod-raft consensus, capability heartbeats, layer-sharding, federation across peer pods via x402.
3. Inference billing ($) (optional)
OpenAI-compat gateway, BYOK provider keys, Stripe-loaded treasury, per-call $ accounting, daily/monthly caps, per-member sub-caps.
4. Hyperspace Go pool (optional)
Flat $10/mo team subscription. Pooled credits on a 5h sliding window. 14 curated open-weight models. Per-member sub-caps. Stacks with BYOK.
The gateway picks the cheapest path automatically: P2P first (free if a member has the model), BYOK next (your provider, your $), Hyperspace Go last (pool credits). Federation lets pods burst to peer pods via x402 micropayments when no member has the model.
Visibility is set per pod. Members see the full pod regardless of the setting; what a stranger sees depends on the visibility flag. Backwards compatibility: pods that pre-date migration 047 had their visibility backfilled from the older boolean flags.
| Visibility | Access gate | Listed? | Behavior to a stranger |
|---|---|---|---|
| public | no auth required | yes | Anyone with the URL can read pod-shell content |
| unlisted | no auth required | no | Public if you have the URL — not in directories |
| members_only | sign-in + must be a member | yes | Stranger sees a sign-in / invite-redeem gate |
| invite_only | sign-in + must be a member | no | Non-members see a 404 — pod behaves like it doesn't exist |
Visibility is enforced at app/pod/[slug]/page.jsx server-side and at every pod-scoped API route. Toggle from the pod admin panel or via hyperspace pod-cloud visibility <mode>.
Pricing is per pod, not per user. Members are bundled. The Hyperspace Go pool stacks on top of layers 1–3 — you can run BYOK and Go together.
| Plan | Price | Members | Hyperspace Go pool (5h sliding window) | Curated catalog |
|---|---|---|---|---|
| Pod Starter | $0 | up to 5 | — | BYOK only — bring your own keys |
| Pod Team | $10/mo | up to 10 | 10,200 credits / 5h | 14 open-weight models, pooled across team |
| Pod Pro | $30/mo | unlimited | 30,000 credits / 5h | + frontier (Claude Opus 4.7, Sonnet 4.6, GPT 5.5, Gemini 3.1 Pro) |
Each model has a class; class determines credits-per-request. Heavy frontier models cost more per call than the cheap classes — this is what makes pooling fair.
| Class | Credits/req | Reqs/5h on Team | Examples |
|---|---|---|---|
| heavy | 11.6 | 880 | GLM-5.1, Kimi K2.6, DeepSeek V4 Pro, Claude Opus 4.7, GPT 5.5 |
| medium | 7.9 | 1,290 | MiMo V2 Pro, Claude Sonnet 4.6, Gemini 3.1 Pro |
| light | 3.1 | 3,300 | Qwen 3.6 Plus, MiMo V2.5 |
| cheap | 1.0 | 10,200 | MiniMax M2.5/2.7, DeepSeek V4 Flash |
Pod admins set per-member sub-caps so one member can't drain the team pool. The hg_consume_pod RPC checks both the pool window and the member's individual cap atomically before each call.
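As a concrete sketch (the email address is illustrative), a cap set from the CLI bounds what one member can draw from the shared window:

```bash
# Cap one member at 1,000 Hyperspace Go credits per 5h window.
# At heavy-class pricing (11.6 credits/req) that is ~86 calls;
# at cheap-class pricing (1.0 credit/req) it is 1,000 calls.
hyperspace pod-cloud cap junior@example.com 1000
```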
Two ways to start: CLI-first (create locally, link to the cloud later) or web-first (the admin creates the pod on the web and members join via link). All commands support --json for machine-readable output.
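A minimal CLI-first session, using commands from the reference tables below (the pod name is a placeholder):

```bash
# Admin: create a local pod and mint a 7-day invite.
hyperspace pod create acme-lab
hyperspace pod invite --ttl 7d       # prints an hsi_v1.* token

# Each member: redeem the token, then inspect the mesh.
hyperspace pod join <token>
hyperspace pod status --json         # online nodes, VRAM, models
```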
| Command | Description |
|---|---|
| pod create <name> | Create a new pod (you become owner) |
| pod create <name> --cloud | Create with cloud dashboard, Drive, and marketplace |
| pod join <token> | Join via invite token |
| pod leave | Leave current pod |
| pod status | Online nodes, VRAM, models |
| pod members | List members with roles and online status |
| pod invite | Generate shareable invite (hsi_v1.* token) |
| pod invite --role admin | Invite with admin privileges |
| pod invite --ttl 7d | Custom expiry (default 24h) |
| pod invite --multi-use | Reusable invite link |
| pod models | List models available across all members |
| pod resources | Per-node VRAM/RAM breakdown |
| pod gateway | Show gateway URL, port, and connection info |
| Command | Description |
|---|---|
| coord status | Raft leader, term, commit index, cluster health |
| coord members | Raft cluster membership and voter status |
| coord balance | Treasury balance for current member |
| coord ledger | Full transaction history |
| coord mint <amount> | Mint credits into treasury (owner only) |
| coord revoke <member> | Revoke member access |
| coord transfer <to> <amount> | Transfer credits between members |
| coord credit <member> <amount> | Credit member balance (admin) |
| coord invite | Generate Raft-level invite token |
| coord redeem <token> | Redeem a Raft invite token |
| coord join-cluster <addr> | Join an existing Raft cluster |
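A typical treasury sequence, assuming the owner seeds credits and then funds a member (the member handle and amounts are illustrative):

```bash
# Owner: mint credits into the treasury.
hyperspace coord mint 5000

# Credit a member's balance (admin), then verify.
hyperspace coord credit alice 1500
hyperspace coord balance             # balance for the current member
hyperspace coord ledger              # full transaction history
```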
| Command | Description |
|---|---|
| models pull hf:<repo> --quant Q8_0 | Download from HuggingFace with quantization |
| models pull --auto | Auto-select best model for your GPU |
| models register <path> --id <name> | Register a local GGUF file |
| models list | Catalog + registered + discovered models |
| models downloaded | Show models downloaded locally |
| Command | Description |
|---|---|
| pod shard <model> | Distribute model across pod nodes |
| pod shard <model> --force | Force a multi-node shard (even if the model fits on one node) |
| pod shard <model> --dry-run | Preview shard plan without activating |
| pod dissolve | Dissolve the active shard ring |
| pod resources | Per-node VRAM/RAM breakdown with shard status |
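Sharding is usually a preview-then-activate flow (the model id is a placeholder):

```bash
# Inspect the shard plan without touching the ring...
hyperspace pod shard <model> --dry-run

# ...then distribute the model across the online nodes.
hyperspace pod shard <model>

# Tear the ring down when finished.
hyperspace pod dissolve
```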
| Command | Description |
|---|---|
| pod keys create --name <n> | Generate pk_* key |
| pod keys create --name <n> --scopes <models> | Key restricted to specific models |
| pod keys list | List keys with usage stats |
| pod keys revoke <id> | Revoke a key |
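Once a key is minted, any OpenAI-compatible client can talk to the gateway. A minimal curl sketch (the gateway address matches the Vega reference deployment below; the key and model id are placeholders):

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model-id>",
    "messages": [{"role": "user", "content": "Hello from the pod."}]
  }'
```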
These commands drive the cloud-side pod row at <slug>.hyperspace.sh (visibility, Hyperspace Go subscription, per-member credit caps). They differ from coord commands, which drive Raft state on member machines.
| Command | Description |
|---|---|
| pod-cloud visibility <mode> | Set public, unlisted, members_only, or invite_only |
| pod-cloud subscribe | Attach a Hyperspace Go subscription to the pod ($10/mo, pooled) |
| pod-cloud members | List cloud-side membership with sub-caps |
| pod-cloud invite [--email] [--role] [--ttl] | Mint a multi-use invitation (logged in pod_invitations) |
| pod-cloud cap <email> <credits> | Set a per-member 5h cap (e.g. cap a junior at 1,000) |
Route requests to cloud providers when local inference is unavailable or a specific model is needed. Two modes: BYOK (member supplies their own API key) and Funded (pod treasury pays).
| Provider | BYOK | Funded |
|---|---|---|
| OpenRouter | Yes | Yes |
| Groq | Yes | Yes |
| Together | Yes | Yes |
| Fireworks | Yes | Yes |
| DeepInfra | Yes | Yes |
| xAI | Yes | Yes |
| Google | Yes | Yes |
| Mistral | Yes | Yes |
| Cohere | Yes | Yes |
| Anthropic | Yes | Yes |
| OpenAI | Yes | Yes |
| Vercel AI Gateway | Yes | No |
Control how much each member can spend on cloud inference from the treasury.
| Budget Type | Description |
|---|---|
| percent | Percentage of treasury (e.g. 10% each for 10 members) |
| fixed_daily | Fixed daily credit limit per member |
| fixed_monthly | Fixed monthly credit limit per member |
| unlimited | No limit (default for owner) |
Every cloud inference call runs check_and_reserve_budget before dispatching to the provider. Over-budget requests return 429 Too Many Requests.
The gateway routes every inference request through a priority chain (P2P → BYOK → Hyperspace Go → federation). The first tier that can serve the request wins.
Budget enforcement applies at every level. Even BYOK calls are logged and count toward usage analytics. Funded calls are blocked if the member exceeds their budget allocation.
Shared filesystem with automatic text extraction and vector search. Every file uploaded to Drive is indexed and searchable by any pod member.
S3-Compatible
Cloudflare R2, AWS S3, or GCS as backend. Local fallback for offline pods.
Text Extraction
PDF, Word (.docx), and plaintext files automatically parsed on upload.
Vector Search
Cosine similarity across all indexed documents. Embeddings computed on upload.
| Model | Dimensions | Source |
|---|---|---|
| GTE-small | 384 | Built-in (default) |
| Ollama embeddings | 768 | Local Ollama instance |
| OpenAI text-embedding-3-small | 1536 | Cloud (BYOK or funded) |
| Plan | Storage | Max Files |
|---|---|---|
| Free | 10 GB | 500 |
| Pro | 100 GB | 5,000 |
| Ultra | 1 TB | 50,000 |
An always-on agent daemon running on a cloud VM. The pod provisions and manages the lifecycle automatically — you choose the provider and model.
| Provider | Notes |
|---|---|
| Oracle Cloud | Free tier (ARM A1, 4 OCPU, 24GB RAM) |
| Scaleway | Stardust/DEV1-S instances |
| Fly.io | Shared-cpu, performance-2x |
| Vercel | Serverless functions (cron-based) |
| Vultr | Cloud compute (vc2) instances |
| AWS Lightsail | $3.50/mo nano instances |
| DigitalOcean | Basic droplets |
| Linode | Nanode 1GB instances |
| Hetzner | CX11 / CAX11 (ARM) instances |
Provisioning picks the cheapest available instance that matches your requirements, using your BYOK credentials for the cloud provider.
Deploy lightweight services that run alongside your pod. Services are accessible from the pod dashboard and via your custom domain.
| Runtime | Description |
|---|---|
| python | Python 3.11+ with pip dependencies |
| node | Node.js 20+ with npm/pnpm |
| docker | Any Docker image |
| shell | Bash script executed on cron or webhook |
| static | Static HTML/CSS/JS served from Drive |
Services are deployed from the pod web dashboard or via the CLI. Each service gets a URL at <slug>.hyperspace.sh/apps/<name>.
Sync external data sources into Pod Drive. Connectors run on a schedule and automatically index new content for vector search.
| Source | Description |
|---|---|
| Google Drive | Sync folders or shared drives |
| Notion | Sync pages and databases |
| GitHub | Sync repository files (code, docs, issues) |
| Slack | Sync channel messages and threads |
| Web Crawl | Crawl and index a URL or sitemap |
| S3 Bucket | Mirror an existing S3/R2/GCS bucket |
Every pod gets a <slug>.hyperspace.sh subdomain auto-provisioned via Cloudflare Worker. You can also bring your own domain.
| Path | Description |
|---|---|
| / | Pod dashboard (status, members, models) |
| /files | Drive file browser and search |
| /apps/<name> | Deployed services |
| /webhook/<path> | Incoming webhook endpoints |
| /agents/* | Agent API and status pages |
Reusable agent recipes. A template bundles a model configuration, system prompt, tools, and services into a shareable package.
| Visibility | Description |
|---|---|
| public | Listed in the Hyperspace marketplace, anyone can clone |
| unlisted | Accessible via direct link, not listed in marketplace |
| private | Only visible to pod members |
Pods can form bilateral trust relationships with other pods. Federation extends your pod's capabilities without merging membership.
Shared Model Lists
Allied pods advertise their available models. Requests route across the alliance when local capacity is full.
Shared Credit Pool
Optional. Contribute credits to a shared pool that any allied pod can draw from for cloud inference.
Public Offerings
Advertise model access at a per-request rate. Other pods pay from their treasury to use your hardware.
Alliances use Ed25519 federation envelopes — signed messages with a 5-minute TTL to prevent replay attacks. Trust is bilateral: both pod owners must approve. Duration is configurable (default: 30 days, renewable).
Every state change is an Ed25519-signed command replicated through Raft. No single member can forge transactions.
| Operation | Authority |
|---|---|
| member.join | Admin/owner (or genesis) |
| member.leave | Self or owner |
| key.mint | Admin/owner (leader-signed) |
| treasury.transfer | Self (from own balance) |
| key.revoke | Admin/owner or key creator |
| budget.set | Admin/owner |
| federation.propose | Owner |
| federation.accept | Owner |
Every command is wrapped in a signed envelope before Raft replication:
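The exact wire format is out of scope here; an illustrative shape, with field names assumed rather than taken from the implementation, looks like:

```jsonc
{
  "namespace": "pod/acme-lab",        // domain-separation scope
  "command": { "type": "treasury.transfer", "to": "alice", "amount": 100 },
  "nonce": "unique-per-command",      // replay protection, 24h retention
  "timestamp": 1730000000,            // checked against cluster time (5 min drift window)
  "signer": "<ed25519 public key>",
  "signature": "<ed25519 signature over the envelope>"
}
```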
Replay protection: unique nonce (24h retention) + timestamp drift check (5 min window vs cluster time). Domain separation: namespace-scoped signing prevents cross-topic replay of GossipSub messages.
Hyperspace provides a native MCP server for Claude Code. This lets Claude Code manage your pod, run inference, and use all 30+ tools without leaving the conversation.
Once registered, Claude Code can call tools like pod_status, pod_infer, models_pull, pod_shard_model, and pod_create_api_key directly.
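Registration goes through Claude Code's own MCP tooling; the server launch command below is a placeholder, so substitute whatever your install documents:

```bash
# Register the pod's MCP server with Claude Code.
# "hyperspace mcp" is assumed, not confirmed; check your install.
claude mcp add hyperspace -- hyperspace mcp
```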
You can create a Claude Code agent that uses your pod as its inference backend. This gives you a sub-agent powered by your own hardware.
If you add the agent to your Claude Code config, you can invoke it as a slash command.
| Device | Role | GPU | Status |
|---|---|---|---|
| NVIDIA H100 | Inference primary | 80GB HBM3 | online ~44 tok/s |
| Mac Studio M3 Max | Inference secondary | 48GB unified | online |
| DO Droplet | Coordinator | 8GB RAM | Raft leader |
| Setting | Value |
|---|---|
| Model | Qwen3.6-27B Q8_0 (28.6GB) |
| Engine | llama.cpp + CUDA (H100), MLX (Mac) |
| Throughput | ~44 tok/s on CUDA (single node) |
| API Key | pk_a2cf2afb69928fde5abc675d862c8da4ce01ce990e39784c |
| Gateway | localhost:8080/v1 |
| Dashboard | vega.hyperspace.sh |
The Vega pod is our reference deployment for testing three scenarios:
1. Cross-device sharding. Split Qwen3.6-27B BF16 (53.8GB) across H100 + Mac M3 Max using pipeline parallelism. The smart shard planner assigns layers in proportion to each device's usable memory.
2. Benchmark sharded vs single-node. Compare latency and throughput when the model runs entirely on the H100 (Q8_0, fits in 80GB) versus sharded BF16 across two devices. Measure activation transfer overhead via libp2p.
3. Claude Code agent integration. Run a production Claude Code agent that routes coding queries through the Vega gateway. Qwen3.6-27B with thinking mode provides reasoning-heavy code generation at zero external API cost.
Pods run any GGUF model from HuggingFace. Pull with an explicit quantization, or let the CLI pick for you:
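```bash
# Download a specific quantization from a HuggingFace GGUF repo.
hyperspace models pull hf:<repo> --quant Q8_0

# Or auto-select the best model for the local GPU.
hyperspace models pull --auto

# Confirm what is on disk.
hyperspace models downloaded
```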
| Model | Params | BF16 | Q8_0 | Min VRAM |
|---|---|---|---|---|
| Qwen 3.6 27B | 27B | 53.8 GB | 28.6 GB | 32 GB |
| Llama 3.1 70B | 70B | 140 GB | 74 GB | 80 GB |
| Qwen 2.5 72B | 72B | 144 GB | 76 GB | 80 GB |
| Gemma 3 27B | 27B | 54 GB | 28 GB | 28 GB |
Thinking models (Qwen3.6): set max_tokens: 8000+ to allow room for internal reasoning. The gateway preserves reasoning_content in responses.
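A sketch of a thinking-model call through the gateway (the key and model id are placeholders; the response carries reasoning_content alongside the usual message content):

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer pk_<your_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<thinking-model-id>",
    "max_tokens": 8000,
    "messages": [{"role": "user", "content": "Plan a refactor of the auth module."}]
  }'
```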
- Four-layer architecture — identity / P2P compute / BYOK $ billing / Hyperspace Go pool, each independently optional, all on the same pod row
- Slack-style visibility — public · unlisted · members_only · invite_only; non-members of invite_only see a 404
- Hyperspace Go pool — pod-pooled $10/mo team subscription, 10,200 credits/5h sliding window, 14 curated open-weight models
- Per-member caps — go_credit_cap_5h on pod_members, atomic enforcement via hg_consume_pod RPC
- Multi-use invitations — pod_invitations ledger with email-lock, max-uses, expiry, revocation
- Pod Pro tier — 30,000 credits/5h, unlimited members, frontier models (Claude Opus 4.7, GPT 5.5, Gemini 3.1 Pro)
- Pod Drive — S3-compatible shared filesystem with vector search
- Connectors — sync GitHub, Notion, Google Drive into Drive
- Custom domains — <slug>.hyperspace.sh auto-provisioned via Cloudflare
- Fix auto-update re-exec to use installed binary path
- Fix Vercel deploy-download CI step
- Direct llama-server proxy — preserves reasoning_content for thinking models
- Default context increased to 16k for registered models
- CUDA build-from-source — auto-builds llama.cpp with CUDA when toolkit detected
- HuggingFace model pull — models pull hf:<repo> --quant BF16
- Pod identity mapping — pod-raft member IDs ↔ libp2p PeerIDs for shard planning
- Dynamic model registry — ~/.hyperspace/models.json
- Qwen3.6-27B added to catalog and shard architecture map
- Pod ring runtime initializes on restart (was only during pod join)
- --force flag for pod shard
- Pod keys create for local pods — leader-signed /api/keys/mint endpoint
- Smart shard planner (VRAM-weighted, MLX + llama-cpp-rpc)
- Pod-raft Go binary (Raft consensus, Ed25519-signed commands)
- Invite system (hsi_v1 tokens), treasury + ledger
- Sidecar architecture (network/inference/pulse/matrix isolation)
- Bootstrap migration to DigitalOcean (6 nodes)
- Initial pod-raft prototype (membership, invites)
- Pod concept introduced in CLI
- libp2p mesh integration for shard streams
Full changelog: changelog.hyper.space