Umans Code User Guide
Keep your agent working. All day.
The best open-source coding models with ultra-generous usage limits. Works with any tool where you can set a base URL and API key: Claude Code, Cursor, Zed, Copilot, OpenCode, Kilo Code, Pi, and more. Currently serving Kimi K2.6 and GLM 5.1.
We run frontier open-weight models on our own GPU infrastructure. Same SOTA coding quality as the closed labs, without the lock-in. Switch in a minute. We don't train on your data.
We publish our model tests and reviews at blog.umans.ai.
Quick Start (Recommended)
The fastest way to get started is with the Umans CLI. It handles authentication and launches Claude Code with zero configuration.
macOS & Linux
# Install the CLI (one-time)
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Launch Claude Code with Umans backend
umans claude

Video Demo: CLI with Claude Code walkthrough
First run: The CLI opens your browser for authentication. Log in to your Umans account, and the CLI automatically receives your API key. Claude Code launches immediately with the Umans backend configured.
Subsequent runs: umans claude launches instantly using your saved credentials.
CLI Commands
umans claude # Launch Claude Code (default: umans-coder)
umans claude --model umans-kimi-k2.6 # Use native Kimi K2.6 (next default after validation)
umans claude --model umans-glm-5.1 # Use GLM 5.1 with vision handoff
umans claude --websearch native # Gateway websearch: Kimi-backed path
umans claude --websearch exa # Gateway websearch: Exa-backed path
umans opencode # Launch OpenCode with Umans backend
umans opencode --model umans-glm-5.1 # Use GLM 5.1 on OpenCode
umans status # Check authentication status
umans logout # Remove saved credentials
umans --help # Show all available commands

Manual Configuration (Alternative)
If the CLI does not work for your setup (Windows users, custom environments) or you prefer to configure tools manually, use these settings:
API Endpoint
| Setting | Value |
|---|---|
| Base URL | https://api.code.umans.ai |
| Anthropic Endpoint | https://api.code.umans.ai/v1/messages |
| OpenAI Endpoint | https://api.code.umans.ai/v1/chat/completions |
| Model Name | umans-coder |
Getting Your API Key
- Log in to app.umans.ai/billing
- Go to your Dashboard → API Keys
- Generate a new key (shown only once - copy it immediately)
Tool-Specific Setup
Claude Code Official Docs →
Using the CLI (Recommended):
umans claude # Default: umans-coder (Kimi K2.5)
umans claude --model umans-kimi-k2.6 # Use native Kimi K2.6 (next default after validation)
umans claude --model umans-glm-5.1 # GLM 5.1 with vision handoff
umans claude --websearch native # Gateway websearch: Kimi-backed path
umans claude --websearch exa # Gateway websearch: Exa-backed path

--websearch selects the backend for gateway-owned websearch paths: native for the Kimi-backed search path, exa for the Exa-backed path. Applies only to umans claude.
Available Models:
| Model | Provider | Capabilities | Best For |
|---|---|---|---|
| umans-coder | Kimi K2.5* | Text, Vision, WebSearch | Default — we choose the best for you |
| umans-kimi-k2.5 | Kimi K2.5 | Text, Vision, WebSearch | When you specifically want Kimi |
| umans-kimi-k2.6 | Kimi K2.6 | Text, Vision, WebSearch | When you want the latest Kimi; becomes the default after validation |
| umans-glm-5.1 | GLM 5.1 | Text, Vision (handoff), WebSearch | GLM-style coding and text-first workflows |
* Today, umans-coder routes to Kimi K2.5. This may change as we continuously evaluate models. See our model selection methodology at blog.umans.ai.
Manual configuration:
export ANTHROPIC_BASE_URL=https://api.code.umans.ai
export ANTHROPIC_AUTH_TOKEN=sk-your-umans-api-key
claude --model umans-coder

OpenCode Official Docs →
Using the CLI (Recommended):
umans opencode # Default: umans-coder
umans opencode --model umans-kimi-k2.5 # Use native Kimi K2.5
umans opencode --model umans-kimi-k2.6 # Use native Kimi K2.6
umans opencode --model umans-glm-5.1 # Use GLM 5.1

umans opencode --setup writes the Umans provider config into OpenCode's global config (~/.config/opencode/opencode.json, or $XDG_CONFIG_HOME/opencode/opencode.json if set), so OpenCode Desktop picks up the same models as the CLI.
Manual configuration (add to ~/.config/opencode/opencode.json):
{
"$schema": "https://opencode.ai/config.json",
"model": "umans/umans-coder",
"provider": {
"umans": {
"npm": "@ai-sdk/openai-compatible",
"name": "Umans AI",
"options": {
"baseURL": "https://api.code.umans.ai/v1",
"apiKey": "sk-your-umans-api-key"
},
"models": {
"umans-coder": {
"id": "umans-coder",
"name": "Umans Coder",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-kimi-k2.5": {
"id": "umans-kimi-k2.5",
"name": "Umans Kimi K2.5",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-kimi-k2.6": {
"id": "umans-kimi-k2.6",
"name": "Umans Kimi K2.6",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-glm-5.1": {
"id": "umans-glm-5.1",
"name": "Umans GLM 5.1",
"modalities": { "input": ["text"], "output": ["text"] }
}
}
}
}
}

Cursor IDE Official Docs →
Video Demo: Setting up Cursor with Umans Code
- Open Cursor Settings → Models
- Enable Override OpenAI Base URL
- Set the base URL to: https://api.code.umans.ai/v1
- Paste your Umans API key in the API key field
- Add one or more custom models: umans-coder (default), umans-kimi-k2.5, umans-kimi-k2.6, umans-glm-5.1 (text-only on this route)
- Select the model you want in the model dropdown
Zed Official Docs →
Video Demo: Setting up Zed with Umans Code
Zed supports custom OpenAI-compatible providers. Configure it with the same base URL and API key you'd use for any BYOK tool:
- Base URL: https://api.code.umans.ai/v1
- API Key: Your Umans API key
- Model: umans-coder, umans-kimi-k2.5, umans-kimi-k2.6, or umans-glm-5.1
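If you prefer editing settings.json directly, here is a sketch of the provider block. The field names follow Zed's openai_compatible provider schema as we understand it and may change between Zed releases; the max_tokens value is a placeholder. Verify against Zed's official docs before relying on it:

```json
{
  "language_models": {
    "openai_compatible": {
      "Umans": {
        "api_url": "https://api.code.umans.ai/v1",
        "available_models": [
          {
            "name": "umans-coder",
            "display_name": "Umans Coder",
            "max_tokens": 128000
          }
        ]
      }
    }
  }
}
```

Zed typically prompts for the API key in the Agent panel the first time you select the provider, rather than storing it in settings.json.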
Crush (Charm Bracelet) Official Docs →
Add to your Crush configuration (~/.config/crush/config.json):
{
"$schema": "https://charm.land/crush.json",
"providers": {
"umans": {
"type": "anthropic",
"base_url": "https://api.code.umans.ai",
"api_key": "sk-your-umans-api-key",
"models": [
{
"id": "umans-coder",
"name": "Umans Coder",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-kimi-k2.5",
"name": "Umans Kimi K2.5",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-kimi-k2.6",
"name": "Umans Kimi K2.6",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-glm-5.1",
"name": "Umans GLM 5.1",
"default_max_tokens": 50000,
"can_reason": true
}
]
}
}
}

Pi Extension →
Pi has a dedicated Umans provider extension that makes integration effortless. No manual base URL or env-var fiddling. Install it, sign in, paste your key, and you're done.
- Install the pi-provider-umans extension from the Pi package registry.
- Run /login in Pi.
- Choose your Umans subscription when prompted.
- Paste your Umans API key (from app.umans.ai/billing → API Keys).
- Voilà. Pi is wired to the Umans backend.
Big thanks to @karutoil for the pi-provider-umans extension and the lovely open-source collaboration. We're lucky to keep building this together. ❤️
Any BYOK Tool
Umans Code exposes both OpenAI-compatible and Anthropic-compatible endpoints. If your tool lets you set a custom base URL and API key, it works. Configure with:
- Base URL: https://api.code.umans.ai/v1 (OpenAI-compatible) or https://api.code.umans.ai (Anthropic-compatible)
- API Key: Your Umans API key (starts with sk-)
- Model: umans-coder (default), umans-kimi-k2.5, umans-kimi-k2.6, or umans-glm-5.1
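As a concrete sketch, any OpenAI-compatible client can reach the gateway with nothing but the Python standard library. The endpoint and model name come from the values above; the UMANS_API_KEY environment variable is a convention for this example only:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("UMANS_API_KEY", "sk-your-umans-api-key")

def chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style Chat Completions request for the Umans gateway."""
    payload = {
        "model": "umans-coder",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        "https://api.code.umans.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = chat_request("Hello!")
if os.environ.get("UMANS_API_KEY"):  # only send when a real key is configured
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Swapping the URL for https://api.code.umans.ai/v1/messages (with x-api-key and anthropic-version headers) gives you the Anthropic-compatible equivalent.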
Desktop App with Umans
Prefer a GUI over the terminal? Try OpenCode Desktop. It is open-source and talks to any OpenAI-compatible backend, which is the kind of tool we enjoy working with.
The models we serve are open-weight too, so your whole stack stays open and auditable. Nothing here is designed to lock you in.
One-command setup
The fastest path: run the CLI setup once and OpenCode Desktop picks up the Umans provider automatically.
# Install the Umans CLI (one-time)
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Write the Umans provider into OpenCode's global config
umans opencode --setup

This writes to ~/.config/opencode/opencode.json (or $XDG_CONFIG_HOME/opencode/opencode.json if set). Launch OpenCode Desktop and select any umans/* model from the model picker.
Manual configuration
If you'd rather edit the config file yourself, use the JSON snippet from the OpenCode setup section above.
Code from Your Phone
Keep a long-running agent working on your main machine and drive it from your phone over a secure private network. The stack is four pieces that each do one thing well:
- tmux: persistent terminal sessions that survive disconnects.
- Tailscale: zero-config mesh VPN; your machine gets a private 100.x.x.x IP reachable only from your own devices. No port-forwarding.
- Termius: mobile SSH client with proper key handling and a touchscreen-friendly terminal.
- umans claude: your agent, running inside the tmux session.
Based on Emre Işık's walkthrough: Code from your phone like a boss.
1. Prepare your machine (macOS / Linux)
# Install tmux, Tailscale, and the Umans CLI
brew install tmux tailscale
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Sign in to Tailscale and note your IP (100.x.x.x)
sudo tailscale up
tailscale ip -4

Sane tmux defaults (optional):
cat > ~/.tmux.conf <<'EOF'
set -s escape-time 1
set -g mouse on
set -g default-terminal "screen-256color"
set -g history-limit 10000
set -g base-index 1
setw -g pane-base-index 1
EOF

2. Enable SSH
- macOS: System Settings → General → Sharing → enable Remote Login.
- Linux: sudo systemctl enable --now ssh.
- Verify locally: ssh yourusername@localhost.
3. Add a reattach-or-create helper
Add this to ~/.zshrc (or ~/.bashrc) so one command always drops you back into the same session:
function umans-tmux() {
if tmux has-session -t umans 2>/dev/null; then
tmux attach -t umans
else
tmux new -s umans 'umans claude'
fi
}

4. Connect from your phone
- Install the Tailscale app on your phone and sign in with the same account.
- Install Termius (or any SSH client you prefer).
- Add a new host in Termius:
  - Host: your Tailscale IP (100.x.x.x)
  - Username: your machine username
  - Port: 22
- Connect, then run umans-tmux.
5. Survive every disconnect
- Ctrl+b then d detaches the session. The agent keeps running.
- Reconnect later from anywhere and run umans-tmux again to reattach exactly where you left off.
- tmux list-sessions shows every session, including detached ones.
Personal Assistants (Telegram, WhatsApp, Discord)
Drive Umans from the chat apps you already live in. Tools like AutoClaw expose Telegram, WhatsApp, and Discord channels that you can wire up to any OpenAI-compatible model, including Umans.
Heads up: these assistants run on your machine and act on messages you send (or receive). Review the permissions you grant to each channel carefully: anything the assistant can do, the bridged chat can trigger.
Configure Umans as the model
Open the app's Settings → Models & API, click Add Model, and fill in:
| Field | Value |
|---|---|
| Provider | Custom |
| Model ID | umans-coder |
| Display Name | umans-coder |
| API Key | sk-your-umans-api-key |
| API Protocol | OpenAI |
| Base URL | https://api.code.umans.ai/v1 |
Click Connectivity Test to verify the endpoint, then Save. Any Umans model works here: swap umans-coder for umans-kimi-k2.6 or umans-glm-5.1 if you prefer.
Connect a channel
Follow the app's own instructions to connect Telegram, WhatsApp, or Discord. Once a channel is linked and Umans is selected as the model, messages you send in that channel are handled by the assistant running locally on your machine.
API Reference
Anthropic-Compatible Endpoints
Umans Code implements the Anthropic Messages API.
POST /v1/messages
curl -N -X POST https://api.code.umans.ai/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: sk-your-umans-api-key" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "umans-coder",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 4096,
"stream": true
}'

OpenAI-Compatible Endpoints
Umans Code also implements the OpenAI Chat Completions API.
POST /v1/chat/completions
curl -N -X POST https://api.code.umans.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-umans-api-key" \
-d '{
"model": "umans-coder",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'

Models
Our Philosophy: We believe in serving the best open-source models available. We continuously evaluate and filter models to ensure your agents stay productive all day—without the decision fatigue of choosing between dozens of options.
Available Models
| Model | Base | Best For | Trade-off |
|---|---|---|---|
| umans-coder | Kimi K2.5* | Default — we choose the best for you | Routes to our top pick (may change over time) |
| umans-kimi-k2.5 | Kimi K2.5 | When you specifically want Kimi | Zero overhead, native multimodal |
| umans-kimi-k2.6 | Kimi K2.6 | When you want the latest Kimi | Becomes the default once validation is complete |
| umans-glm-5.1 | GLM 5.1 | GLM-style coding and text-first workflows | Vision via smart composition; OpenAI-compatible route stays text-only |
* Today, umans-coder routes to Kimi K2.5. This may change as we continuously evaluate models. Read more at blog.umans.ai.
How to Choose
- Use
umans-coder(default) to let us choose the best model for you. We continuously evaluate and select what works best for most coding tasks. Today this is Kimi K2.5. - Use
umans-kimi-k2.5when you specifically want native Kimi K2.5. Overall best experience. It leads benchmarks on vision-heavy workflows and document understanding tasks. - Use
umans-kimi-k2.6when you specifically want native Kimi K2.6. It becomes the default once validation is complete. - Use
umans-glm-5.1when you want GLM 5.1 behind the Umans route. Well-suited for text-first coding workflows.
Benchmark Comparison
We believe in transparency. Select a benchmark below to see how our served models compare across different capabilities. Our open-weight models match (and often beat) frontier closed labs on agentic coding and research.
Agentic software engineering on harder, more realistic tasks. Higher is better.
April 2026🚀 Agentic coding leader: On SWE-Bench Pro, Kimi K2.6 (58.6) and GLM-5.1 (58.4) top GPT-5.4 (57.7), Gemini 3.1 Pro (54.2), and Claude Opus 4.6 (53.4). Open-weight models beating frontier closed labs.
🧠 Agentic research leader: On DeepSearchQA, Kimi K2.6 reaches 92.5 f1, ahead of Claude Opus 4.6 (91.3), Gemini 3.1 Pro (81.9), and GPT-5.4 (78.6).
Sources: Moonshot (Kimi K2.6), Z.ai (GLM-5.1), Anthropic (Opus 4.6), SWE-Bench
Note: Scores come from each provider's published benchmarks. Peer numbers (GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, Kimi K2.5) use the Moonshot harness, which matches Anthropic's own self-report for Opus 4.6 on SWE-Bench Pro (53.4) and HLE w/ tools (53.0). Kimi K2.6 and Kimi K2.5 are Moonshot-reported; GLM-5.1 is Z.ai-reported (BrowseComp figure uses Z.ai's "w/ Context Manage" setup). Different providers use slightly different evaluation harnesses, so numbers can vary across sources.
Model Information API
For programmatic access to current model information, including context windows, pricing, and capabilities:
curl https://api.code.umans.ai/v1/models/info | jq

This public endpoint returns up-to-date information about all available models, their capabilities, and current pricing.
Web Search
When your agent needs to look something up, the gateway runs the search itself and hands the results back to the model. These calls show up in your usage breakdown next to model requests, with the chosen backend in the Served column.
Backends
| Backend | How it works | Pick when |
|---|---|---|
| native | Kimi-powered search built into the gateway. No third-party traffic leaves our infrastructure. | You want the leanest, lowest-latency path and trust the model-native search quality. |
| exa | Exa-backed search routed through the gateway. Results come back in the same shape the model expects. | You want richer, neural-ranked results for harder lookups (changelogs, niche topics, recent docs). |
Choosing a backend
Set it once per Claude Code session via the CLI:
umans claude --websearch native # Kimi-backed
umans claude --websearch exa # Exa-backed

Or set it per request when calling the API directly:
X-Umans-Websearch-Provider: native
X-Umans-Websearch-Provider: exa

If neither is set, the gateway uses its environment default. The override only takes effect on requests that actually carry a web search tool, and only on routes where the gateway owns the search step (some upstream models run search themselves, in which case the header is moot).
What we record
Only usage metrics: the bucket timestamp, requested model alias, served backend, route, status, and token counts. We do not store the search query, the results, or anything from your conversation. The gateway also does not log per-call payloads in production logs.
Troubleshooting
CLI Issues
"Command not found: umans"
- Ensure ~/.local/bin or /usr/local/bin is in your PATH
- Run source ~/.bashrc or source ~/.zshrc after installation
"Authentication failed"
- Run umans logout to clear saved credentials
- Run umans claude again to re-authenticate
Browser does not open
- Copy the URL shown in the terminal and open it manually
- The CLI displays a localhost callback URL - authentication will complete when you visit the URL
Connection Issues
"401 Unauthorized"
- Your API key may be expired or revoked
- Generate a new key in the Dashboard
"Rate limit exceeded"
- You have hit your plan's usage limits
- Check your usage in the Dashboard or upgrade your plan
Streaming interruptions
- For long-running sessions, some networks may drop idle connections
- Check your network stability or try a wired connection
Windows-Specific
The Umans CLI is not yet available for Windows. Use the manual configuration method with your preferred tool:
- Set environment variables in PowerShell:
$env:ANTHROPIC_BASE_URL="https://api.code.umans.ai"
$env:ANTHROPIC_AUTH_TOKEN="sk-your-umans-api-key"
- Or configure directly in your tool's settings using the manual configuration values above
FAQ
What models does Umans Code use?
Umans Code serves the best open-source models available. We do the hard work of evaluating and selecting so you don't have to. Currently:
- umans-coder — Our recommended default. We continuously evaluate and route to what works best (today: Kimi K2.5)
- umans-kimi-k2.5 — Explicitly choose native Kimi K2.5 for vision-heavy workflows and document understanding
- umans-kimi-k2.6 — Explicitly choose native Kimi K2.6, ahead of its promotion to default
- umans-glm-5.1 — GLM 5.1 behind the Umans route, well-suited for text-first coding workflows
We publish our model evaluations and reviews at blog.umans.ai.
How can you offer this pricing sustainably?
Fair question. We're not reselling API calls or burning cash on each request. A few choices, taken together, make the economics work:
- We run inference on our own GPU infrastructure. No per-token margin paid upstream to a frontier lab.
- We serve open-weight models, selectively. Just the best coding models (Kimi, GLM), and only the ones with architectures that scale efficiently. We don't serve every model on the market.
- We tune for agent workloads, not high-interactivity chat. Long-running sessions have a different shape; our SLOs reflect that and let us serve more developers per GPU.
- We accept thinner margins than the frontier labs. SemiAnalysis has shown that in 2026 the closed labs are earning healthy inference margins. That's not where we choose to sit.
Openness is also a principle, not just a cost lever: we believe the best coding models will be open-weight, and building on them lets us pass savings on to developers.
Can I use my own Claude Code license?
Yes. If you have a Claude Code subscription with Anthropic, run claude to use Claude Code with your Anthropic plan. Run umans claude when you want Claude Code powered by Umans (the best open-source models with ultra-generous usage limits). Switch between them anytime.
Is my data secure?
Your code and conversations are processed through our infrastructure. We do not train on your data. Enterprise customers can opt for self-hosted deployments where all data remains within their infrastructure.
What happens if I hit my usage limit?
The API will return a rate limit error. You can monitor your usage in the Dashboard and upgrade your plan if needed. Limits reset according to your billing cycle.
Can I use the same API key for multiple machines?
Yes, across your own machines. The key must not be shared with other people, including teammates or shared CI runners. Usage counts against your plan's limits across all your machines, so the $50 plan with parallel sessions is the right fit if you run from several boxes. For teammates or automations, see team seats and service accounts.
Support
Need help?