Umans Code User Guide
Keep your agent working. All day.
The best open-source coding models with ultra-generous usage limits. Works with any tool where you can set a base URL and API key: Claude Code, Cursor, Zed, Copilot, OpenCode, Kilo Code, Pi, and more. Currently serving Kimi K2.6 and GLM 5.1.
We run frontier open-weight models on our own GPU infrastructure. Same SOTA coding quality as the closed labs, without the lock-in. Switch in a minute. We don't train on your data.
We publish our model tests and reviews at blog.umans.ai.
Quick Start (Recommended)
The fastest way to get started is with the Umans CLI. It handles authentication and launches Claude Code with zero configuration.
macOS & Linux
# Install the CLI (one-time)
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Launch Claude Code with Umans backend
umans claude

Video Demo: CLI with Claude Code walkthrough
First run: The CLI opens your browser for authentication. Log in to your Umans account, and the CLI automatically receives your API key. Claude Code launches immediately with the Umans backend configured.
Subsequent runs: umans claude launches instantly using your saved credentials.
CLI Commands
umans claude # Launch Claude Code (default: umans-coder)
umans claude --model umans-kimi-k2.6 # Use native Kimi K2.6 (next default after validation)
umans claude --model umans-glm-5.1 # Use GLM 5.1 with vision handoff
umans claude --websearch native # Gateway websearch: Kimi-backed path
umans claude --websearch exa # Gateway websearch: Exa-backed path
umans opencode # Launch OpenCode with Umans backend
umans opencode --model umans-glm-5.1 # Use GLM 5.1 on OpenCode
umans status # Check authentication status
umans logout # Remove saved credentials
umans --help # Show all available commands

Manual Configuration (Alternative)
If the CLI does not work for your setup (Windows users, custom environments) or you prefer to configure tools manually, use these settings:
API Endpoint
| Setting | Value |
|---|---|
| Base URL | https://api.code.umans.ai |
| Anthropic Endpoint | https://api.code.umans.ai/v1/messages |
| OpenAI Endpoint | https://api.code.umans.ai/v1/chat/completions |
| Model Name | umans-coder |
Getting Your API Key
- Log in to app.umans.ai/billing
- Go to your Dashboard → API Keys
- Generate a new key (shown only once - copy it immediately)
Tool-Specific Setup
Claude Code Official Docs →
Using the CLI (Recommended):
umans claude # Default: umans-coder (Kimi K2.5)
umans claude --model umans-kimi-k2.6 # Use native Kimi K2.6 (next default after validation)
umans claude --model umans-glm-5.1 # GLM 5.1 with vision handoff
umans claude --websearch native # Gateway websearch: Kimi-backed path
umans claude --websearch exa # Gateway websearch: Exa-backed path

--websearch selects the backend for gateway-owned websearch paths: native for the Kimi-backed search path, exa for the Exa-backed path. Applies only to umans claude.
Available Models:
| Model | Provider | Capabilities | Best For |
|---|---|---|---|
| umans-coder | Kimi K2.5* | Text, Vision, WebSearch | Default — we choose the best for you |
| umans-kimi-k2.5 | Kimi K2.5 | Text, Vision, WebSearch | When you specifically want Kimi |
| umans-kimi-k2.6 | Kimi K2.6 | Text, Vision, WebSearch | When you want the latest Kimi; becomes the default after validation |
| umans-glm-5.1 | GLM 5.1 | Text, Vision (handoff), WebSearch | GLM-style coding and text-first workflows |
* Today, umans-coder routes to Kimi K2.5. This may change as we continuously evaluate models. See our model selection methodology at blog.umans.ai.
Manual configuration:
export ANTHROPIC_BASE_URL=https://api.code.umans.ai
export ANTHROPIC_AUTH_TOKEN=sk-your-umans-api-key
claude --model umans-coder

OpenCode Official Docs →
Using the CLI (Recommended):
umans opencode # Default: umans-coder
umans opencode --model umans-kimi-k2.5 # Use native Kimi K2.5
umans opencode --model umans-kimi-k2.6 # Use native Kimi K2.6
umans opencode --model umans-glm-5.1 # Use GLM 5.1

umans opencode --setup writes the Umans provider config into OpenCode's global config (~/.config/opencode/opencode.json, or $XDG_CONFIG_HOME/opencode/opencode.json if set), so OpenCode Desktop picks up the same models as the CLI.
Manual configuration (add to ~/.config/opencode/opencode.json):
{
"$schema": "https://opencode.ai/config.json",
"model": "umans/umans-coder",
"provider": {
"umans": {
"npm": "@ai-sdk/openai-compatible",
"name": "Umans AI",
"options": {
"baseURL": "https://api.code.umans.ai/v1",
"apiKey": "sk-your-umans-api-key"
},
"models": {
"umans-coder": {
"id": "umans-coder",
"name": "Umans Coder",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-kimi-k2.5": {
"id": "umans-kimi-k2.5",
"name": "Umans Kimi K2.5",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-kimi-k2.6": {
"id": "umans-kimi-k2.6",
"name": "Umans Kimi K2.6",
"modalities": { "input": ["text", "image"], "output": ["text"] }
},
"umans-glm-5.1": {
"id": "umans-glm-5.1",
"name": "Umans GLM 5.1",
"modalities": { "input": ["text"], "output": ["text"] }
}
}
}
}
}

Cursor IDE Official Docs →
Video Demo: Setting up Cursor with Umans Code
- Open Cursor Settings → Models
- Enable Override OpenAI Base URL
- Set the base URL to: https://api.code.umans.ai/v1
- Paste your Umans API key in the API key field
- Add one or more custom models: umans-coder (default), umans-kimi-k2.5, umans-kimi-k2.6, umans-glm-5.1 (text-only on this route)
- Select the model you want in the model dropdown
Zed Official Docs →
Video Demo: Setting up Zed with Umans Code
Zed supports custom OpenAI-compatible providers. Configure it with the same base URL and API key you'd use for any BYOK tool:
- Base URL: https://api.code.umans.ai/v1
- API Key: Your Umans API key
- Model: umans-coder, umans-kimi-k2.5, umans-kimi-k2.6, or umans-glm-5.1
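If you prefer editing settings.json directly, here is a sketch of the provider block. The field names follow Zed's openai_compatible provider schema as we understand it and may change between Zed releases; the max_tokens value is a placeholder. Verify against Zed's official docs before relying on it:

```json
{
  "language_models": {
    "openai_compatible": {
      "Umans": {
        "api_url": "https://api.code.umans.ai/v1",
        "available_models": [
          {
            "name": "umans-coder",
            "display_name": "Umans Coder",
            "max_tokens": 128000
          }
        ]
      }
    }
  }
}
```

Zed typically prompts for the API key in the Agent panel the first time you select the provider, rather than storing it in settings.json.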
Crush (Charm Bracelet) Official Docs →
Add to your Crush configuration (~/.config/crush/config.json):
{
"$schema": "https://charm.land/crush.json",
"providers": {
"umans": {
"type": "anthropic",
"base_url": "https://api.code.umans.ai",
"api_key": "sk-your-umans-api-key",
"models": [
{
"id": "umans-coder",
"name": "Umans Coder",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-kimi-k2.5",
"name": "Umans Kimi K2.5",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-kimi-k2.6",
"name": "Umans Kimi K2.6",
"default_max_tokens": 50000,
"can_reason": true
},
{
"id": "umans-glm-5.1",
"name": "Umans GLM 5.1",
"default_max_tokens": 50000,
"can_reason": true
}
]
}
}
}

Pi Extension →
Pi has a dedicated Umans provider extension that makes integration effortless. No manual base URL or env-var fiddling. Install it, sign in, paste your key, and you're done.
- Install the pi-provider-umans extension from the Pi package registry.
- Run /login in Pi.
- Choose your Umans subscription when prompted.
- Paste your Umans API key (from app.umans.ai/billing → API Keys).
- Voilà. Pi is wired to the Umans backend.
Big thanks to @karutoil for the pi-provider-umans extension and the lovely open-source collaboration. We're lucky to keep building this together. ❤️
Any BYOK Tool
Umans Code exposes both OpenAI-compatible and Anthropic-compatible endpoints. If your tool lets you set a custom base URL and API key, it works. Configure with:
- Base URL: https://api.code.umans.ai/v1 (OpenAI-compatible) or https://api.code.umans.ai (Anthropic-compatible)
- API Key: Your Umans API key (starts with sk-)
- Model: umans-coder (default), umans-kimi-k2.5, umans-kimi-k2.6, or umans-glm-5.1
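As a concrete sketch, any OpenAI-compatible client can reach the gateway with nothing but the Python standard library. The endpoint and model name come from the values above; the UMANS_API_KEY environment variable is a convention for this example only:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("UMANS_API_KEY", "sk-your-umans-api-key")

def chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style Chat Completions request for the Umans gateway."""
    payload = {
        "model": "umans-coder",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        "https://api.code.umans.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = chat_request("Hello!")
if os.environ.get("UMANS_API_KEY"):  # only send when a real key is configured
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Swapping the URL for https://api.code.umans.ai/v1/messages (with x-api-key and anthropic-version headers) gives you the Anthropic-compatible equivalent.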
Desktop App with Umans
Prefer a GUI over the terminal? Try OpenCode Desktop. It is open-source and talks to any OpenAI-compatible backend, which is the kind of tool we enjoy working with.
The models we serve are open-weight too, so your whole stack stays open and auditable. Nothing here is designed to lock you in.
One-command setup
The fastest path: run the CLI setup once and OpenCode Desktop picks up the Umans provider automatically.
# Install the Umans CLI (one-time)
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Write the Umans provider into OpenCode's global config
umans opencode --setup

This writes to ~/.config/opencode/opencode.json (or $XDG_CONFIG_HOME/opencode/opencode.json if set). Launch OpenCode Desktop and select any umans/* model from the model picker.
Manual configuration
If you'd rather edit the config file yourself, use the JSON snippet from the OpenCode setup section above.
Code from Your Phone
Keep a long-running agent working on your main machine and drive it from your phone over a secure private network. The stack is four pieces that each do one thing well:
- tmux: persistent terminal sessions that survive disconnects.
- Tailscale: zero-config mesh VPN; your machine gets a private 100.x.x.x IP reachable only from your own devices. No port-forwarding.
- Termius: mobile SSH client with proper key handling and a touchscreen-friendly terminal.
- umans claude: your agent, running inside the tmux session.
Based on Emre Işık's walkthrough: Code from your phone like a boss.
1. Prepare your machine (macOS / Linux)
# Install tmux, Tailscale, and the Umans CLI
brew install tmux tailscale
curl -fsSL https://api.code.umans.ai/cli/install.sh | bash
# Sign in to Tailscale and note your IP (100.x.x.x)
sudo tailscale up
tailscale ip -4

Sane tmux defaults (optional):
cat > ~/.tmux.conf <<'EOF'
set -s escape-time 1
set -g mouse on
set -g default-terminal "screen-256color"
set -g history-limit 10000
set -g base-index 1
setw -g pane-base-index 1
EOF

2. Enable SSH
- macOS: System Settings → General → Sharing → enable Remote Login.
- Linux: sudo systemctl enable --now ssh.
- Verify locally: ssh yourusername@localhost.
3. Add a reattach-or-create helper
Add this to ~/.zshrc (or ~/.bashrc) so one command always drops you back into the same session:
function umans-tmux() {
if tmux has-session -t umans 2>/dev/null; then
tmux attach -t umans
else
tmux new -s umans 'umans claude'
fi
}

4. Connect from your phone
- Install the Tailscale app on your phone and sign in with the same account.
- Install Termius (or any SSH client you prefer).
- Add a new host in Termius:
  - Host: your Tailscale IP (100.x.x.x)
  - Username: your machine username
  - Port: 22
- Connect, then run umans-tmux.
5. Survive every disconnect
- Ctrl+b then d detaches the session. The agent keeps running.
- Reconnect later from anywhere and run umans-tmux again to reattach exactly where you left off.
- tmux list-sessions shows every session, including detached ones.
Personal Assistants (Telegram, WhatsApp, Discord)
Drive Umans from the chat apps you already live in. Tools like AutoClaw expose Telegram, WhatsApp, and Discord channels that you can wire up to any OpenAI-compatible model, including Umans.
Heads up: these assistants run on your machine and act on messages you send (or receive). Review the permissions you grant to each channel carefully: anything the assistant can do, the bridged chat can trigger.
Configure Umans as the model
Open the app's Settings → Models & API, click Add Model, and fill in:
| Field | Value |
|---|---|
| Provider | Custom |
| Model ID | umans-coder |
| Display Name | umans-coder |
| API Key | sk-your-umans-api-key |
| API Protocol | OpenAI |
| Base URL | https://api.code.umans.ai/v1 |
Click Connectivity Test to verify the endpoint, then Save. Any Umans model works here: swap umans-coder for umans-kimi-k2.6 or umans-glm-5.1 if you prefer.
Connect a channel
Follow the app's own instructions to connect Telegram, WhatsApp, or Discord. Once a channel is linked and Umans is selected as the model, messages you send in that channel are handled by the assistant running locally on your machine.
API Reference
Anthropic-Compatible Endpoints
Umans Code implements the Anthropic Messages API.
POST /v1/messages
curl -N -X POST https://api.code.umans.ai/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: sk-your-umans-api-key" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "umans-coder",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 4096,
"stream": true
}'

OpenAI-Compatible Endpoints
Umans Code also implements the OpenAI Chat Completions API.
POST /v1/chat/completions
curl -N -X POST https://api.code.umans.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-umans-api-key" \
-d '{
"model": "umans-coder",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'

Models
Our Philosophy: We believe in serving the best open-source models available. We continuously evaluate and filter models to ensure your agents stay productive all day—without the decision fatigue of choosing between dozens of options.
Available Models
| Model | Base | Best For | Trade-off |
|---|---|---|---|
| umans-coder | Kimi K2.5* | Default — we choose the best for you | Routes to our top pick (may change over time) |
| umans-kimi-k2.5 | Kimi K2.5 | When you specifically want Kimi | Zero overhead, native multimodal |
| umans-kimi-k2.6 | Kimi K2.6 | When you want the latest Kimi | Becomes the default once validation is complete |
| umans-glm-5.1 | GLM 5.1 | GLM-style coding and text-first workflows | Vision via smart composition; OpenAI-compatible route stays text-only |
* Today, umans-coder routes to Kimi K2.5. This may change as we continuously evaluate models. Read more at blog.umans.ai.
How to Choose
- Use
umans-coder(default) to let us choose the best model for you. We continuously evaluate and select what works best for most coding tasks. Today this is Kimi K2.5. - Use
umans-kimi-k2.5when you specifically want native Kimi K2.5. Overall best experience. It leads benchmarks on vision-heavy workflows and document understanding tasks. - Use
umans-kimi-k2.6when you specifically want native Kimi K2.6. It becomes the default once validation is complete. - Use
umans-glm-5.1when you want GLM 5.1 behind the Umans route. Well-suited for text-first coding workflows.
Benchmark Comparison
We believe in transparency. Select a benchmark below to see how our served models compare across different capabilities. Our open-weight models match (and often beat) frontier closed labs on agentic coding and research.
Agentic software engineering on harder, more realistic tasks. Higher is better.
April 2026🚀 Agentic coding leader: On SWE-Bench Pro, Kimi K2.6 (58.6) and GLM-5.1 (58.4) top GPT-5.4 (57.7), Gemini 3.1 Pro (54.2), and Claude Opus 4.6 (53.4). Open-weight models beating frontier closed labs.
🧠 Agentic research leader: On DeepSearchQA, Kimi K2.6 reaches 92.5 f1, ahead of Claude Opus 4.6 (91.3), Gemini 3.1 Pro (81.9), and GPT-5.4 (78.6).
Sources: Moonshot (Kimi K2.6), Z.ai (GLM-5.1), Anthropic (Opus 4.6), SWE-Bench
Note: Scores come from each provider's published benchmarks. Peer numbers (GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, Kimi K2.5) use the Moonshot harness, which matches Anthropic's own self-report for Opus 4.6 on SWE-Bench Pro (53.4) and HLE w/ tools (53.0). Kimi K2.6 and Kimi K2.5 are Moonshot-reported; GLM-5.1 is Z.ai-reported (BrowseComp figure uses Z.ai's "w/ Context Manage" setup). Different providers use slightly different evaluation harnesses, so numbers can vary across sources.
Model Information API
For programmatic access to current model information, including context windows, pricing, and capabilities:
curl https://api.code.umans.ai/v1/models/info | jq

This public endpoint returns up-to-date information about all available models, their capabilities, and current pricing.
Web Search
When your agent needs to look something up, the gateway runs the search itself and hands the results back to the model. These calls show up in your usage breakdown next to model requests, with the chosen backend in the Served column.
Backends
| Backend | How it works | Pick when |
|---|---|---|
| native | Kimi-powered search built into the gateway. No third-party traffic leaves our infrastructure. | You want the leanest, lowest-latency path and trust the model-native search quality. |
| exa | Exa-backed search routed through the gateway. Results come back in the same shape the model expects. | You want richer, neural-ranked results for harder lookups (changelogs, niche topics, recent docs). |
Choosing a backend
Set it once per Claude Code session via the CLI:
umans claude --websearch native # Kimi-backed
umans claude --websearch exa # Exa-backed

Or set it per request when calling the API directly:
X-Umans-Websearch-Provider: native
X-Umans-Websearch-Provider: exa

If neither is set, the gateway uses its environment default. The override only takes effect on requests that actually carry a web search tool, and only on routes where the gateway owns the search step (some upstream models run search themselves, in which case the header is moot).
What we record
Only usage metrics: the bucket timestamp, requested model alias, served backend, route, status, and token counts. We do not store the search query, the results, or anything from your conversation. The gateway also does not log per-call payloads in production logs.
Troubleshooting
CLI Issues
"Command not found: umans"
- Ensure ~/.local/bin or /usr/local/bin is in your PATH
- Run source ~/.bashrc or source ~/.zshrc after installation
"Authentication failed"
- Run umans logout to clear saved credentials
- Run umans claude again to re-authenticate
Browser does not open
- Copy the URL shown in the terminal and open it manually
- The CLI displays a localhost callback URL - authentication will complete when you visit the URL
Connection Issues
"401 Unauthorized"
- Your API key may be expired or revoked
- Generate a new key in the Dashboard
"Rate limit exceeded"
- You have hit your plan's usage limits
- Check your usage in the Dashboard or upgrade your plan
Streaming interruptions
- For long-running sessions, some networks may drop idle connections
- Check your network stability or try a wired connection
Windows-Specific
The Umans CLI is not yet available for Windows. Use the manual configuration method with your preferred tool:
- Set environment variables in PowerShell:
$env:ANTHROPIC_BASE_URL="https://api.code.umans.ai"
$env:ANTHROPIC_AUTH_TOKEN="sk-your-umans-api-key"
- Or configure directly in your tool's settings using the manual configuration values above
FAQ
What models does Umans Code use?
Umans Code serves the best open-source models available. We do the hard work of evaluating and selecting so you don't have to. Currently:
- umans-coder — Our recommended default. We continuously evaluate and route to what works best (today: Kimi K2.5)
- umans-kimi-k2.5 — Explicitly choose native Kimi K2.5 for vision-heavy workflows and document understanding
- umans-kimi-k2.6 — Explicitly choose native Kimi K2.6, ahead of its promotion to default
- umans-glm-5.1 — GLM 5.1 behind the Umans route, well-suited for text-first coding workflows
We publish our model evaluations and reviews at blog.umans.ai.
How can you offer this pricing sustainably?
Fair question. We're not reselling API calls or burning cash on each request. A few choices, taken together, make the economics work:
- We run inference on our own GPU infrastructure. No per-token margin paid upstream to a frontier lab.
- We serve open-weight models, selectively. Just the best coding models (Kimi, GLM), and only the ones with architectures that scale efficiently. We don't serve every model on the market.
- We tune for agent workloads, not high-interactivity chat. Long-running sessions have a different shape; our SLOs reflect that and let us serve more developers per GPU.
- We accept thinner margins than the frontier labs. SemiAnalysis has shown that in 2026 the closed labs are earning healthy inference margins. That's not where we choose to sit.
Openness is also a principle, not just a cost lever: we believe the best coding models will be open-weight, and building on them lets us pass savings on to developers.
Can I use my own Claude Code license?
Yes. If you have a Claude Code subscription with Anthropic, run claude to use Claude Code with your Anthropic plan. Run umans claude when you want Claude Code powered by Umans (the best open-source models with ultra-generous usage limits). Switch between them anytime.
Is my data secure?
Your code and conversations are processed through our infrastructure. We do not train on your data. Enterprise customers can opt for self-hosted deployments where all data remains within their infrastructure.
What happens if I hit my usage limit?
The API will return a rate limit error. You can monitor your usage in the Dashboard and upgrade your plan if needed. Limits reset according to your billing cycle.
Can I use the same API key for multiple machines?
Yes, across your own machines. The key must not be shared with other people, including teammates or shared CI runners. Usage counts against your plan's limits across all your machines, so the $50 plan with parallel sessions is the right fit if you run from several boxes. For teammates or automations, see team seats and service accounts.
Support
Need help?