# Billing

Source: https://agent37.com/docs/agents-api/billing

Prepaid and managed in the dashboard: one workspace wallet funds compute and managed usage.

Billing is prepaid. You top up one workspace wallet in the dashboard, and everything draws from it: instance compute and managed usage.

## How the balance works

* **\$1 free to start.** Your first visit to the dashboard grants a one-time \$1 credit, enough to run your first instance for days. Until your first real top-up, the workspace runs one instance of any template, up to the 2 vCPU / 4 GB shape; a larger shape returns `403 tier_limit`. Your first top-up unlocks the 4/8 and 8/16 shapes, and up to 10 instances; once your top-ups total \$500 the cap rises to 50. See [Instance limits](#instance-limits).
* **Top up in the dashboard.** Add funds at [dashboard/cloud/billing](https://www.agent37.com/dashboard/cloud/billing). Top-ups are \$5 to \$1000 each, paid through Stripe.
* **Or top up automatically.** Set a rule once and the wallet refills itself; see [Automatic top-up](#automatic-top-up).
* **Instances draw it down.** Compute bills hourly, prepaid one day at a time, for as long as the instance exists, stopped or running.
* **Managed usage draws it down.** Managed LLM, Brave search, and Composio calls are metered at cost against the same wallet, gated by each instance's [budget](/docs/agents-api/budgets).
* **Delete to stop billing.** Deleting an instance refunds the unused remainder of its prepaid day, prorated to the hour.

There are no balance or billing endpoints on `/v1`. The wallet, top-ups, and the ledger are managed entirely in the [dashboard](https://www.agent37.com/dashboard/cloud/billing).
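The instance tiers described in the first bullet reduce to a small lookup. The sketch below is only an illustration: `instance_cap` and its argument are hypothetical names, and it assumes you track the workspace's lifetime top-up total yourself, since the API exposes no balance or billing endpoints. The platform enforces the real limits.

```python
def instance_cap(topups_total_usd: float) -> int:
    """Return the concurrent-instance cap for a lifetime top-up total.

    Hypothetical helper that mirrors the documented tiers; it is not
    part of the Agent37 API.
    """
    if topups_total_usd >= 500:
        return 50  # top-ups total $500 or more
    if topups_total_usd > 0:
        return 10  # topped up at least once, any amount
    return 1       # free tier, before the first top-up


print(instance_cap(0), instance_cap(5), instance_cap(500))  # 1 10 50
```

A check like this can be useful before a bulk create, so that a `403 tier_limit` does not come as a surprise.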
## Instance limits

How many instances a workspace can run at once depends only on how much it has topped up:

| Workspace                            | Instances                        |
| ------------------------------------ | -------------------------------- |
| Free (before your first top-up)      | 1, up to the 2 vCPU / 4 GB shape |
| Topped up at least once (any amount) | 10, all shapes                   |
| Top-ups total \$500 or more          | 50, all shapes                   |

The cap rises on its own: your first top-up lifts the workspace to 10 instances, and once your top-ups total \$500 it becomes 50, with nothing to request or configure. Need more than 50? Email [vishnu@agent37.com](mailto:vishnu@agent37.com) or use the chat bubble in the dashboard.

These are caps on how many instances you can run at once, not a balance requirement: each instance simply [bills its own compute](#the-daily-cycle) from the wallet (a day at create, then a day per renewal), with no extra reserve for holding several.

## Compute pricing

Compute is priced from the instance's `resources`:

| Resource | Rate                      |
| -------- | ------------------------- |
| vCPU     | \$0.80 per vCPU per month |
| RAM      | \$0.70 per GB per month   |
| Disk     | \$0.09 per GB per month   |

Applied to the shapes at their default disk:

| Shape                                                             | Monthly price at the default disk |
| ----------------------------------------------------------------- | --------------------------------- |
| 1 vCPU / 3 GB RAM / 6-20 GB disk (`agent37-hermes-small` only)    | \$3.44                            |
| 2 vCPU / 4 GB RAM / 6-20 GB disk (default for standard templates) | \$4.94                            |
| 4 vCPU / 8 GB RAM / 20-40 GB disk                                 | \$10.60                           |
| 8 vCPU / 16 GB RAM / 40-80 GB disk                                | \$21.20                           |

Disk is any whole number of GB within the shape's range and defaults to the range minimum; disk above the minimum adds \$0.09 per GB per month. See [Instances](/docs/agents-api/instances) for how to pick a shape with `resources` on create.

## The daily cycle

The monthly price is the rate; the wallet is debited one day at a time (the hourly rate, monthly / 730, times 24).
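As a worked example of that arithmetic, the sketch below recomputes the monthly price and the one-day debit for the default 2 vCPU / 4 GB RAM / 6 GB disk shape. The function names are illustrative, and the rounding shown is an assumption for display only; the platform's own rounding may differ.

```python
HOURS_PER_MONTH = 730  # divisor the docs use to derive the hourly rate

# Monthly rates in USD, taken from the pricing table.
VCPU_RATE = 0.80
RAM_GB_RATE = 0.70
DISK_GB_RATE = 0.09


def monthly_price(vcpu: int, ram_gb: int, disk_gb: int) -> float:
    """Monthly compute price for a shape, in USD."""
    return vcpu * VCPU_RATE + ram_gb * RAM_GB_RATE + disk_gb * DISK_GB_RATE


def daily_debit(monthly: float) -> float:
    """One prepaid day: the hourly rate (monthly / 730) times 24."""
    return monthly / HOURS_PER_MONTH * 24


price = monthly_price(vcpu=2, ram_gb=4, disk_gb=6)
print(f"{price:.2f}")               # 4.94
print(f"{daily_debit(price):.4f}")  # 0.1624
```

The same function reproduces the other rows of the shape table, for example `monthly_price(8, 16, 40)` for the largest shape.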
**Create debits the first day.** `POST /v1/instances` charges one day of compute up front. The debit is the create gate: if the wallet cannot cover it, the create returns `402 insufficient_balance` and nothing is provisioned.

```json
{
  "error": {
    "code": "insufficient_balance",
    "message": "This instance costs $0.0068 per hour, billed one day in advance ($0.1624). Add balance to your workspace and try again."
  }
}
```

If provisioning fails after the debit, for example a custom template's image cannot be pulled, the day is refunded in full.

**Each instance bills on its own.** Every instance you run debits its own day at create and then a day per renewal; there is no extra balance reserve for running several at once. How many you can run at once is set by your [instance limit](#instance-limits), not by your balance.

**Renewal every 24 hours.** Each renewal extends the paid period by exactly 24 hours from the previous `paid_through` and debits one more day, on the instance's own schedule.

**Stopped instances still bill.** An instance bills as long as it exists. Stopping releases its CPU and RAM but keeps its disk and host placement, so a stopped instance renews like a running one. Delete it to stop billing.

**Delete refunds the remainder.** Deleting an instance refunds the unused part of the prepaid day, prorated, with a minimum of one hour billed.

## Automatic top-up

Without automatic top-up, instances are stopped when the wallet runs out. Set a rule at [dashboard/cloud/billing](https://www.agent37.com/dashboard/cloud/billing) to keep them running: when the balance falls below your threshold, your card is charged for a fixed amount. The rule is checked every hour, and once more right before any instance would be suspended, so instances stay up as long as the card charges.

* Any manual top-up saves its card, and automatic top-up charges the most recently saved one. Add funds once before enabling the rule.
* The threshold can be \$1 to \$1000. The purchase amount follows the usual \$5 to \$1000 top-up range.
* If a charge fails, automatic top-up turns itself off and emails the workspace admins. Add funds manually, which saves a new card, then re-enable it.

## Paid through and past due

Every instance object reports its billing state in two fields:

`paid_through`: Unix timestamp in epoch seconds. Compute is paid through this moment; each renewal extends it by 24 hours.

`past_due`: `true` when a renewal could not be covered and the instance was force-stopped. Cleared by a funded start.

```json
{
  "id": "ab12cd34ef",
  "status": "stopped",
  "paid_through": 1783641600,
  "past_due": true
}
```

If a renewal cannot be covered by the wallet, and [automatic top-up](#automatic-top-up) is off or could not charge, the instance is force-stopped, flagged `past_due: true`, and the workspace admins are emailed. Top up the wallet, then `POST /v1/instances/{id}/start`: a funded start re-debits a day anchored at the start time, so suspended time is never billed, and clears the flag. A start without funds returns `402 insufficient_balance`.

Branch on the error `code`, not the message. See [Errors](/docs/agents-api/errors) for the full list and the response shape.

## Managed usage

Managed LLM, Brave search, and Composio calls debit the same wallet at cost, but only within each instance's budget: a monthly cap that resets each UTC month plus optional one-time top-up headroom. The default cap is \$0, so an instance spends nothing on managed services until you raise its cap or top it up.

When a managed call is refused, the 402 reason tells you which pot ran dry: `insufficient_balance` means the wallet is empty, `instance_budget_exhausted` means the wallet has funds but the instance hit its cap. See [Budgets](/docs/agents-api/budgets) for the endpoints, rates, and both 402 reasons.
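The two 402 reasons call for different fixes, so client code usually branches on the reason string. A minimal sketch, assuming you have already extracted the reason from the refusal; `remedy_for` and the labels it returns are hypothetical names for illustration, not part of the API.

```python
def remedy_for(reason: str) -> str:
    """Map a managed-call 402 reason to the action that clears it.

    The returned labels are illustrative placeholders for whatever
    your application does next.
    """
    if reason == "insufficient_balance":
        # The workspace wallet is empty: add funds in the dashboard.
        return "top_up_wallet"
    if reason == "instance_budget_exhausted":
        # The wallet has funds, but this instance hit its budget:
        # raise the monthly cap or add one-time headroom.
        return "raise_instance_budget"
    # Any other value is outside the two documented reasons.
    return "unknown"


print(remedy_for("instance_budget_exhausted"))  # raise_instance_budget
```

Matching on the exact reason string, rather than on message text, follows the same advice the page gives for error codes.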
## Ledger and spend visibility

Every wallet movement is recorded: top-ups, the signup credit, day debits, and prorated refunds land in an append-only ledger, while managed usage is metered per call into its own usage ledger and shown as a per-instance spend breakdown.

Review the ledger, your balance, and the spend breakdown at [dashboard/cloud/billing](https://www.agent37.com/dashboard/cloud/billing). For managed spend on a single instance, `GET /v1/instances/{id}/usage` returns a monthly rollup; compute prepay is dashboard-only and never appears there.

# Managed services & budgets

Source: https://agent37.com/docs/agents-api/budgets

Every instance ships with managed LLM, Brave search, and Composio credentials: metered at cost, gated by a per-instance budget you control.

Every instance is created with managed credentials for three services: LLM calls, web search (Brave), and app integrations (Composio). The agent uses them out of the box. Managed calls route and meter through Agent37, so there are no provider or integration keys for you to manage. Each managed call is metered at cost against your workspace wallet, and a per-instance budget caps how much each instance can spend.

## Rates

| Service            | Rate                                         |
| ------------------ | -------------------------------------------- |
| Managed LLM        | Provider cost, passed through with no markup |
| Web search (Brave) | \$0.005 per call (5,000 micros)              |
| Composio           | \$0.000114 per call (114 micros)             |

All money fields are integer micros: USD x 1,000,000, so \$1.00 is `1000000`.

Managed spend debits the workspace wallet, the same wallet that pays for compute. Top it up at [agent37.com/dashboard/cloud/billing](https://www.agent37.com/dashboard/cloud/billing).

## How budgets work

A budget has two parts:

* **Monthly cap.** A spending ceiling that resets at the start of each UTC month. It defaults to \$0, so managed calls are refused until you grant headroom.
* **One-time credit.** A ceiling that persists across months and is consumed only after the monthly portion is exhausted.

Budget figures are ceilings, not money. The workspace wallet is the only pot of dollars; raising a cap or topping up an instance moves no funds. The sum of caps across your instances can exceed the wallet balance, which is fine: caps bound each instance, the wallet bounds the total.

## Set a budget at create

Pass `budget` in the create body to grant headroom from the first call.

`monthly_cap_micros`: Monthly managed-spend cap in micros. Resets each UTC month.

`credit_micros`: One-time headroom in micros. Persists until consumed.

```bash
curl -X POST https://api.agent37.com/v1/instances \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "research-bot",
    "budget": { "monthly_cap_micros": 5000000, "credit_micros": 1000000 }
  }'
```

```python
import requests

H = {"Authorization": "Bearer sk_live_...", "Content-Type": "application/json"}

instance = requests.post(
    "https://api.agent37.com/v1/instances",
    headers=H,
    json={
        "name": "research-bot",
        "budget": {"monthly_cap_micros": 5000000, "credit_micros": 1000000},
    },
).json()
```

```javascript
const H = {
  "Authorization": "Bearer sk_live_...",
  "Content-Type": "application/json",
};

const instance = await (await fetch("https://api.agent37.com/v1/instances", {
  method: "POST",
  headers: H,
  body: JSON.stringify({
    name: "research-bot",
    budget: { monthly_cap_micros: 5000000, credit_micros: 1000000 },
  }),
})).json();
```

## Endpoints

| Method  | Path                               | Returns                                               |
| ------- | ---------------------------------- | ----------------------------------------------------- |
| `GET`   | `/v1/instances/{id}/budget`        | `200` the budget object                               |
| `PATCH` | `/v1/instances/{id}/budget`        | `200` the updated budget object                       |
| `POST`  | `/v1/instances/{id}/budget/top-up` | `200` the updated budget object                       |
| `GET`   | `/v1/instances/{id}/usage`         | `200` `{ period, total_micros, by_integration }`      |

All three budget endpoints return the same budget object.

## The budget object

`monthly_cap_micros`: The monthly spending ceiling.

`monthly_consumed_micros`: Managed spend counted against the cap this month.

`monthly_remaining_micros`: `monthly_cap_micros` minus `monthly_consumed_micros`, floored at 0.

`monthly_period`: The UTC month the counters cover, formatted `YYYY-MM`.

`credit_remaining_micros`: One-time headroom left. Consumed only after the monthly portion is exhausted.

`updated_at`: Epoch seconds of the last budget write. The budget is first written when the instance is created; cap changes, top-ups, and managed spend all update it.

```bash
curl https://api.agent37.com/v1/instances/ab12cd34ef/budget \
  -H "Authorization: Bearer sk_live_..."
```

```json
{
  "monthly_cap_micros": 5000000,
  "monthly_consumed_micros": 412380,
  "monthly_remaining_micros": 4587620,
  "monthly_period": "2026-06",
  "credit_remaining_micros": 1000000,
  "updated_at": 1781136000
}
```

## Set the monthly cap

`PATCH /v1/instances/{id}/budget` sets the cap for the current and future months. It takes effect immediately.

`monthly_cap_micros`: The new monthly cap in micros. Must be a non-negative integer.

```bash
curl -X PATCH https://api.agent37.com/v1/instances/ab12cd34ef/budget \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "monthly_cap_micros": 20000000 }'
```

```python
# H is the headers dict from the create example above.
budget = requests.patch(
    "https://api.agent37.com/v1/instances/ab12cd34ef/budget",
    headers=H,
    json={"monthly_cap_micros": 20000000},
).json()
```

```javascript
// H is the headers object from the create example above.
const budget = await (await fetch(
  "https://api.agent37.com/v1/instances/ab12cd34ef/budget",
  {
    method: "PATCH",
    headers: H,
    body: JSON.stringify({ monthly_cap_micros: 20000000 }),
  },
)).json();
```

Setting the cap to `0` turns managed services off for the instance once any remaining top-up headroom is consumed.

## Add one-time headroom

`POST /v1/instances/{id}/budget/top-up` adds to `credit_remaining_micros`.
Use it for a burst of work you don't want to bake into the monthly cap.

`amount_micros`: Headroom to add, in micros. Must be a positive integer.

`idempotency_key`: Up to 64 characters matching `^[A-Za-z0-9_-]{1,64}$`. Repeating a request with the same key returns the current budget without adding again, so retries are safe.

```bash
curl -X POST https://api.agent37.com/v1/instances/ab12cd34ef/budget/top-up \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "amount_micros": 10000000, "idempotency_key": "june-burst-1" }'
```

```python
# H is the headers dict from the create example above.
budget = requests.post(
    "https://api.agent37.com/v1/instances/ab12cd34ef/budget/top-up",
    headers=H,
    json={"amount_micros": 10000000, "idempotency_key": "june-burst-1"},
).json()
```

```javascript
// H is the headers object from the create example above.
const budget = await (await fetch(
  "https://api.agent37.com/v1/instances/ab12cd34ef/budget/top-up",
  {
    method: "POST",
    headers: H,
    body: JSON.stringify({
      amount_micros: 10000000,
      idempotency_key: "june-burst-1",
    }),
  },
)).json();
```

## Read managed spend

`GET /v1/instances/{id}/usage?month=YYYY-MM` returns the instance's managed-spend rollup for one UTC month. Omit `month` for the current month; an invalid value returns 400.

`period`: The UTC month covered, formatted `YYYY-MM`.

`total_micros`: Total managed spend for the month.

`by_integration`: Per-service breakdown. `llm` carries `cost_micros`, `calls`, `input_tokens`, and `output_tokens`; `brave` and `composio` carry `cost_micros` and `calls`.

```bash
curl "https://api.agent37.com/v1/instances/ab12cd34ef/usage?month=2026-06" \
  -H "Authorization: Bearer sk_live_..."
```

```json
{
  "period": "2026-06",
  "total_micros": 412380,
  "by_integration": {
    "llm": {
      "cost_micros": 391582,
      "calls": 42,
      "input_tokens": 184032,
      "output_tokens": 96110
    },
    "brave": { "cost_micros": 20000, "calls": 4 },
    "composio": { "cost_micros": 798, "calls": 7 }
  }
}
```

Usage covers managed spend only.
Compute prepay never appears here; the full billing ledger lives in the dashboard, not on `/v1`. See [Billing](/docs/agents-api/billing).

## When a managed call is refused

A managed call that can't be covered fails with a 402 carrying one of two reasons. The instance keeps running either way; only managed calls are refused, and the refusal surfaces inside the instance on the call the agent was making, so the agent typically reports it in its reply.

| Reason                      | What happened                                           | Fix                                                                                                          |
| --------------------------- | ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| `insufficient_balance`      | The workspace wallet is empty.                          | Top up the wallet at [agent37.com/dashboard/cloud/billing](https://www.agent37.com/dashboard/cloud/billing). |
| `instance_budget_exhausted` | The wallet has funds, but this instance hit its budget. | Raise the monthly cap with `PATCH .../budget`, or add headroom with `POST .../budget/top-up`.                |

For the hosting API error catalog, see [Errors](/docs/agents-api/errors).

## Connecting apps

The agent connects apps two ways. In conversation, ask it to connect Gmail, Slack, Notion, or any other Composio-supported app, and it replies with an authorization link your user opens to grant access. Or drive the flow over the Hosting API; see [App integrations](/docs/agents-api/integrations).

## Bring your own keys

To run a service on your own account, put your own provider key inside the instance with [exec](/docs/agents-api/exec) or the agent's in-instance config. Calls made with your own keys go straight to the provider and never touch the managed meter or the budget.

# Send a message

Source: https://agent37.com/docs/agents-api/chat

The core call: POST /v1/responses on your instance URL runs a turn through your agent, streams or returns the result, and continues the thread.

`POST /v1/responses` is the core call.
You make it against your instance, not against `api.agent37.com`: every instance serves its own chat API at `https://{instanceId}.agent37.app`, the `url` of the default port in the create response. This page uses `https://ab12cd34ef.agent37.app`. Authenticate with the same `sk_live_` key you use on the hosting API.

The call is agentic by default: the agent can browse, run code, use a terminal, read and write files, call connected tools, and reason across many steps before answering.

## Request body

Request bodies are capped at 2 MB; anything larger returns `413 payload_too_large`.

`input`: The message or task, a plain string. There is no image field; to attach files, upload them first and list their paths in `files`. See [Sessions and models](/docs/agents-api/sessions) for how history carries across turns.

`files`: Paths of files on the instance to attach to this turn, typically the `path` returned by [`PUT /v1/files/content`](/docs/agents-api/files). Each must name an existing file on the instance, or the call returns `400 validation_error`. The paths are appended to the input, and the agent reads them from disk.

`session_id`: Continue an existing conversation. Omit it to start a new one; the response returns the new session's id. The harness owns sessions and creates one on first use, so an id it has not seen simply starts a fresh thread under that id rather than erroring.

`stream`: `true` returns a Server-Sent Events stream; `false` returns the finished response as one JSON body. See [Streaming](/docs/agents-api/streaming).

`model`: The LLM to run this turn on. Omit it to use the session's current model (the instance default on a new session). List what the instance can run with `GET /v1/models`; see [Sessions and models](/docs/agents-api/sessions).

`provider`: The model's provider, for example `anthropic`. Both `model` and `provider` are set per turn, and sending them on a continuation updates the session's stored pair for the turns that follow.
How hard the model thinks: `none`, `minimal`, `low`, `medium`, `high`, or `xhigh`.

`metadata`: Up to 16 key/value pairs, at most 64 KB serialized. Echoed back on the response object, never interpreted.

`agent`: Which agent harness runs the turn, `hermes` or `openclaw`. Omit it to use the instance's configured default (`hermes` on `agent37-hermes`). Routing is per request: the gateway keeps no session-to-agent binding, so if a session runs on a non-default harness, send `agent` on every turn of it. Targeting a harness the instance was not provisioned with returns `503 agent_unavailable`.

`chat` runs one turn and replies. `goal` is reserved: sending it returns `400 validation_error` today.

`instance_id` in the body is accepted and ignored. The URL names the instance: one gateway per instance, so there is nothing to route.

## Response

The response object. Ids are 32-character hex strings; timestamps from the gateway are epoch milliseconds.

`id`: The response id. Use it to reconnect or cancel the turn.

`session_id`: The conversation this turn belongs to. Reuse it on the next call to continue the thread.

`status`: `in_progress`, then a terminal `completed`, `failed`, or `cancelled`.

`agent`: The agent that ran the turn, `hermes` or `openclaw`.

`model`: The model the turn ran on, `null` when none was set.

`provider`: The model's provider, `null` when none was set.

`output_text`: The agent's final answer. Always a string, empty if the turn produced none.

`usage`: Token counts and cost for the turn: `{ input_tokens, output_tokens, cost_usd }`. `cost_usd` is absent or `null` when the provider did not report a cost.

`error`: Set when the turn failed: `{ code, message, param?, hint? }`. See [Errors](/docs/agents-api/errors).

`metadata`: Your request metadata, echoed back verbatim.

`created`: When the turn started, epoch milliseconds.

A failed turn does not reject the HTTP call. The POST still returns 200 with `status: "failed"` and `error` set. Branch on `status`, not on the HTTP code.
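Because a failed turn still arrives as HTTP 200, the check has to happen on the parsed body. A minimal sketch of that branch, working on the already-decoded response object; `turn_result` is a hypothetical helper name, and raising `RuntimeError` is just one way an application might surface the failure.

```python
def turn_result(response: dict) -> str:
    """Return the answer of a finished turn, or raise if it did not complete.

    Branches on the response object's `status` field rather than on the
    HTTP status code, since a failed turn is still delivered with 200.
    """
    status = response.get("status")
    if status == "completed":
        return response.get("output_text", "")
    if status == "failed":
        error = response.get("error") or {}
        raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
    if status == "cancelled":
        raise RuntimeError("turn was cancelled")
    # in_progress or anything unexpected: the turn has no final answer yet.
    raise RuntimeError(f"turn not finished, status={status!r}")


print(turn_result({"status": "completed", "output_text": "Memo: ..."}))  # Memo: ...
```

In a real client the dictionary would come from the decoded body of `POST /v1/responses`.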
On a non-streaming call the gateway sends the `200` and headers as soon as the turn starts, then keeps the connection alive with a whitespace tick every 25 seconds while the agent works. The JSON body arrives when the turn finishes, prefixed by that whitespace. Leading whitespace is valid JSON, so standard parsers handle it unchanged. Don't treat the early headers as the response being ready.

## Example

```bash
curl https://ab12cd34ef.agent37.app/v1/responses \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "input": "Research the top 3 EV makers, write a memo." }'
```

```python
import requests

r = requests.post(
    "https://ab12cd34ef.agent37.app/v1/responses",
    headers={
        "Authorization": "Bearer sk_live_...",
        "Content-Type": "application/json",
    },
    json={"input": "Research the top 3 EV makers, write a memo."},
).json()
```

```javascript
const r = await (await fetch("https://ab12cd34ef.agent37.app/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    input: "Research the top 3 EV makers, write a memo.",
  }),
})).json();
```

```json
{
  "id": "c91d2a7e84f04b6f9a3d5e1c0b87f4a2",
  "session_id": "7f3e0b6c52a949d2b1c4a8e9d0f31726",
  "status": "completed",
  "agent": "hermes",
  "model": null,
  "provider": null,
  "output_text": "Memo: the top 3 EV makers...",
  "usage": { "input_tokens": 1840, "output_tokens": 920, "cost_usd": 0.0137 },
  "error": null,
  "metadata": null,
  "created": 1781136000000
}
```

Set `stream: true` to receive Server-Sent Events as the agent reasons, calls tools, and writes its answer. The terminal event carries the final `output_text` and `usage`. See [Streaming](/docs/agents-api/streaming).

## Continue a conversation

The first message omits `session_id` and starts a session. The reply returns a `session_id`; pass it on the next message to continue the same thread.
The session holds the full history, so you never resend a transcript: you send only the new input.

```bash
# 1. start a session: omit session_id
curl https://ab12cd34ef.agent37.app/v1/responses \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "input": "Research the top 3 EV makers, write a memo." }'
# -> { "id": "c91d2a7e84f04b6f9a3d5e1c0b87f4a2",
#      "session_id": "7f3e0b6c52a949d2b1c4a8e9d0f31726",
#      "status": "completed", ... }

# 2. continue it: reuse the session_id, send only the new input
curl https://ab12cd34ef.agent37.app/v1/responses \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "7f3e0b6c52a949d2b1c4a8e9d0f31726",
    "input": "Make it shorter, add a quote."
  }'
```

```python
import requests

BASE = "https://ab12cd34ef.agent37.app"
H = {"Authorization": "Bearer sk_live_...", "Content-Type": "application/json"}

# 1. start a session: omit session_id
first = requests.post(
    f"{BASE}/v1/responses",
    headers=H,
    json={"input": "Research the top 3 EV makers, write a memo."},
).json()

# 2. continue it: reuse the session_id, send only the new input
requests.post(
    f"{BASE}/v1/responses",
    headers=H,
    json={
        "session_id": first["session_id"],
        "input": "Make it shorter, add a quote.",
    },
)
```

```javascript
const BASE = "https://ab12cd34ef.agent37.app";
const H = {
  Authorization: "Bearer sk_live_...",
  "Content-Type": "application/json",
};

// 1. start a session: omit session_id
const first = await (await fetch(`${BASE}/v1/responses`, {
  method: "POST",
  headers: H,
  body: JSON.stringify({
    input: "Research the top 3 EV makers, write a memo.",
  }),
})).json();

// 2. continue it: reuse the session_id, send only the new input
await fetch(`${BASE}/v1/responses`, {
  method: "POST",
  headers: H,
  body: JSON.stringify({
    session_id: first.session_id,
    input: "Make it shorter, add a quote.",
  }),
});
```

**One active turn per session.** A session runs one response at a time. Sending new input while one is in flight returns `409 session_busy`. Use another session, or cancel the running turn first.

To list a user's threads, read a thread's history, delete one, or pick a model, see [Sessions and models](/docs/agents-api/sessions).

## Follow up on a response

Every response has an id you can use after the call returns.

| Method | Path                        | Returns                                                     |
| ------ | --------------------------- | ----------------------------------------------------------- |
| `GET`  | `/v1/responses/{id}/stream` | `200` an SSE stream: replays all events, then attaches live |
| `POST` | `/v1/responses/{id}/cancel` | `200` the current response object                           |

`GET /v1/responses/{id}/stream` replays every event so far in order, then stays attached live, so a dropped connection never loses the answer, including after the turn has finished, while the response is still retained. See [Streaming](/docs/agents-api/streaming) for the replay window.

`POST /v1/responses/{id}/cancel` takes no body and stops a running turn, best effort. It returns 200 with the current response object. Cancelling a finished response is a no-op that returns its terminal state, still 200.

Cancel does not rewind. Whatever the agent has already done (files written, emails sent, tools called) is not undone. The response ends with `status: "cancelled"`.

## Status values

A response moves from `in_progress` to exactly one terminal status.

| Status        | Meaning                                               |
| ------------- | ----------------------------------------------------- |
| `in_progress` | The turn is running.                                  |
| `completed`   | The turn finished and `output_text` holds the answer. |
| `failed`      | The turn ended on an error; `error` says why.         |
| `cancelled`   | You stopped the turn with `cancel`.                   |

# Build a chat app

Source: https://agent37.com/docs/agents-api/chat-app

Give every user their own always-on agent. The simplest thing to build on the Agent API, and where most teams start.

The pattern is four calls: create one instance per user at signup, start one session per chat thread, list sessions for the sidebar, and fetch one session for the open thread.

Everything on this page, runnable: create instances from a table, stream replies token by token, list and reopen threads, cancel a turn. Express plus vanilla JS, no build step. Clone it, add your key, `npm start`.

## One key, two base URLs

Two base URLs, one key: `https://api.agent37.com` manages instances, and each instance serves its own chat API at `https://{instanceId}.agent37.app` (the id is the hostname). See [Core concepts](/docs/agents-api/concepts).

The snippets below use a small shorthand client so the calls stay readable. `api` is the hosting API; `agentOf(id)` is one user's agent.

```javascript
const HEADERS = {
  Authorization: `Bearer ${process.env.AGENT37_API_KEY}`,
  "Content-Type": "application/json",
};

function makeClient(base) {
  return {
    get: (path) =>
      fetch(base + path, { headers: HEADERS }).then((r) => r.json()),
    post: (path, body) =>
      fetch(base + path, {
        method: "POST",
        headers: HEADERS,
        body: JSON.stringify(body ?? {}),
      }).then((r) => r.json()),
  };
}

const api = makeClient("https://api.agent37.com");
const agentOf = (id) => makeClient(`https://${id}.agent37.app`);
```

## The shape of it

When a user signs up, create one [instance](/docs/agents-api/instances) for them, tagged with your own user id. That instance is their agent from then on.

```javascript
const inst = await api.post("/v1/instances", {
  user: "u_882",
  name: "chat-u_882",
  budget: { credit_micros: 1000000 },
});
await db.users.update("u_882", { instanceId: inst.id });
// inst.id is bare, e.g. "ab12cd34ef", and doubles as the hostname:
// https://ab12cd34ef.agent37.app
```

Omitting `template` gives you `agent37-hermes`, the default, on the default 2 vCPU / 4 GB RAM / 6 GB disk shape, billed from your workspace wallet (see [Billing](/docs/agents-api/billing)). The `budget.credit_micros` field grants one-time managed-spend headroom so the agent's LLM calls work from the first message; without it the per-instance [budget](/docs/agents-api/budgets) defaults to \$0.

The call is synchronous and returns `201` with `status: "running"`: the instance's computer is up. The agent inside is still booting (usually seconds, but up to a few minutes on a cold host), so before the first message poll `GET /v1/health` on the instance URL until it answers with `"ok": true` (see [Health and version](/docs/agents-api/sessions#health-and-version)). Store `inst.id` on the user row.

Each thread is a session on the user's instance. Send the first turn to the instance URL with no `session_id`; the reply mints one. Store it on your thread row, then send `session_id` plus the new `input` on every later turn. The session keeps the full history, so you never resend a transcript.

```javascript
const agent = agentOf(user.instanceId);

// new thread: first turn, no session_id
const first = await agent.post("/v1/responses", {
  input: "Research the top 3 EV makers, write a memo.",
});
await db.threads.create({ userId: "u_882", sessionId: first.session_id });
// first.session_id, e.g. "7f3e0b6c52a949d2b1c4a8e9d0f31726"

// later turns: session_id and the new input only
const reply = await agent.post("/v1/responses", {
  session_id: thread.sessionId,
  input: "Make it shorter, add a quote.",
});
render(reply.output_text);
```

Stream every reply so the UI fills in as the agent reasons, calls tools, and writes. Set `stream: true` and read the Server-Sent Events; see [Streaming](/docs/agents-api/streaming) for the full event list and a client parser.
`GET /v1/sessions` on the instance URL lists the harness's sessions, newest first, without history.

```javascript
const { agent, data } = await agentOf(user.instanceId).get("/v1/sessions");
// Hermes: [{ id, title, model, message_count, started_at, last_active, preview }]
// timestamps are epoch milliseconds
```

On Hermes the list already carries a `title`, and you can set it with `PATCH /v1/sessions/{id}` (`{ "title": "..." }`); the first user message usually makes a good default. Harnesses that do not store a title answer `405`; for those, keep titles in your own database keyed by `session_id`.

`GET /v1/sessions/{id}` returns the session with its full transcript in `history`, in order.

```javascript
const session = await agentOf(user.instanceId).get(
  `/v1/sessions/${thread.sessionId}`
);
// session.history: [{ id, session_id, role, content, thinking?, created_at }]
```

Render each message by `role` (`user`, `assistant`, or `system`). When a user deletes a thread, `DELETE /v1/sessions/{id}` removes it; see [Sessions](/docs/agents-api/sessions).

## Handle a busy session

A session runs one response at a time. Posting a new turn while one is in flight returns `409`:

```json
{
  "error": {
    "code": "session_busy",
    "message": "A response is already running on this session.",
    "hint": "Cancel the running response, or start another session."
  }
}
```

Two good ways to handle it in a chat UI:

* **Disable the composer** while a turn runs, and re-enable it when the reply arrives (the non-streaming call returning, or the terminal streaming event).
* **Offer a stop button** that calls `POST /v1/responses/{id}/cancel` on the instance URL. With `stream: true` the first event, `response.created`, hands you the response id immediately, which is what makes the button possible. Cancel is best effort: the response ends with `status: "cancelled"`, and whatever the agent already did is not undone.
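On the server side, detecting the busy case comes down to checking the HTTP status together with the error `code` from the body shown above. A minimal sketch; `is_session_busy` is a hypothetical helper name, and it assumes you already have the status code and the decoded JSON body in hand.

```python
def is_session_busy(status_code: int, body: dict) -> bool:
    """True when a POST /v1/responses was refused because a turn is in flight.

    Matches on the error code, not the message text.
    """
    error = body.get("error") or {}
    return status_code == 409 and error.get("code") == "session_busy"


busy_body = {
    "error": {
        "code": "session_busy",
        "message": "A response is already running on this session.",
        "hint": "Cancel the running response, or start another session.",
    }
}
print(is_session_busy(409, busy_body))  # True
```

A chat backend might use this to tell the UI to keep the composer locked instead of showing a generic error.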
The [hermes-chat example](https://github.com/agent37-platform/examples/tree/main/hermes-chat) wires up both: the composer locks while a turn is in flight, and the stop button cancels it. Other threads are unaffected: each session has its own lock, so one user can run turns in several threads at once. # Core concepts Source: https://agent37.com/docs/agents-api/concepts The two planes, the resource model, and the money model behind every Agent37 Cloud call. Agent37 Cloud is two APIs that share one key, and three resources that nest. ## Two planes, one key | Plane | Base URL | What it serves | | --------------- | ---------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Hosting API** | `https://api.agent37.com/v1` | Manages the fleet: [instances](/docs/agents-api/instances), [templates](/docs/agents-api/templates), [budgets](/docs/agents-api/budgets), [exec](/docs/agents-api/exec). | | **Agent API** | `https://{instanceId}.agent37.app` | The agent itself. Every instance serves its own API: `POST /v1/responses` for [chat](/docs/agents-api/chat), plus `/v1/sessions`, `/v1/files`, `/v1/models`, `/v1/health`, and `/v1/version`. | Both planes take the same `Authorization: Bearer sk_live_...` header. Mint keys in the [dashboard](https://www.agent37.com/dashboard/cloud/api-keys); each key is scoped to one workspace. On the hosting API the key selects your workspace. On an instance URL, the platform edge authenticates the key, verifies the instance belongs to your workspace, and forwards the request to the gateway running inside the instance. So you create an instance with one call to `api.agent37.com`, then talk to it at its own hostname: ```bash theme={null} curl https://ab12cd34ef.agent37.app/v1/responses \ -H "Authorization: Bearer sk_live_..." 
\ -H "Content-Type: application/json" \ -d '{ "input": "Research the top 3 EV makers, write a memo." }' ``` Instance URLs require auth on every request: the `Authorization` Bearer for API calls, or a time-boxed [signed URL](/docs/agents-api/urls#browser-access-with-signed-urls) to open a preview URL in a browser tab. An unauthenticated request gets a 401. ## The resource model | Concept | What it is | | ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Instance** | An isolated, always-on computer that runs your agent. Built from a [template](/docs/agents-api/templates). Persistent until you delete it, and billed hourly — prepaid a day at a time — for as long as it exists. Create one per end user. | | **Session** | One conversation on an instance. A message starts one; reuse its `session_id` to continue. An instance can hold many. | | **Response** | One agentic turn: your input, the agent's work, its reply. Stream it live, or reconnect to its stream by id. | **The agent is not the model.** The *template* installs the agent software, the *gateway* inside the instance runs your sessions on it (Hermes and OpenClaw are the live agents today; Claude Code and Codex are coming soon), and the *model* is the LLM the agent thinks with, chosen per turn as `model` + `provider`. ### How they fit together * Create an **instance** once per end user with `POST /v1/instances` on the hosting API. It keeps files, connected accounts, and memory across every session. * Start a **session** by sending a message to the instance's own URL: `POST https://{instanceId}.agent37.app/v1/responses`. Omit `session_id` and the gateway mints a new session; the reply carries the `session_id` you reuse to continue the thread. See [Sessions](/docs/agents-api/sessions). 
* Each message produces a **response**. Stream it live with `stream: true` (see [Streaming](/docs/agents-api/streaming)), or reconnect a dropped stream with `GET /v1/responses/{id}/stream`. ## Templates declare ports, instances snapshot them A template declares which ports its image listens on, with at most one marked `default`. When you create an instance, it snapshots the template's ports, and the instance object returns a URL per port: * The **default port** gets the instance's own URL, `https://{instanceId}.agent37.app`. The default template, `agent37-hermes`, serves the gateway there, so the default port URL is the chat URL. * **Non-default ports** each get a derivable **preview URL**, `https://{instanceId}-{port}.agent37.app` — for example `https://ab12cd34ef-9119.agent37.app`. See [Instance and preview URLs](/docs/agents-api/urls) for how routing works, and [Templates](/docs/agents-api/templates) for declaring ports on your own images. ## One wallet, per-instance caps Your workspace has exactly one pot of money: the wallet, funded by top-up from the [billing dashboard](https://www.agent37.com/dashboard/cloud/billing). Two things draw on it: * **Compute.** Each instance bills hourly, prepaid one day at a time: a day at create, then a day on each renewal. See [Billing](/docs/agents-api/billing). * **Managed usage.** Every instance gets managed LLM, Brave search, and Composio credentials, metered at cost as the agent uses them. Per-instance [budgets](/docs/agents-api/budgets) are caps, not money. A budget bounds how much managed spend an instance may pull from the wallet: a monthly cap that resets each UTC month (default \$0) plus one-time top-up headroom. Raising a cap moves no funds; an instance with a generous cap and an empty wallet still gets refused. ## Conventions * **Instance ids** are bare 10-character lowercase alphanumerics, like `ab12cd34ef`. No prefixes. The id doubles as the DNS label in the instance URL. 
* **Session and response ids** are 32-character hex strings minted by the gateway, like `7f3e0b6c52a949d2b1c4a8e9d0f31726`. * **Money** is integer micros: USD x 1e6, in `*_micros` fields. * **Timestamps** are epoch seconds on the hosting API and epoch milliseconds on the agent API. The [App integrations](/docs/agents-api/integrations) endpoints are the exception: they pass Composio's native shapes through unchanged, with millisecond timestamps. * **Lists** wrap results in `{ "data": [...] }`. Instance and session lists are newest first. (The App integrations endpoints again pass Composio's native paginated and connection shapes through instead.) An instance's `status` is one of `provisioning`, `running`, `stopping`, `stopped`, `starting`, `restarting`, `updating`, `failed`, `deleting`, or `deleted`; a response's `status` is `in_progress`, `completed`, `failed`, or `cancelled`. ## Next steps Fork a white-label, multi-tenant dashboard and rebrand it — the fastest way to ship. Create, size, and manage the agent's computer. The core call and its response shape. Pick a catalog agent or bring your own image. Cap each instance's managed spend. # Custom agent image Source: https://agent37.com/docs/agents-api/custom-image Start from the Hermes base, add your tools and skills, publish to a public registry, and run instances from your own image — optionally on your own model. You can run instances on your own Docker image: start from the Hermes base, add the CLIs, skills, and config your agent needs, publish it to a public registry, and [register it as a template](/docs/agents-api/templates). You can also point the agent at your own model. A complete, copy-able example — a `Dockerfile`, an example skill, a `register.sh`, and a tiny LLM proxy — lives in [`custom-agent-image/`](https://github.com/agent37-platform/starter-kit/tree/main/examples/custom-agent-image), ready to copy into your own repo. 
This page is the walkthrough; [Templates → build on the Hermes base image](/docs/agents-api/templates#build-on-the-hermes-base-image) is the reference for the contract.

## 1. Start from the base

Your `Dockerfile` builds on `ghcr.io/agent37-platform/hermes-base` and adds your layers. This example installs a CLI into `/usr/local/bin` and a skill into the default-skills directory:

```dockerfile theme={null}
FROM ghcr.io/agent37-platform/hermes-base:latest

USER root
RUN apt-get update && apt-get install -y --no-install-recommends your-cli \
    && rm -rf /var/lib/apt/lists/*
COPY your-skill/ /usr/local/share/agent37/default-skills/your-skill/
USER node
```

Bake binaries into `/usr/local/bin`, skills into `/usr/local/share/agent37/default-skills/` (the entrypoint copies them to `~/.hermes/skills` at boot), and everything else into `/usr/local` or `/opt`. Never use `/home/node` or `/home/linuxbrew`, which are masked at runtime. Keep the base `ENTRYPOINT`. See [the contract](/docs/agents-api/templates#build-on-the-hermes-base-image).

`:latest` tracks the newest base, so getting-started builds never go stale. For reproducible production builds, pin a date tag instead; the current one is on the [Templates](/docs/agents-api/templates#build-on-the-hermes-base-image) page.

Test the build before you publish:

```bash theme={null}
docker build --platform linux/amd64 -t my-agent .
```

## 2. Publish to a public registry

Push the image to any public registry. GitHub Container Registry is the least setup, because GitHub Actions can build and push it with the built-in token, with no secrets to manage. The example folder ships a reference workflow that does exactly this.

The image must be **public**, built for **`linux/amd64`**, and **at most 5 GB** compressed. On GHCR, make the package public once after the first push (**Packages** → **Package settings** → **Change visibility** → **Public**); Agent37 pulls anonymously.
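As a rough pre-publish check of the size and platform rules above, the sketch below reads manifest JSON such as `docker manifest inspect` prints. It assumes the standard Docker/OCI manifest fields (`layers[].size` on an image manifest, `manifests[].platform` on a multi-platform index), reads "5 GB" as decimal gigabytes, and sums layer blobs only. How Agent37 measures the limit is not stated here, so treat the result as an estimate. The function names are hypothetical.

```python
from typing import Any, Mapping

# Assumption: "5 GB" is read as decimal gigabytes; the docs do not say which unit applies.
LIMIT_BYTES = 5 * 1000**3


def compressed_size(manifest: Mapping[str, Any]) -> int:
    """Sum the compressed layer blob sizes listed in a single image manifest."""
    return sum(layer["size"] for layer in manifest.get("layers", []))


def has_linux_amd64(index: Mapping[str, Any]) -> bool:
    """True when a multi-platform index lists a linux/amd64 image."""
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == "linux" and platform.get("architecture") == "amd64":
            return True
    return False


def within_limit(manifest: Mapping[str, Any], limit_bytes: int = LIMIT_BYTES) -> bool:
    """Compare the summed layer sizes against the assumed limit."""
    return compressed_size(manifest) <= limit_bytes
```

Feeding it the parsed output of a manifest inspection before pushing catches an oversized or arm64-only image earlier than a `failed` instance would.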
Build for `linux/amd64` even on an Apple Silicon Mac (`docker build --platform=linux/amd64 ...`). A stray arm64 image fails to start.

## 3. Register and run

Register the published image as a template (pin a tag, not `latest`), then create an instance:

```bash curl theme={null}
curl -X POST https://api.agent37.com/v1/templates \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-custom-agent",
    "image_ref": "ghcr.io/you/my-agent:"
  }'

curl -X POST https://api.agent37.com/v1/instances \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "template": "my-custom-agent" }'
```

The result is a standard Agent37 instance running your image: same [lifecycle](/docs/agents-api/instances), [exec](/docs/agents-api/exec), and routed URLs as any other. Confirm your CLI shipped, without needing a model:

```bash curl theme={null}
curl -X POST https://api.agent37.com/v1/instances//exec \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "command": "your-cli --version" }'
```

If the instance comes up `failed` instead of `running`, your image did not boot. Read its [logs](/docs/agents-api/logs) for the entrypoint's own error. A missing binary, a bad path, or a wrong-architecture build are the usual causes, and `exec` cannot help here since there is no running container to attach to.

## 4. Bring your own model

`hermes-base` is clean: it boots with no LLM provider (standard Agent37 instances use Agent37's managed model; this base is for bringing your own).
To run an instance on your own model, point Hermes at any OpenAI-compatible endpoint (your own proxy, or a provider directly) by writing `~/.hermes/config.yaml` on the instance:

```yaml theme={null}
model:
  provider: "custom:MyProvider"
  default: "moonshotai/kimi-k2.7-code" # the model id your endpoint serves
custom_providers:
  - name: "MyProvider"
    base_url: "https://your-llm-proxy.example.com/v1" # must end in /v1
    api_key: "your-proxy-token"
    api_mode: "chat_completions"
    model: "moonshotai/kimi-k2.7-code"
```

Your endpoint must serve the two OpenAI-compatible routes Hermes uses: `GET /v1/models` (to resolve the model id) and `POST /v1/chat/completions` (the turn). The example folder includes a \~40-line `llm-proxy/` that implements exactly these and forwards to OpenRouter with your key: a minimal pattern to deploy and adapt.

Write the config over [exec](/docs/agents-api/exec) or the instance terminal; it lives on the persistent volume, so it survives restarts. Then [send a message](/docs/agents-api/chat) and the agent runs on your model.

Want chat to work out of the box on Agent37's managed model instead? Build `FROM ghcr.io/agent37-platform/hermes:` (the full image wires the managed model) and pass a [budget](/docs/agents-api/budgets) on create.

## Keep it current

Pin the base tag; `latest` floats. When you change your image, republish, point the template at the new tag with `PATCH /v1/templates/{name}`, and [update each instance](/docs/agents-api/instances) to pick it up. A skill already seeded into an instance's `~/.hermes/skills` is not overwritten on update; only fresh instances get a changed skill.

# Errors

Source: https://agent37.com/docs/agents-api/errors

Stable, machine-readable error codes on both planes: branch on the code, show the message.
Every error uses a standard HTTP status code and returns a stable, machine-readable code in a JSON `error` field: an object with a `code` on both API catalogs, a flat string on transport failures between you and the gateway. Branch on the code, never on `message` or the HTTP status alone.

There are two catalogs because there are two planes, plus a short list of transport errors. Both planes take the same `sk_live_` Bearer key, but their error envelopes differ: the Agent API adds optional `param` and `hint` fields. See [Core concepts](/docs/agents-api/concepts) for the two planes.

## Hosting API errors

Errors from `https://api.agent37.com/v1/*` always carry exactly `code` and `message`. There is no `param` or `hint` on this plane.

```json theme={null}
{
  "error": {
    "code": "insufficient_balance",
    "message": "This instance costs $0.0068 per hour, billed one day in advance ($0.1624). Add balance to your workspace and try again."
  }
}
```

* `code`: A stable, machine-readable identifier. Branch on this.
* `message`: A human-readable description. Safe to show, but do not parse it.

### Hosting API codes

| Code | HTTP | When |
| ------------------------ | ------- | ---- |
| `invalid_api_key` | 401 | The `sk_live_` Bearer key is missing, malformed, or revoked, on any `/v1` path. |
| `invalid_request` | 400 | A request field is invalid: bad JSON, an unsupported resource shape, a direct image ref where a template name belongs, a lifecycle action in the wrong state, or a non-empty `update` body. |
| `forbidden` | 403 | A write to a system template. System templates are read-only. |
| `not_found` | 404 | No instance or template with that id or name in your workspace, or an unknown `/v1` path.
|
| `insufficient_balance` | 402 | The wallet cannot cover a day of compute (at create, or at `start` of a `past_due` instance). |
| `instance_limit_reached` | 409 | The workspace is at its instance limit: one instance on the free credit, 10 once you have topped up, 50 once top-ups total \$500. Email [vishnu@agent37.com](mailto:vishnu@agent37.com) to raise it further. |
| `tier_limit` | 403 | Create or resize asks for a shape larger than your plan includes. Free workspaces (before your first top-up) run any template up to the 2 vCPU / 4 GB shape; a top-up unlocks the 4/8 and 8/16 shapes. |
| `capacity_unavailable` | 409 | `start` or `resize`: the pinned host cannot re-reserve this instance's compute, or fit the resize increase, right now. |
| `template_conflict` | 409 | A template create or rename targets a name that already exists. |
| `no_capacity` | 503 | Create only: no host can fit the requested shape right now. Safe to retry. |
| `provisioning_failed` | 502/500 | A container or host operation failed (image pull, container start, lifecycle action). Failed creates are fully refunded. |

Lookups are uniform: an id that belongs to another workspace returns the same 404 as an id that does not exist, and unknown `/v1` paths 404 only after your key is validated. Nothing about other workspaces leaks through error responses.

## Agent API errors

Errors from the gateway at `https://{instanceId}.agent37.app/v1/*` use the same envelope plus optional `param` and `hint`.

```json theme={null}
{
  "error": {
    "code": "validation_error",
    "message": "goal mode is not yet supported on this gateway.",
    "param": "mode",
    "hint": "Use mode \"chat\"."
  }
}
```

* `param`: The request field that was invalid. Present on `validation_error` when a specific field is at fault; a malformed JSON body has no `param`.
* `hint`: A suggested next step, when one applies.
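Since `error` can be either an object (both API catalogs) or a flat string (transport failures), client code benefits from normalising it once. The following is a minimal sketch under stated assumptions: `ApiError` and `parse_error` are hypothetical names, and reading a transport error's `message` from the top level of the body is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ApiError:
    code: str
    message: str = ""
    param: Optional[str] = None   # Agent API only, optional
    hint: Optional[str] = None    # Agent API only, optional
    transport: bool = False       # True for the flat-string form


def parse_error(body: Any) -> Optional[ApiError]:
    """Normalise either error shape into one object; None when no error is present."""
    if not isinstance(body, dict) or "error" not in body:
        return None
    error = body["error"]
    if isinstance(error, str):
        # Flat string: the string itself is the code.
        # Assumption: any human-readable message sits beside it at the top level.
        return ApiError(code=error, message=str(body.get("message", "")), transport=True)
    if isinstance(error, dict) and "code" in error:
        return ApiError(
            code=error["code"],
            message=error.get("message", ""),
            param=error.get("param"),
            hint=error.get("hint"),
        )
    return None
```

Callers can then branch on `parse_error(body).code` without first checking which shape arrived.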
### Transport errors Auth, instance lookup, and routing happen on the platform between you and the gateway, and rejections there use a flat string instead of the envelope: `{"error": ""}`. Some carry a human-readable `message` (and, on a 401 with no credentials, a `docs` link); branch on the code, not on either. Check whether `error` is a string before reading `code`. | Code | HTTP | When | | ----------------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `invalid_api_key` | 401 | No credential, or a key or signed URL that did not verify. The `message` says which, and how to authenticate a browser request. | | `not_found` | 404 | An instance URL that does not route: unknown, deleted, `failed`, or another workspace's. | | `container_unavailable` | 502 | The instance is not running, typically `stopped`. Start it and retry. | | `container_unreachable` | 502 | The instance is up but its service did not answer on the routed port — common briefly during a restart or update. Transient; retry. | | `upstream_unreachable` | 502 | The platform could not reach the instance's host. Transient; retry. | | `instance_saturated` | 503 | Too many concurrent requests in flight to this instance. Back off and retry. | | `host_mesh_not_ready` | 503 | The instance's host is still joining the platform network. Transient; retry. | | `upstream_timeout` | 504 | A call produced no response headers for \~100 seconds — typically a wedged instance. The turn may still be running: recover the result through its session, and prefer `stream: true` to see progress. | | `internal_error` | 500 | Unexpected platform error. 
| ### Agent API codes | Code | HTTP | When | | -------------------- | ---- | ------------------------------------------------------------------------------------------------------------ | | `validation_error` | 400 | A request field is invalid (`param` names it), the body is not valid JSON, or `mode` is `"goal"` (reserved). | | `not_a_directory` | 400 | `GET /v1/files?path=` points at something that isn't a directory. | | `response_not_found` | 404 | No response with that id. | | `file_not_found` | 404 | No file or directory at that path (`GET /v1/files`, `GET /v1/files/content`, `PATCH /v1/files`). | | `not_found` | 404 | Unknown route on the instance URL. | | `rename_unsupported` | 405 | `PATCH /v1/sessions/{id}` against a harness that cannot rename sessions (no native editable title). | | `session_busy` | 409 | A response is already running on this session. | | `title_conflict` | 409 | `PATCH /v1/sessions/{id}`: the requested title is already used by another session. | | `file_exists` | 409 | `PUT /v1/files/content` with `overwrite=false` when a file already exists at the path. | | `modified` | 412 | `PUT /v1/files/content` with `X-Expected-Mtime` when the file changed since you read it (lost-update guard). | | `payload_too_large` | 413 | A JSON request body exceeds 2 MB. The raw-body `PUT /v1/files/content` write is exempt. | | `rate_limited` | 429 | An upstream provider rate limit. Back off and retry. | | `agent_error` | 502 | The agent backend failed without a more specific code. | | `agent_unavailable` | 503 | The targeted harness is not available on this instance — not provisioned here, or down. | | `internal_error` | 500 | Unexpected gateway error. | The Agent API catalog is open-ended past this table. Failures inside the agent can surface provider-specific codes at 502 or 503 (for example a provider auth or quota error passes its raw code through, and an agent that is still warming up returns 503 with its own code). 
Treat any code you do not recognize as an agent-side failure: log it and show `message`. Not every failed turn is an HTTP error. `POST /v1/responses` never rejects because the agent failed mid-run: the call returns 200 with `status: "failed"` and the same error object in the response body's `error` field, and streams end with a `response.failed` event. Check `status`, not just the HTTP code. See [Send a message](/docs/agents-api/chat) and [Streaming](/docs/agents-api/streaming). ## Handle them Read `code`, then act by remedy: busy sessions get a cancel or a new session, transient codes get a retry with backoff, validation errors get fixed (read `param`), and anything unknown is agent-side. ```python python theme={null} import requests r = requests.post( "https://ab12cd34ef.agent37.app/v1/responses", headers={ "Authorization": "Bearer sk_live_...", "Content-Type": "application/json", }, json={ "input": "Research the top 3 EV makers, write a memo.", "session_id": "7f3e0b6c52a949d2b1c4a8e9d0f31726", }, ) if not r.ok: error = r.json()["error"] if isinstance(error, str): raise RuntimeError(f"transport error: {error}") # flat platform error, e.g. container_unavailable code = error["code"] if code == "session_busy": ... # cancel the running response or start another session elif code == "rate_limited": ... # transient: back off and retry elif code == "validation_error": raise ValueError(f"{error.get('param')}: {error['message']}") else: ... # unknown codes are agent-side failures: log code, show message ``` ```javascript node theme={null} const res = await fetch("https://ab12cd34ef.agent37.app/v1/responses", { method: "POST", headers: { "Authorization": "Bearer sk_live_...", "Content-Type": "application/json", }, body: JSON.stringify({ input: "Research the top 3 EV makers, write a memo.", session_id: "7f3e0b6c52a949d2b1c4a8e9d0f31726", }), }); if (!res.ok) { const { error } = await res.json(); if (typeof error === "string") { // flat platform error, e.g. 
"container_unavailable": see the transport table throw new Error(`transport error: ${error}`); } switch (error.code) { case "session_busy": // cancel the running response or start another session break; case "rate_limited": // transient: back off and retry break; case "validation_error": throw new Error(`${error.param}: ${error.message}`); default: // unknown codes are agent-side failures: log code, show message console.error(error.code, error.message); } } ``` The same grouping works on the Hosting API: `insufficient_balance` sends your user to the billing dashboard, `no_capacity` is retryable, and `invalid_request` means fix the request before retrying. `no_capacity` (503, hosting) and `rate_limited` (429, agent) are safe to retry with backoff. So is a `provisioning_failed` create: failed creates are fully refunded, so retrying never double-bills. ## Codes worth a closer look The workspace wallet cannot cover a charge. You see it in three places: at create, when the first day (billed in advance) cannot be debited; at `start`, when a `past_due` instance's next day cannot be re-debited; and inside agent behavior, when a managed call (LLM, Brave search, Composio) finds the wallet empty. The fix is the same everywhere: top up the wallet at `https://www.agent37.com/dashboard/cloud/billing` (\$5 minimum, \$1000 max per top-up), and enable [automatic top-up](/docs/agents-api/billing#automatic-top-up) so it does not recur. See [Billing](/docs/agents-api/billing). The wallet has funds, but this instance has used up its own managed-spend budget: the monthly cap is consumed and no one-time top-up headroom remains. Only managed calls are refused; the instance keeps running and compute billing is unaffected. Raise the cap with `PATCH /v1/instances/{id}/budget` or add headroom with `POST /v1/instances/{id}/budget/top-up`. See [Budgets](/docs/agents-api/budgets). You will not see this code on Hosting API calls: it surfaces when the agent's managed calls are refused mid-turn. 
A session runs one response at a time. Posting new input while a turn is in flight returns this, with a `hint` naming both ways out: cancel the running turn with `POST /v1/responses/{id}/cancel` (best effort; a finished response just returns its terminal state), or start a fresh session by omitting `session_id`. See [Sessions](/docs/agents-api/sessions). Three different walls, three different fixes. `instance_limit_reached` (409, create): the workspace is at its instance limit (one instance on the free credit, 10 once you have topped up, 50 once top-ups total \$500); delete instances you no longer need, or email [vishnu@agent37.com](mailto:vishnu@agent37.com) to raise the ceiling. `no_capacity` (503, create): no host can fit the requested shape right now; retry with backoff or pick a smaller shape, and a create that fails this way does not keep your money. `capacity_unavailable` (409, start or resize): an instance stays pinned to the host that holds its disk, and that host cannot re-reserve its CPU and RAM, or fit a resize increase, right now; retry later, or create a new instance to land elsewhere. See [Instances](/docs/agents-api/instances). Both 402 reasons are billing limits, not bugs. When a managed call is refused mid-turn, the refusal shows up in agent behavior (the turn fails or the agent reports it); the instance itself never goes down over managed spend. # Run commands Source: https://agent37.com/docs/agents-api/exec Run any shell command inside an instance from your backend, the escape hatch for anything the API does not wrap. `POST /v1/instances/{id}/exec` runs a shell command inside the instance, straight from your backend. It is the escape hatch for anything the API does not wrap as its own call. The command runs through `sh -c` as the image's default user, on the same box the agent works on, so it sees the agent's files, tools, and credentials. ## Request The shell command to run inside the instance. 
It is passed to `sh -c`, so pipes, redirects, and `&&` chains all work.

Only `running` instances accept commands. Exec against any other status returns `400 invalid_request` (a deleted instance returns `404 not_found`). If the platform cannot reach the instance at all, you get `502 provisioning_failed`.

## Response

A command that runs but exits nonzero is a normal result: you get `200` with its `exit_code`, `stdout`, and `stderr`. Errors are reserved for the platform, not your command.

* `exit_code`: The command's exit code. Nonzero is still a `200`; read this to branch.
* `stdout`: Standard output, capped at 512 KB. See `truncated`.
* `stderr`: Standard error, with its own separate 512 KB cap.
* `truncated`: `true` when either stream spilled past its 512 KB cap. The middle of the output is cut and a truncation marker is left in its place.

`exit_code` values 125, 126, and 127 may come from Docker rather than your command, for example 127 when the binary is not found. A command runs for up to 280 seconds, after which the call fails with `502 provisioning_failed`; start longer jobs in the background (`nohup ... &`) and poll with a second exec.

## Example

```bash curl theme={null}
curl -X POST https://api.agent37.com/v1/instances/ab12cd34ef/exec \
  -H "Authorization: Bearer sk_live_..."
\ -H "Content-Type: application/json" \ -d '{ "command": "node --version" }' ``` ```python python theme={null} import requests resp = requests.post( "https://api.agent37.com/v1/instances/ab12cd34ef/exec", headers={"Authorization": "Bearer sk_live_..."}, json={"command": "node --version"}, ) result = resp.json() print(result["exit_code"], result["stdout"]) ``` ```javascript node theme={null} const resp = await fetch( "https://api.agent37.com/v1/instances/ab12cd34ef/exec", { method: "POST", headers: { Authorization: "Bearer sk_live_...", "Content-Type": "application/json", }, body: JSON.stringify({ command: "node --version" }), } ); const result = await resp.json(); console.log(result.exit_code, result.stdout); ``` ```json response theme={null} { "exit_code": 0, "stdout": "v24.2.0\n", "stderr": "", "truncated": false } ``` ## Build on exec Anything the API does not wrap as its own endpoint, you build on `exec`. For moving files, prefer the instance's own [files endpoints](/docs/agents-api/files) at `https://{instanceId}.agent37.app` — `PUT /v1/files/content` to upload, `GET /v1/files/content` to download — but a quick text read works over exec too. A "Download the report" button in your product can be one exec call that reads the file the agent wrote: ```bash curl theme={null} curl -X POST https://api.agent37.com/v1/instances/ab12cd34ef/exec \ -H "Authorization: Bearer sk_live_..." \ -H "Content-Type: application/json" \ -d '{ "command": "cat ~/reports/ev-makers-memo.md" }' ``` Pushing a file in is the same trick in reverse. Encode it on your side and decode it inside the instance: ```bash curl theme={null} curl -X POST https://api.agent37.com/v1/instances/ab12cd34ef/exec \ -H "Authorization: Bearer sk_live_..." 
\ -H "Content-Type: application/json" \ -d '{ "command": "mkdir -p ~/data && echo aGVsbG8sd29ybGQK | base64 -d > ~/data/ev-prices.csv" }' ``` For binary or large files, use the [files endpoints](/docs/agents-api/files) on the instance URL instead — `GET /v1/files/content` streams a download of any size, with no 512 KB cap — or stage them at a URL your backend controls and `curl` them down from inside the instance. # Files Source: https://agent37.com/docs/agents-api/files Browse, read, write, move, delete, and download whole folders on the instance's disk — straight from the instance URL. Files live on the instance's disk. A file's absolute `path` is its identity: there are no file ids, so the path you list is the path you read, write, move, or delete. You write a file with one call, attach the returned `path` to [a message](/docs/agents-api/chat), and download anything the agent produces by path. The base URL is your instance URL, `https://{instanceId}.agent37.app`, with the same `sk_live_` Bearer as every other call. This page uses `https://ab12cd34ef.agent37.app`. The agent's workspace — where it reads and writes by default — is `/home/user/.agent37-gateway/workspace`, and that is the default directory for a list with no `path`. These calls are not jailed to the workspace. The `sk_live_` key is the instance root: any path the key can reach on the instance's filesystem is fair game, with `~` expanding to the agent's home. Treat the key accordingly. Every timestamp here is `modified`, the file's mtime in **epoch milliseconds** — the Agent API convention (the Hosting API uses seconds). It is a number, not an ISO string. ## The file entry List responses and every write return the same `FileEntry` shape, so the `path` you get back from a write is ready to use on the next call. The basename, e.g. `leads.csv`. The resolved absolute path on the instance. This is the identity you pass to every other call and to `files` on [`POST /v1/responses`](/docs/agents-api/chat). 
`file`, `directory`, `symlink`, or `other` (sockets, devices, FIFOs). Size in bytes; `null` for directories. Last-modified time (mtime) in epoch milliseconds. `true` when the name starts with `.`. ## List a directory `GET /v1/files` lists one directory level. Omit `path` to list the agent's workspace; pass an absolute path or a `~/` path to list anywhere the key can reach. Entries are sorted directories first, then by name case-insensitively. The directory to list. Optional; defaults to the agent's workspace, `/home/user/.agent37-gateway/workspace`. Accepts absolute and `~/` paths. A path that exists but is not a directory returns `400 not_a_directory`. ```bash curl theme={null} curl -G https://ab12cd34ef.agent37.app/v1/files \ -H "Authorization: Bearer sk_live_..." \ --data-urlencode "path=~/.agent37-gateway/workspace" ``` ```python python theme={null} import requests listing = requests.get( "https://ab12cd34ef.agent37.app/v1/files", headers={"Authorization": "Bearer sk_live_..."}, params={"path": "~/.agent37-gateway/workspace"}, ).json() ``` ```javascript node theme={null} const listing = await (await fetch( "https://ab12cd34ef.agent37.app/v1/files?" + new URLSearchParams({ path: "~/.agent37-gateway/workspace" }), { headers: { Authorization: "Bearer sk_live_..." } }, )).json(); ``` ```json response theme={null} { "path": "/home/user/.agent37-gateway/workspace", "parentPath": "/home/user/.agent37-gateway", "entries": [ { "name": "reports", "path": "/home/user/.agent37-gateway/workspace/reports", "type": "directory", "size": null, "modified": 1781049600000, "hidden": false }, { "name": "leads.csv", "path": "/home/user/.agent37-gateway/workspace/leads.csv", "type": "file", "size": 18244, "modified": 1781049642000, "hidden": false } ], "truncated": false } ``` The resolved absolute path of the directory you listed. The parent directory's absolute path, or `null` at the filesystem root. The directory's immediate children as [`FileEntry`](#the-file-entry) objects. 
One level only — this never recurses. `true` when the directory holds more than 1000 entries; only the first 1000 (after sorting) are returned. ## Read, preview, or download a file `GET /v1/files/content?path=…` streams a file off the instance — typically one the agent told you it wrote. Any size; the 512 KB [exec](/docs/agents-api/exec) output cap does not apply here. The `Content-Type` is set from the file extension. The file to read. Accepts absolute and `~/` paths. A missing or empty `path`, or a path that is not a regular file, returns `400 validation_error`; no file at the path returns `404 file_not_found`. `attachment` sends `Content-Disposition: attachment` so a browser downloads the file. `inline` sends `Content-Disposition: inline` so a browser renders it (useful for previews). ```bash curl theme={null} curl -G https://ab12cd34ef.agent37.app/v1/files/content \ -H "Authorization: Bearer sk_live_..." \ --data-urlencode "path=~/.agent37-gateway/workspace/reports/ev-makers-memo.md" \ -o ev-makers-memo.md ``` ```python python theme={null} import requests r = requests.get( "https://ab12cd34ef.agent37.app/v1/files/content", headers={"Authorization": "Bearer sk_live_..."}, params={"path": "~/.agent37-gateway/workspace/reports/ev-makers-memo.md"}, ) open("ev-makers-memo.md", "wb").write(r.content) ``` ```javascript node theme={null} import fs from "node:fs"; const res = await fetch( "https://ab12cd34ef.agent37.app/v1/files/content?" + new URLSearchParams({ path: "~/.agent37-gateway/workspace/reports/ev-makers-memo.md", }), { headers: { Authorization: "Bearer sk_live_..." } }, ); await fs.promises.writeFile( "ev-makers-memo.md", Buffer.from(await res.arrayBuffer()), ); ``` Serving an agent-produced file `inline` runs it on **your** origin. HTML, SVG, and similar can execute scripts in the page that opens them, so an instance whose agent writes attacker-controlled content can run code against your users. Render untrusted files in a sandboxed frame (`