Managed AI inference
Keyless LLM calls from your app — no API key, billed to your plan
Apps with a PocketBase sidecar can call LLMs through Percher's managed inference proxy — there's no API key to mint, store, or leak. The capsule template ships it pre-wired as src/lib/ai.ts:
import { complete, streamComplete } from "./lib/ai";
const answer = await complete("Summarize this note: " + note);
// or, for a typewriter UI:
await streamComplete(prompt, (token) => output.append(token));
How it works
- The browser calls /capsule/<app>/ai with the signed-in user's PocketBase session token. Percher validates the session against your app's PocketBase, so only signed-in users of your app can spend.
- The request rides OpenRouter pinned to zero-data-retention: prompts and completions aren't stored or used for training.
- Spend is metered against the app owner's plan with a per-day cap that resets at UTC midnight: free $0.05, Starter $0.50, Maker $2, Pro $10. When the cap is hit, calls return 429 with the reset time.
- Rate limits: 30 requests/min per app, 20 requests/min per user. Output is capped at 2,000 tokens per request.
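The cap and reset rule above can be sketched client-side. This is a minimal illustration, not Percher's implementation: `DAILY_CAP_USD`, `nextUtcReset`, and `describeAiStatus` are hypothetical names. It computes the reset time locally from the UTC-midnight rule instead of reading it from the 429 response, because the format in which the response carries the reset time isn't specified here. It also assumes every 429 means the daily cap was reached, which may not hold if rate-limit rejections use the same status.

```typescript
// Per-day spend caps in USD by plan, as listed above.
const DAILY_CAP_USD = { free: 0.05, starter: 0.5, maker: 2, pro: 10 } as const;
type Plan = keyof typeof DAILY_CAP_USD;

// Caps reset at UTC midnight, so the next reset is the start of the next UTC day.
// Date.UTC normalizes day overflow (for example, day 32 of December).
function nextUtcReset(now: Date): Date {
  return new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1),
  );
}

// Hypothetical helper: turn an HTTP status into a user-facing message.
// Assumption: a 429 is treated as "daily cap reached". Returns null otherwise.
function describeAiStatus(status: number, plan: Plan, now: Date): string | null {
  if (status !== 429) return null;
  const reset = nextUtcReset(now).toISOString();
  const cap = DAILY_CAP_USD[plan].toFixed(2);
  return `Daily AI budget of $${cap} reached; resets at ${reset}`;
}
```

In a UI, the returned message could be shown in place of the completion, with the call retried only after the reset time has passed.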
Models
openai/gpt-4o-mini (default), openai/gpt-4o, and anthropic/claude-3.5-haiku. Pass model in the request body to pick one; anything off the allowlist is rejected rather than billed blind.
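Since off-allowlist models are rejected, it can be convenient to catch a typo before the request leaves the browser. The sketch below mirrors the allowlist in a type and builds a request body with a `model` field and an OpenAI-style `messages` array. `ALLOWED_MODELS`, `isAllowedModel`, and `buildChatBody` are hypothetical names, not part of the capsule template, and the server remains the authority on which models are accepted.

```typescript
// The three models listed above; the first is the default.
const ALLOWED_MODELS = [
  "openai/gpt-4o-mini",
  "openai/gpt-4o",
  "anthropic/claude-3.5-haiku",
] as const;
type AllowedModel = (typeof ALLOWED_MODELS)[number];
const DEFAULT_MODEL: AllowedModel = "openai/gpt-4o-mini";

// Type guard: narrows an arbitrary string to an allowlisted model id.
function isAllowedModel(model: string): model is AllowedModel {
  return ALLOWED_MODELS.some((allowed) => allowed === model);
}

// Build the JSON body for a single-prompt request.
// Throws locally instead of sending a request the proxy would reject.
function buildChatBody(prompt: string, model: string = DEFAULT_MODEL) {
  if (!isAllowedModel(model)) {
    throw new Error(`Model not on the allowlist: ${model}`);
  }
  return { model, messages: [{ role: "user" as const, content: prompt }] };
}
```

For example, `buildChatBody("Summarize this note", "openai/gpt-4o")` selects the larger model, while omitting the second argument falls back to the default.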
Without the capsule template
Any app with [data] mode = "pocketbase" and signed-in users can call the endpoint directly: POST https://api.percher.run/capsule/<app>/ai with the PocketBase auth token in the Authorization header and an OpenAI-style messages array.
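A direct call might look like the following sketch. The URL, the POST method, the Authorization header, and the `messages` array come from the description above; everything else is an assumption. In particular, the token is sent as the raw header value (whether a `Bearer ` prefix is expected isn't specified here), the response is returned as untyped JSON because its shape isn't documented in this section, and `buildAiRequest` and `callAi` are hypothetical names.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type AiRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Build the request without sending it, so it can be inspected or tested.
function buildAiRequest(
  app: string,
  token: string,
  messages: ChatMessage[],
  model?: string,
): AiRequest {
  // Include `model` only when given; otherwise the proxy's default applies.
  const payload = model ? { model, messages } : { messages };
  return {
    url: `https://api.percher.run/capsule/${encodeURIComponent(app)}/ai`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the raw PocketBase auth token, with no "Bearer " prefix.
        Authorization: token,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and return the parsed JSON body (shape not assumed).
async function callAi(
  app: string,
  token: string,
  messages: ChatMessage[],
): Promise<unknown> {
  const { url, init } = buildAiRequest(app, token, messages);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`AI request failed with HTTP ${res.status}`);
  return res.json();
}
```

With the PocketBase JavaScript SDK, the token would typically come from `pb.authStore.token` after the user signs in.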