Inference overview

Production LLM inference with no subscription — pay per use over x402 or run on a prepaid OpenAI-compatible API.

Production LLM inference — frontier and open models, with no subscription. Two ways to consume it:

Pay per use — call any model endpoint and settle each request in USDC over x402. No accounts, no API keys. See Pay per use.
OpenAI-compatible API — run on a prepaid token balance through a standard OpenAI-shaped /v1 API. Top up with USDC, then spend across every model. See OpenAI-compatible API.

Settlement is on Solana via x402. Verified/TEE inference is available on select models — on-chain attested.

The full model catalog (Claude, GPT, Gemini, NVIDIA NIM, verified/TEE inference, plus open models) lives in the LLM resource family — see the AI services. The marketplace covers many providers.