# Generic client example
Many developer tools and agent clients support custom OpenAI-compatible providers. Cosine Inference is designed to fit that pattern.
## What you need

To use a generic compatible client with Cosine Inference, you usually need:

- a Cosine bearer token
- a custom provider entry that points the client at `https://api.cosine.sh`
- the provider wire API set to `responses`
## Base URL

Use `https://api.cosine.sh`.

The client should then call the gateway’s `/responses` endpoint and use `/models` for model discovery where applicable.
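As an illustration of what a compatible client does with the base URL, the two gateway paths named above can be derived like this (the path names come from this section; everything else is illustrative):

```python
from urllib.parse import urljoin

BASE_URL = "https://api.cosine.sh"

# Endpoints an OpenAI-compatible client is expected to hit on the gateway.
# urljoin needs a trailing slash on the base so relative paths append cleanly.
responses_url = urljoin(BASE_URL + "/", "responses")  # request endpoint
models_url = urljoin(BASE_URL + "/", "models")        # model discovery

print(responses_url)  # https://api.cosine.sh/responses
print(models_url)     # https://api.cosine.sh/models
```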
## Authentication

Cosine Inference uses a bearer token.

For a typical OpenAI-compatible client, the usual shape is:

- define a custom provider under `model_providers`
- point `env_key` at the environment variable that stores your Cosine token
- set `base_url` to `https://api.cosine.sh`
- use the Responses wire API
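The `env_key` indirection above amounts to reading the token from the environment and sending it as a bearer header. A minimal sketch, assuming the `COSINE_API_KEY` variable name used later on this page (`build_auth_header` is a hypothetical helper, not part of any client):

```python
import os

def build_auth_header(env_key: str = "COSINE_API_KEY") -> dict:
    """Read the Cosine token from the environment variable named by env_key
    and return the Authorization header a compatible client would send."""
    token = os.environ.get(env_key)
    if not token:
        raise RuntimeError(f"set {env_key} before launching the client")
    return {"Authorization": f"Bearer {token}"}

# Normally exported in your shell; set inline here only for illustration.
os.environ["COSINE_API_KEY"] = "your-cosine-token"
print(build_auth_header())  # {'Authorization': 'Bearer your-cosine-token'}
```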
## Generic setup pattern

Rather than prescribing one client-specific config format here, use your client’s custom provider documentation and apply these Cosine-specific values:

- provider name: `Cosine`
- base URL: `https://api.cosine.sh`
- wire API: `responses`
- auth: bearer token from your chosen environment variable
## Example values

If your client expects the common OpenAI-compatible provider fields, the values to substitute are:

```toml
name = "Cosine"
base_url = "https://api.cosine.sh"
wire_api = "responses"
env_key = "COSINE_API_KEY"
```

Then export your token before launching the client:

```sh
export COSINE_API_KEY="your-cosine-token"
```

## Client-specific references

If you want a concrete example for a popular tool, Codex CLI is one compatible option. Use the upstream docs and substitute the Cosine values above:
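As one sketch of how that substitution might look, Codex CLI defines custom providers in its `config.toml` under `model_providers`. The entry below assumes the standard field names shown in the example values above; check the upstream Codex CLI docs for the current key names:

```toml
# ~/.codex/config.toml (sketch; verify against the Codex CLI docs)
model_provider = "cosine"

[model_providers.cosine]
name = "Cosine"
base_url = "https://api.cosine.sh"
wire_api = "responses"
env_key = "COSINE_API_KEY"
```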
## Billing

When you route a compatible client through Cosine Inference, usage is billed through Cosine’s inference gateway rather than through a separate model-provider-specific setup inside the client.
For the higher-level billing model, see Inference Overview.