
Generic client example

Many developer tools and agent clients support custom OpenAI-compatible providers. Cosine Inference is designed to fit that pattern.

To use a generic compatible client with Cosine Inference, you usually need:

  • a Cosine bearer token
  • a custom provider entry that points the client at https://api.cosine.sh
  • the provider wire API set to responses

Point the client at the base URL:

https://api.cosine.sh

The client should then call the gateway’s /responses endpoint and use /models for model discovery where applicable.
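As a sketch of the wire shapes involved, assuming the gateway accepts OpenAI Responses-style JSON (the model id below is hypothetical), the two endpoints can be exercised like this. The requests are only constructed here, not sent:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.cosine.sh"
token = os.environ.get("COSINE_API_KEY", "")
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}

# Model discovery: GET /models lists the ids the gateway exposes.
models_req = urllib.request.Request(f"{BASE_URL}/models", headers=headers)

# Responses-style call: POST /responses with an OpenAI Responses-shaped body.
payload = {
    "model": "example-model",  # hypothetical id; check /models for real ones
    "input": "Hello from a generic client.",
}
responses_req = urllib.request.Request(
    f"{BASE_URL}/responses",
    data=json.dumps(payload).encode("utf-8"),
    headers=headers,
    method="POST",
)
# urllib.request.urlopen(responses_req) would actually send the call.
```

Most clients handle this wiring for you; the snippet only illustrates the endpoint paths and the bearer-token header a generic client is expected to produce.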

Cosine Inference authenticates requests with a bearer token sent in the Authorization header.

For a typical OpenAI-compatible client, the usual shape is:

  • define a custom provider under model_providers
  • point env_key at the environment variable that stores your Cosine token
  • set base_url to https://api.cosine.sh
  • use the Responses wire API

Rather than prescribing one client-specific config format here, use your client’s custom provider documentation and apply these Cosine-specific values:

  • provider name: Cosine
  • base URL: https://api.cosine.sh
  • wire API: responses
  • auth: bearer token from your chosen environment variable

If your client expects the common OpenAI-compatible provider fields, the values to substitute are:

name = "Cosine"
base_url = "https://api.cosine.sh"
wire_api = "responses"
env_key = "COSINE_API_KEY"

Then export your token before launching the client:

export COSINE_API_KEY="your-cosine-token"
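A common failure mode is launching the client from a shell where the token was never exported. A small check like the following (Python used purely for illustration) confirms the variable is visible without printing its value:

```python
import os

def token_status(env=os.environ):
    """Report whether the Cosine token is visible, without leaking its value."""
    return "set" if env.get("COSINE_API_KEY") else "missing"

print(f"COSINE_API_KEY is {token_status()}")
```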

For a concrete example with a popular tool, Codex CLI is one compatible option: follow its upstream documentation for defining a custom model provider and substitute the Cosine values above.
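As a hedged sketch, a Codex CLI-style `config.toml` provider entry built from the values above might look like this (the table name `cosine` is an arbitrary local key; check the upstream docs for the current schema):

```toml
# ~/.codex/config.toml — custom provider entry (sketch, verify against upstream docs)
[model_providers.cosine]
name = "Cosine"
base_url = "https://api.cosine.sh"
wire_api = "responses"
env_key = "COSINE_API_KEY"
```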

When you route a compatible client through Cosine Inference, usage is billed through Cosine’s inference gateway rather than through a separate model-provider-specific setup inside the client.

For the higher-level billing model, see Inference Overview.