
Choosing the Right Model

The Cosine CLI gives you access to a range of AI models. Each has different characteristics in terms of speed, cost, reasoning depth, and output style. Over time, you’ll develop your own preferences — but here’s a practical starting point.

| Model | Speed | Cost | Best for |
| --- | --- | --- | --- |
| Codex | Fast | Medium | General coding, everyday tasks |
| Codex Spark | Very fast | Low | Quick edits, fast lookups |
| Sonnet | Medium | Medium | Creative writing, prose, nuanced output |
| Opus | Slow | High (3×) | Deep reasoning, complex multi-step tasks |
| Gemini (various) | Slow | Varies | Alternative perspective, large context tasks |
| Kimi | Very fast | Low | Quick first drafts, throwing ideas at the wall |
| MiniMax | Fast | Low | Rapid iteration |

If you’re not sure which model to use, these are solid defaults:

  • Codex on High reasoning — great all-rounder for tasks involving code and structured thinking.
  • Sonnet on Medium reasoning — better for writing tasks, prose, and anything where tone and nuance matter.

You don’t need to optimise aggressively across models. The long-term goal is that model selection becomes increasingly automatic. For now, defaulting to Codex or Sonnet for most tasks is a reliable approach.
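If it helps to see the defaults above as a rule of thumb, here is a minimal sketch of that logic. The task categories, model names, and the `pick_model` helper are purely illustrative assumptions for this example, not part of the Cosine CLI itself:

```python
# Illustrative only: a lookup mirroring the defaults described above.
# These categories and the fallback are assumptions, not a real Cosine CLI API.
DEFAULTS = {
    "code": ("Codex", "High"),
    "writing": ("Sonnet", "Medium"),
    "quick-edit": ("Codex Spark", "Low"),
    "deep-reasoning": ("Opus", "High"),
}

def pick_model(task_type: str) -> tuple[str, str]:
    """Return (model, reasoning level), falling back to Codex on High."""
    return DEFAULTS.get(task_type, ("Codex", "High"))
```

The fallback captures the spirit of the advice: when in doubt, Codex on High is a safe choice.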

If you’re in the middle of a task and need a quick answer or small edit without interrupting your flow, switch to a faster model like Kimi or Codex Spark. Fast models are great for:

  • Sanity-checking a quick idea
  • Small, well-defined code changes
  • Getting a first draft to react to

For complex multi-step tasks running in the background — especially in Swarm Mode — the extra time a larger model takes is worth it. Use Opus or Codex on High when:

  • You’re producing something that needs to be good the first time
  • The task involves deep reasoning or multiple interdependent decisions
  • You’re running it in the background anyway and won’t be waiting

Different models have genuinely different characteristics — not just in capability, but in style and “feel.” Some users find Claude models (Sonnet, Opus) produce more visually pleasing HTML and more natural-sounding prose. Codex models tend to be more precise and structured.

Running the same task with two different models in parallel (one as the main agent, one as a fresh session with no context) is a useful technique for getting varied perspectives on the same problem.
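The parallel-run technique can be sketched in a few lines. The `run_model` function below is a hypothetical stand-in — in practice it would invoke the CLI or an API, which is not shown here — but the concurrency pattern is the point: fire the same prompt at two models at once and compare the results.

```python
from concurrent.futures import ThreadPoolExecutor

def run_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in for whatever actually runs a model session.
    return f"[{model}] response to: {prompt}"

def compare(prompt: str, models=("Codex", "Sonnet")) -> dict[str, str]:
    # Submit the same prompt to each model concurrently, so the
    # slower model's run doesn't block the faster one.
    with ThreadPoolExecutor() as pool:
        futures = {m: pool.submit(run_model, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}
```

Reading the two answers side by side often surfaces assumptions one model made that the other did not.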

To recap:

  • Codex on High or Sonnet on Medium are good everyday defaults.
  • Use fast models (Kimi, Codex Spark) for quick in-flow tasks.
  • Use larger models (Opus, Codex on High) for complex background tasks.
  • You’ll build your own preferences over time — but don’t over-optimise early.

Next: What are MCPs and Why Do They Matter?