
Browser Automation

Cosine CLI can control a real browser for tasks like taking screenshots, navigating websites, filling forms, and extracting data from web pages.

This page covers the Chrome DevTools Protocol (CDP) flow for Chrome and Chromium. Safari is also supported, but through a separate desktop-hosted Safari/WebContentsView bridge rather than the Chrome CDP setup described here.

The Chrome DevTools Protocol (CDP) is a remote debugging protocol that allows external tools to control Chrome and Chromium. Cosine CLI uses CDP to:

  • Navigate websites - Load and interact with web pages
  • Take screenshots - Capture full-page or element-specific screenshots
  • Execute JavaScript - Run JS in the browser context
  • Fill forms - Input data and interact with UI elements
  • Extract data - Scrape information from web pages

Before using browser automation, you need:

  1. Google Chrome installed on your system
  2. Chrome started with remote debugging enabled

On macOS, start Chrome with remote debugging like this:

open -n -a "Google Chrome" --args --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome

This opens a new Chrome instance with:

  • --remote-debugging-port=9222 - Enables CDP on port 9222
  • --user-data-dir=/tmp/cosine-chrome - Isolated profile for Cosine

Add the CDP URL to your ~/.cosine.toml:

[browser]
cdp_url = "http://localhost:9222"

Now browser automation will be available in every session.

Override the config per session:

cos start --cdp-url http://localhost:9222

Or for one-shot tasks:

cos start --prompt "Go to example.com and take a screenshot" --cdp-url http://localhost:9222

Once configured, Cosine automatically detects the browser and exposes browser automation tools. The AI will suggest using the browser when relevant.

Taking Screenshots:

You: Take a screenshot of my deployed app at https://myapp.com
Cosine: I'll navigate to your app and capture a screenshot.
[Uses browser_navigate and browser_screenshot tools]

Web Scraping:

You: Extract all the product names from https://shop.example.com/products
Cosine: I'll visit the page and extract the product information for you.
[Uses browser_navigate and browser_get_text tools]

Form Interaction:

You: Test the login form on https://myapp.com/login
Cosine: I'll navigate to the login page and test the form.
[Uses browser_navigate, browser_fill, and browser_click tools]

When browser automation is enabled, Cosine has access to these tools:

Tool                 Description
browser_navigate     Navigate to a URL
browser_screenshot   Capture a screenshot
browser_click        Click on an element
browser_fill         Fill a form field
browser_evaluate     Execute JavaScript
browser_get_text     Extract text from the page
browser_scroll       Scroll the page
browser_wait         Wait for an element or timeout

All browser actions require your approval by default (unless using --auto-accept). You’ll see:

  • The URL being visited
  • The action being performed
  • Any data being sent

Using a separate --user-data-dir is recommended because:

  • Cosine’s browser sessions are isolated
  • Cosine has no access to your main Chrome cookies or logins
  • The profile can be reset by deleting the temp directory

Also keep in mind:

  • CDP communication is over HTTP (locally)
  • Only use CDP on trusted networks
  • The browser runs with your user permissions

If Cosine doesn’t detect the browser:

  1. Verify Chrome is running with debugging:

    # Check if port 9222 is open
    curl http://localhost:9222/json/version
  2. Verify the CDP URL is correct:

    • Use http://localhost:9222 (not ws://)
    • Don’t include /json or other paths
  3. Check for conflicting Chrome instances:

    • Close all Chrome windows
    • Start Chrome with the debugging flags shown above
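
The first two checks above can be scripted. The sketch below applies the URL rules from step 2 (an `http` scheme and no path) and then queries the same `/json/version` endpoint as the `curl` command in step 1. The function names are illustrative and not part of Cosine, and the example URL assumes the default port from this page.

```python
import json
from urllib.parse import urlsplit
from urllib.request import urlopen


def check_cdp_url(url: str) -> list[str]:
    """Return format problems with a CDP URL (empty list if it looks fine)."""
    parts = urlsplit(url)
    problems = []
    if parts.scheme != "http":
        problems.append(f"scheme is {parts.scheme!r}, expected 'http'")
    if parts.path not in ("", "/"):
        problems.append(f"unexpected path {parts.path!r}")
    return problems


def probe_cdp(url: str, timeout: float = 2.0) -> dict:
    """Fetch /json/version, the same endpoint the curl check uses."""
    with urlopen(url.rstrip("/") + "/json/version", timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    url = "http://localhost:9222"
    for problem in check_cdp_url(url):
        print("URL problem:", problem)
    try:
        info = probe_cdp(url)
        print("Connected to", info.get("Browser"))
    except (OSError, ValueError) as exc:
        # OSError covers refused connections; ValueError covers a non-JSON reply.
        print("Could not reach Chrome:", exc)
```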

If you see “connection refused” errors:

  1. Make sure Chrome is actually running
  2. Verify the port matches (default is 9222)
  3. Check firewall settings

If pages fail to load:

  1. Verify network connectivity
  2. Check if the URL is accessible in regular Chrome
  3. Look for SSL certificate issues

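If Chrome is installed in a non-standard location, launch the binary directly by its full path. The example uses the default macOS location; adjust it to match your install:
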
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome

You can run multiple Chrome instances on different ports:

# `chrome` stands for the path to your Chrome binary
# First instance on port 9222
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome-1
# Second instance on port 9223
chrome --remote-debugging-port=9223 --user-data-dir=/tmp/cosine-chrome-2

Then configure Cosine:

# Use first instance
cos start --cdp-url http://localhost:9222
# Use second instance
cos start --cdp-url http://localhost:9223
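
When running more than two instances, it can help to generate the port and profile pairings rather than type them by hand. The sketch below follows the numbering pattern used above (ports counting up from 9222, profile directories suffixed `-1`, `-2`, and so on). `ChromeInstance` and `plan_instances` are hypothetical helpers, not part of Cosine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChromeInstance:
    port: int
    user_data_dir: str

    @property
    def chrome_args(self) -> list[str]:
        """Flags to pass to the Chrome binary for this instance."""
        return [
            f"--remote-debugging-port={self.port}",
            f"--user-data-dir={self.user_data_dir}",
        ]

    @property
    def cdp_url(self) -> str:
        """Value to pass to `cos start --cdp-url`."""
        return f"http://localhost:{self.port}"


def plan_instances(count: int, first_port: int = 9222,
                   dir_prefix: str = "/tmp/cosine-chrome") -> list[ChromeInstance]:
    """Give each instance its own port and its own profile directory."""
    return [
        ChromeInstance(first_port + i, f"{dir_prefix}-{i + 1}")
        for i in range(count)
    ]


if __name__ == "__main__":
    for instance in plan_instances(2):
        print("chrome", " ".join(instance.chrome_args))
        print("cos start --cdp-url", instance.cdp_url)
```

Each instance gets a distinct profile directory as well as a distinct port, matching the two commands shown earlier.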

For servers or CI/CD environments, use headless mode:

chrome --headless --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome
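
In a CI script, Cosine may be launched right after Chrome, possibly before Chrome accepts connections. One way to handle that is to poll the `/json/version` endpoint first. This is a sketch under that assumption: the helper names, timeout, and polling interval are illustrative, not Cosine behavior.

```python
import time
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urlopen(url, timeout=1.0) as response:
            return response.status == 200
    except OSError:  # connection refused, timeout, or HTTP error status
        return False


def wait_for_cdp(cdp_url: str, timeout: float = 15.0, interval: float = 0.25,
                 is_ready: Callable[[str], bool] = http_ok) -> bool:
    """Poll <cdp_url>/json/version until it responds or the timeout expires."""
    version_url = cdp_url.rstrip("/") + "/json/version"
    deadline = time.monotonic() + timeout
    while True:
        if is_ready(version_url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A CI step could call `wait_for_cdp("http://localhost:9222")` and fail the job if it returns `False`. The `is_ready` parameter exists so the polling loop can be exercised without a running browser.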

You can combine browser automation with MCP servers, for example a filesystem server that gives Cosine somewhere to save screenshots:

{
  "mcpServers": {
    "filesystem": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/pz/screenshots"]
    }
  }
}

Now you can:

You: Take a screenshot of the dashboard and save it to my screenshots folder
Cosine: I'll capture the screenshot and save it using the filesystem MCP server.

Always start Chrome with debugging enabled before launching Cosine:

# Terminal 1: Start Chrome
open -n -a "Google Chrome" --args --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome
# Terminal 2: Start Cosine
cos start

Keep Cosine’s browser sessions separate from your main browsing:

# Use a dedicated user data directory
--user-data-dir=/tmp/cosine-chrome

Clean up the Chrome instance when finished:

# Find and kill the Chrome process
pkill -f "remote-debugging-port=9222"
# Then remove the user data directory to reset the profile
rm -rf /tmp/cosine-chrome
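
If Chrome is launched from a script, profile cleanup can be tied to the script's lifetime instead of a manual `rm -rf`. The sketch below creates a throwaway profile directory and removes it on exit. `temporary_profile` is a hypothetical helper, and launching Chrome itself is left out.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def temporary_profile(prefix: str = "cosine-chrome-") -> Iterator[Path]:
    """Create a throwaway Chrome profile directory and delete it on exit."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Runs even if the body raises, so the profile never lingers.
        shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    with temporary_profile() as profile_dir:
        # Pass this flag to Chrome while the block is active.
        print(f"--user-data-dir={profile_dir}")
```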

Browser automation can be resource-intensive:

  • Chrome uses significant RAM
  • Close unused tabs
  • Kill the Chrome process when done

Example tasks:

You: Compare the current homepage design with the staging site
Cosine: I'll take screenshots of both and compare them.

You: Extract pricing information from 5 competitor websites
Cosine: I'll visit each site and extract their pricing tables.

You: Test the signup flow on our website
Cosine: I'll navigate through the signup process and report any issues.