Browser Automation

Cosine CLI can control a real Chrome browser instance using the Chrome DevTools Protocol (CDP). This enables web-based tasks like taking screenshots, navigating websites, filling forms, and extracting data from web pages.

The Chrome DevTools Protocol (CDP) is a remote debugging protocol that allows external tools to control Chrome. Cosine CLI uses CDP to:

  • Navigate websites - Load and interact with web pages
  • Take screenshots - Capture full-page or element-specific screenshots
  • Execute JavaScript - Run JS in the browser context
  • Fill forms - Input data and interact with UI elements
  • Extract data - Scrape information from web pages

Before using browser automation, you need:

  1. Google Chrome installed on your system
  2. Chrome started with remote debugging enabled

On macOS, start Chrome with remote debugging enabled:

open -n -a "Google Chrome" --args --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome

This opens a new Chrome instance with:

  • --remote-debugging-port=9222 - Enables CDP on port 9222
  • --user-data-dir=/tmp/cosine-chrome - Isolated profile for Cosine
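
The two flags can also be assembled programmatically, which helps when the port or profile directory varies between runs. The following is a minimal sketch rather than part of Cosine: `chrome_command` is a hypothetical helper, and the default binary path is the macOS path used elsewhere in this guide.

```python
import shlex

# Default macOS binary path (the same path used for direct invocation in this guide).
MACOS_CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


def chrome_command(port=9222, profile_dir="/tmp/cosine-chrome",
                   binary=MACOS_CHROME, headless=False):
    """Return the argv list for a Chrome instance with CDP enabled.

    Hypothetical helper: it only builds the command and does not launch Chrome.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    argv = [
        binary,
        f"--remote-debugging-port={port}",  # enables CDP on this port
        f"--user-data-dir={profile_dir}",   # isolated profile for Cosine
    ]
    if headless:
        argv.append("--headless")
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(chrome_command()))
```

Returning an argv list instead of a single string avoids quoting problems with the space in the macOS binary path. The list can be passed directly to `subprocess.Popen`.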

Add the CDP URL to your ~/.cosine.toml:

[browser]
cdp_url = "http://localhost:9222"

Now browser automation will be available in every session.

Override the config per session:

cos start --cdp-url http://localhost:9222

Or for one-shot tasks:

cos start --prompt "Go to example.com and take a screenshot" --cdp-url http://localhost:9222

Once configured, Cosine automatically detects the browser and exposes browser automation tools. The AI will suggest using the browser when relevant.

Taking Screenshots:

You: Take a screenshot of my deployed app at https://myapp.com
Cosine: I'll navigate to your app and capture a screenshot.
[Uses browser_navigate and browser_screenshot tools]

Web Scraping:

You: Extract all the product names from https://shop.example.com/products
Cosine: I'll visit the page and extract the product information for you.
[Uses browser_navigate and browser_get_text tools]

Form Interaction:

You: Test the login form on https://myapp.com/login
Cosine: I'll navigate to the login page and test the form.
[Uses browser_navigate, browser_fill, browser_click tools]

When browser automation is enabled, Cosine has access to these tools:

Tool                 Description
browser_navigate     Navigate to a URL
browser_screenshot   Capture a screenshot
browser_click        Click on an element
browser_fill         Fill a form field
browser_evaluate     Execute JavaScript
browser_get_text     Extract text from the page
browser_scroll       Scroll the page
browser_wait         Wait for an element or timeout

All browser actions require your approval by default (unless using --auto-accept). You’ll see:

  • The URL being visited
  • The action being performed
  • Any data being sent

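Using a separate --user-data-dir is recommended because:

  • Cosine’s browser sessions are isolated
  • Cosine has no access to your main Chrome cookies or logins
  • The profile can be reset by deleting the temp directory

Keep the CDP connection itself in mind as well:

  • CDP traffic runs over unencrypted, unauthenticated HTTP and WebSocket connections on localhost, so anyone who can reach the port can control the browser
  • Only use CDP on trusted networks
  • The browser runs with your user permissions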

If Cosine doesn’t detect the browser:

  1. Verify Chrome is running with debugging:

    # A JSON response confirms that CDP is listening on port 9222
    curl http://localhost:9222/json/version
  2. Verify the CDP URL is correct:

    • Use http://localhost:9222 (not ws://)
    • Don’t include /json or other paths
  3. Check for conflicting Chrome instances:

    • Close all Chrome windows
    • Start Chrome with the debugging flags shown above
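
The first two checks above can be combined into one script: reduce the address to the `http://host:port` form, then request Chrome's `/json/version` endpoint. This is a minimal sketch using only the Python standard library. `normalize_cdp_url` and `probe_cdp` are hypothetical helper names and are not part of Cosine.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def normalize_cdp_url(url):
    """Reduce a CDP address to scheme://host:port, mapping ws:// to http://."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https", "ws", "wss") or not parts.netloc:
        raise ValueError(f"not a CDP address: {url!r}")
    scheme = {"ws": "http", "wss": "https"}.get(parts.scheme, parts.scheme)
    return f"{scheme}://{parts.netloc}"  # path such as /json is dropped


def probe_cdp(url, timeout=2.0):
    """Return the /json/version payload as a dict, or None if unreachable."""
    endpoint = normalize_cdp_url(url) + "/json/version"
    try:
        with urllib.request.urlopen(endpoint, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply
        return None


if __name__ == "__main__":
    info = probe_cdp("http://localhost:9222")
    if info is None:
        print("No CDP endpoint found; is Chrome running with --remote-debugging-port?")
    else:
        print("Connected to", info.get("Browser"))
```

A `None` result corresponds to the "connection refused" case: Chrome is not running, is listening on a different port, or is blocked by a firewall.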

If you see “connection refused” errors:

  1. Make sure Chrome is actually running
  2. Verify the port matches (default is 9222)
  3. Check firewall settings

If pages fail to load:

  1. Verify network connectivity
  2. Check if the URL is accessible in regular Chrome
  3. Look for SSL certificate issues

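If Chrome is not on your PATH or is installed in a non-standard location, launch the binary by its full path (the default macOS path is shown; adjust it to your install):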

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome

You can run multiple Chrome instances on different ports:

# "chrome" stands for your Chrome binary (for example, the full path shown above)
# First instance on port 9222
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome-1
# Second instance on port 9223
chrome --remote-debugging-port=9223 --user-data-dir=/tmp/cosine-chrome-2

Then configure Cosine:

# Use first instance
cos start --cdp-url http://localhost:9222
# Use second instance
cos start --cdp-url http://localhost:9223
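
When ports 9222 and 9223 may already be in use, the port numbers can be chosen at run time instead. This is a minimal sketch that asks the operating system for unused ports and pairs each one with its own profile directory and CDP URL. `free_ports` and `plan_instances` are hypothetical helpers, and the profile naming follows the `/tmp/cosine-chrome-N` pattern above.

```python
import socket


def free_ports(count, host="127.0.0.1"):
    """Return `count` distinct TCP ports that are unused on `host` right now."""
    sockets = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets.append(sock)
            sock.bind((host, 0))  # port 0 lets the OS pick a free port
        # All sockets stay bound until here, so the ports cannot repeat.
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()


def plan_instances(count):
    """Return (port, profile_dir, cdp_url) tuples for isolated Chrome instances."""
    return [
        (port, f"/tmp/cosine-chrome-{index}", f"http://localhost:{port}")
        for index, port in enumerate(free_ports(count), start=1)
    ]


if __name__ == "__main__":
    for port, profile_dir, cdp_url in plan_instances(2):
        print(f"--remote-debugging-port={port} --user-data-dir={profile_dir}  ->  {cdp_url}")
```

The ports are released before Chrome starts, so another process could claim one in the meantime. For a handful of local instances this race is rarely a problem.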

For servers or CI/CD environments, run Chrome in headless mode (as before, "chrome" stands for your Chrome binary):

chrome --headless --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome

You can combine browser automation with MCP servers. For example, this filesystem server configuration gives Cosine access to a screenshots folder:

{
  "mcpServers": {
    "filesystem": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/pz/screenshots"]
    }
  }
}

Now you can:

You: Take a screenshot of the dashboard and save it to my screenshots folder
Cosine: I'll capture the screenshot and save it using the filesystem MCP server.

Always start Chrome with debugging enabled before launching Cosine:

# Terminal 1: Start Chrome
open -n -a "Google Chrome" --args --remote-debugging-port=9222 --user-data-dir=/tmp/cosine-chrome
# Terminal 2: Start Cosine
cos start

Keep Cosine’s browser sessions separate from your main browsing:

# Use a dedicated user data directory
--user-data-dir=/tmp/cosine-chrome

Clean up the Chrome instance when finished:

# Find and kill the Chrome process
pkill -f "remote-debugging-port=9222"
# Then remove the user data directory to reset the profile
rm -rf /tmp/cosine-chrome

Browser automation can be resource-intensive:

  • Chrome uses significant RAM
  • Close unused tabs
  • Kill the Chrome process when done

More example requests:

You: Compare the current homepage design with the staging site
Cosine: I'll take screenshots of both and compare them.

You: Extract pricing information from 5 competitor websites
Cosine: I'll visit each site and extract their pricing tables.

You: Test the signup flow on our website
Cosine: I'll navigate through the signup process and report any issues.