CodeAgent

The CodeAgent provides a Jupyter notebook-like interface where the LLM writes Python code that gets executed in a persistent namespace with browser control functions available.

Overview

Unlike the standard Agent, which uses predefined actions, CodeAgent lets the LLM write and execute Python code directly. This gives more flexibility for complex automation tasks that require custom logic, data processing, or multi-step workflows.

Basic Example

from browser_use import ChatBrowserUse
from browser_use.code_use.service import CodeAgent
import asyncio

async def main():
    agent = CodeAgent(
        task="Go to news.ycombinator.com and extract the top 5 story titles",
        llm=ChatBrowserUse(),
    )
    
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())

Constructor

task

str

required

The task description for the agent to complete.

llm

BaseChatModel | None

The language model to use. If not provided, defaults to ChatBrowserUse(). Note: CodeAgent currently only works with ChatBrowserUse.

browser

BrowserSession | None

Browser session instance. If not provided, a new browser will be created automatically.

tools

Tools | None

Custom tools registry. If not provided, uses the default CodeAgentTools(), which includes browser control functions.

page_extraction_llm

BaseChatModel | None

Separate LLM for page content extraction. Useful when a faster or cheaper model is sufficient for extraction.

file_system

FileSystem | None

File system instance for file operations. Defaults to FileSystem(base_dir='./').

available_file_paths

list[str] | None

List of file paths the agent can access.

sensitive_data

dict[str, str | dict[str, str]] | None

Dictionary containing sensitive data that should be handled carefully.

max_steps

int

default:"100"

Maximum number of execution steps before terminating.

max_failures

int

default:"8"

Maximum consecutive errors before auto-termination.

max_validations

int

default:"0"

Maximum number of times to run the validator agent to verify task completion. With the default of 0, the validator is never run.

use_vision

bool

default:"True"

Whether to include screenshots in LLM messages for visual context.

calculate_cost

bool

default:"False"

Whether to calculate and track token costs.

demo_mode

bool | None

Enable the in-browser demo panel for live logging and visualization.

Methods

run

Execute the agent to complete the task.

result = await agent.run(max_steps=100)

max_steps

int | None

Optional override for maximum number of steps. Uses constructor value if not provided.

Returns: NotebookSession - A notebook session containing all executed code cells, outputs, and browser states.

close

Close the browser session.

await agent.close()

How CodeAgent Differs from Agent

Agent (Standard)

Uses predefined actions (click, type, navigate, etc.)
LLM selects actions from a fixed set
Structured, predictable behavior
Best for straightforward automation tasks

CodeAgent

LLM writes custom Python code
Full programming flexibility
Access to Python libraries (json, csv, re, etc.)
Persistent namespace across cells
Best for complex logic, data processing, custom workflows

Available Functions in Namespace

The CodeAgent executes code in a namespace with browser control functions available:

Browser Navigation

await navigate("https://example.com")
await back()
await refresh()

Element Interaction

await click(index)        # Click element by index
await type_text(text)     # Type into focused element
await input_text(index, text)  # Click element then type
await scroll(direction="down", amount=500)

Information Retrieval

state = await get_state()  # Get browser state with elements
title = await get_title()
html = await get_html()
text = await get_text(index)

JavaScript Execution

result = await evaluate("document.title")

Task Completion

await done(result="Task completed", success=True)

The done() function must be called to mark the task as complete. Without it, the agent will continue until max_steps is reached.

Code Execution Features

Persistent Namespace

Variables persist across code cells, just like Jupyter notebooks:

# Cell 1
data = []

# Cell 2 (can access data from Cell 1)
data.append({"title": "Example"})

# Cell 3
print(len(data))  # Prints: 1
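
The persistence described above can be reproduced with plain Python by executing every cell against the same dictionary. This is a minimal sketch of the general technique, not CodeAgent's actual implementation; `run_cell` and `namespace` are hypothetical names chosen for illustration.

```python
# Hypothetical stand-in for a notebook-style executor; not CodeAgent's real code.
namespace: dict = {}


def run_cell(source: str) -> None:
    """Execute one cell; any names it defines stay in the shared namespace."""
    exec(source, namespace)


run_cell("data = []")                          # Cell 1
run_cell('data.append({"title": "Example"})')  # Cell 2 sees `data` from Cell 1
run_cell("count = len(data)")                  # Cell 3

print(namespace["count"])  # prints 1
```

Because all three cells share one dictionary, a name bound in an earlier cell is visible to every later one, which is the behavior the cells above rely on.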

Top-Level Await

You can use await at the top level without wrapping in async functions:

# This works in CodeAgent
result = await navigate("https://example.com")
state = await get_state()
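
For readers curious how top-level await can work at all, the standard library exposes a compile flag for it. The sketch below shows that general mechanism under stated assumptions: `run_cell` and `fake_navigate` are hypothetical helpers, and nothing here claims to be how CodeAgent itself is built.

```python
import ast
import asyncio
import inspect


async def run_cell(source: str, namespace: dict) -> None:
    """Compile a cell so that a bare `await` is legal, then run it in `namespace`."""
    code = compile(source, "<cell>", "exec", flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT)
    result = eval(code, namespace)
    # When the cell contains a top-level await, eval() hands back a coroutine.
    if inspect.iscoroutine(result):
        await result


async def fake_navigate(url: str) -> str:
    """Hypothetical stand-in for the real navigate() helper."""
    await asyncio.sleep(0)
    return f"visited {url}"


async def demo() -> dict:
    namespace = {"navigate": fake_navigate}
    await run_cell('result = await navigate("https://example.com")', namespace)
    await run_cell("summary = result.upper()", namespace)  # plain cell, no await
    return namespace


namespace = asyncio.run(demo())
print(namespace["summary"])
```

The same `run_cell` handles cells with and without `await`, so the caller does not need to know in advance which kind it is running.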

Multiple Code Block Types

The LLM can generate multiple types of code blocks:

```python
# Python code gets executed
data = await extract("Get product names")
```

```js
// JavaScript code is stored in namespace as 'js' variable
document.querySelectorAll('.product')
```

```bash
# Bash code stored as 'bash' variable
ls -la
```

Non-Python blocks are injected as string variables in the namespace for reference.
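
One way to picture the handling described above is a small parser that splits a response into fenced blocks, executes the Python ones, and stores the rest as strings keyed by language. This is an illustrative sketch only; `split_blocks`, the regular expression, and the sample response are assumptions, not CodeAgent's parser.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself sit in a fence
BLOCK_RE = re.compile(FENCE + r"(\w+)\n(.*?)" + FENCE, re.DOTALL)


def split_blocks(response: str) -> tuple[list[str], dict[str, str]]:
    """Return (python_sources, other_blocks_by_language)."""
    python_sources: list[str] = []
    other_blocks: dict[str, str] = {}
    for lang, body in BLOCK_RE.findall(response):
        if lang == "python":
            python_sources.append(body)
        else:
            other_blocks[lang] = body.strip()
    return python_sources, other_blocks


# Hypothetical LLM response containing one js block and one python block.
response = (
    f"{FENCE}js\ndocument.querySelectorAll('.product').length\n{FENCE}\n"
    f"{FENCE}python\ncount = len(js)\n{FENCE}\n"
)
python_sources, other_blocks = split_blocks(response)

namespace = dict(other_blocks)   # non-Python blocks become string variables
for source in python_sources:
    exec(source, namespace)      # Python blocks are executed

print(namespace["js"])
```

In this sketch the Python block can read the `js` string directly, which mirrors the idea of passing it on to a function such as evaluate().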

Result Object

The run() method returns a NotebookSession object:

result = await agent.run()

# Access notebook cells
for cell in result.cells:
    print(f"Cell {cell.execution_count}: {cell.status}")
    print(f"Code: {cell.source}")
    print(f"Output: {cell.output}")
    print(f"Error: {cell.error}")

# Get complete history with metadata
history = result._complete_history  # List[CodeAgentHistory]

# Get usage summary
usage = result._usage_summary
print(f"Total tokens: {usage.total_tokens}")
print(f"Total cost: {usage.total_cost}")

NotebookSession Properties

cells - List of executed code cells with outputs
execution_count - Current execution count
_complete_history - Complete execution history with metadata
_usage_summary - Token usage and cost summary

Cell Properties

source - The Python code that was executed
output - Captured stdout output
error - Error message if execution failed
status - Execution status: SUCCESS, ERROR, or RUNNING
execution_count - Cell execution number
browser_state - Browser state text at time of execution
cell_type - Cell type: CODE or MARKDOWN
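
The cell properties listed above make it straightforward to post-process a run. The helper below is a small sketch that collects failed cells; it relies only on the documented attribute names `execution_count` and `error`, and it uses `SimpleNamespace` objects as hypothetical stand-ins for real cells so that it can run without a browser.

```python
from types import SimpleNamespace


def failed_cells(cells) -> list[tuple[int, str]]:
    """Collect (execution_count, error) for every cell that reports an error."""
    return [(c.execution_count, c.error) for c in cells if c.error]


# Hypothetical stand-ins for real cells; only documented attribute names are used.
cells = [
    SimpleNamespace(execution_count=1, source="x = 1", output="", error=None),
    SimpleNamespace(
        execution_count=2,
        source="1 / 0",
        output="",
        error="ZeroDivisionError: division by zero",
    ),
]

for count, error in failed_cells(cells):
    print(f"Cell {count} failed: {error}")
```

With a real run, `failed_cells(result.cells)` would presumably work the same way, since it only reads those two attributes.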

Advanced Examples

Data Extraction with Processing

async def extract_and_process():
    agent = CodeAgent(
        task="""
        1. Go to news.ycombinator.com
        2. Extract the top 10 story titles and URLs
        3. Filter for stories about AI
        4. Save to ai_stories.json
        """,
        llm=ChatBrowserUse(),
    )
    
    result = await agent.run(max_steps=50)
    return result

The agent will write code similar to:

import json

# Navigate and extract
await navigate("https://news.ycombinator.com")
stories = await extract("Get top 10 story titles and URLs as JSON array")

# Process with Python
stories = json.loads(stories)
ai_stories = [s for s in stories if 'ai' in s['title'].lower()]

# Save to file
with open('ai_stories.json', 'w') as f:
    json.dump(ai_stories, f, indent=2)

await done(f"Found {len(ai_stories)} AI stories", success=True)
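
One caveat about the filter above: a plain substring test for 'ai' also matches words such as "maintain" or "email". If that matters for the task, a word-boundary match is a safer variant. The sketch below compares the two on made-up titles; `is_ai_story` and the sample data are hypothetical.

```python
import re

AI_WORD = re.compile(r"\bai\b", re.IGNORECASE)


def is_ai_story(title: str) -> bool:
    """True only when 'AI' appears as a standalone word in the title."""
    return AI_WORD.search(title) is not None


# Made-up titles for illustration.
stories = [
    {"title": "New AI model beats benchmark"},
    {"title": "How to maintain a legacy codebase"},
    {"title": "Email deliverability tips"},
]

substring_hits = [s for s in stories if "ai" in s["title"].lower()]
word_hits = [s for s in stories if is_ai_story(s["title"])]

print(len(substring_hits), len(word_hits))  # prints: 3 1
```

Stating the matching rule in the task description (for example, "titles containing the word AI") makes it more likely the generated code uses the stricter form.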

Form Filling with Validation

async def fill_form():
    agent = CodeAgent(
        task="""
        Fill out the contact form at https://example.com/contact:
        - Name: John Doe
        - Email: john@example.com
        - Message: Test message
        Then verify the success message appears.
        """,
        llm=ChatBrowserUse(),
        sensitive_data={
            "email": "john@example.com"
        }
    )
    
    result = await agent.run()
    return result

Multi-Page Workflow

async def complex_workflow():
    agent = CodeAgent(
        task="""
        1. Search for 'Python tutorials' on Google
        2. Open the first 3 results in new tabs
        3. Extract the main heading from each page
        4. Create a summary comparing the three tutorials
        """,
        llm=ChatBrowserUse(),
        use_vision=True,  # Use screenshots for better understanding
    )
    
    result = await agent.run(max_steps=100)
    return result

Error Handling

Consecutive Error Limit

The agent tracks consecutive errors and terminates after max_failures (default: 8):

agent = CodeAgent(
    task="Your task",
    llm=ChatBrowserUse(),
    max_failures=5,  # Terminate after 5 consecutive errors
)
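
The word "consecutive" matters here: a successful step in between resets the count. The loop below sketches that counting logic in isolation. It is an assumption about the general pattern, not CodeAgent's source; `run_steps`, `ok`, and `fail` are hypothetical, and the exact threshold comparison in the library may differ.

```python
def run_steps(steps, max_failures: int = 8) -> str:
    """Run callables in order; stop after `max_failures` errors in a row."""
    consecutive_failures = 0
    for step in steps:
        try:
            step()
        except Exception:
            consecutive_failures += 1
            if consecutive_failures >= max_failures:
                return "terminated"
        else:
            consecutive_failures = 0  # any success resets the streak
    return "finished"


def ok() -> None:
    pass


def fail() -> None:
    raise RuntimeError("step failed")


print(run_steps([fail, fail, ok, fail, fail], max_failures=3))  # prints: finished
print(run_steps([fail, fail, fail, ok], max_failures=3))        # prints: terminated
```

In the first call four steps fail in total, but never three in a row, so the run is allowed to finish.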

Validation

Enable task completion validation to ensure the agent actually completed the task:

agent = CodeAgent(
    task="Extract product data",
    llm=ChatBrowserUse(),
    max_validations=2,  # Validate up to 2 times
)

If validation fails, the agent receives feedback and continues working.
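
The validate-then-retry behavior can be pictured as a bounded loop. The sketch below is purely illustrative: `run_with_validation`, `work`, and `validate` are hypothetical callables, and the real validator agent's interface is not documented here.

```python
def run_with_validation(work, validate, max_validations: int):
    """Do the work, then validate up to `max_validations` times, retrying on failure."""
    feedback = None
    result = work(feedback)
    for _ in range(max_validations):
        passed, feedback = validate(result)
        if passed:
            break
        result = work(feedback)  # continue working with the validator's feedback
    return result


# Hypothetical worker: produces a better result once it has received feedback.
def work(feedback):
    return "draft" if feedback is None else "final"


# Hypothetical validator: accepts only the improved result.
def validate(result):
    return result == "final", "add the missing fields"


print(run_with_validation(work, validate, max_validations=2))  # prints: final
print(run_with_validation(work, validate, max_validations=0))  # prints: draft
```

With `max_validations=0` the loop body never runs, so the first result is returned unchecked.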

Best Practices

1. Clear Task Descriptions

Be specific about what you want:

# ✅ Good
task = "Go to example.com, click the login button, enter credentials, and verify successful login"

# ❌ Too vague
task = "Login to website"

2. Use Sensitive Data Parameter

Keep credentials safe:

agent = CodeAgent(
    task="Login to website",
    sensitive_data={
        "username": "user@example.com",
        "password": "secret123"
    }
)

3. Enable Vision for Visual Tasks

agent = CodeAgent(
    task="Verify the layout looks correct",
    use_vision=True,  # LLM can see screenshots
)

4. Set Appropriate Limits

agent = CodeAgent(
    task="Simple task",
    max_steps=20,        # Prevent excessive execution
    max_failures=3,      # Fail fast on repeated errors
)

5. Track Costs

agent = CodeAgent(
    task="Your task",
    calculate_cost=True,  # Track token usage and costs
)

result = await agent.run()
print(f"Total cost: ${result._usage_summary.total_cost}")

Comparison with Standard Agent

Feature	Agent	CodeAgent
Execution Model	Predefined actions	Custom Python code
Flexibility	Fixed action set	Full Python capabilities
Data Processing	Limited	Full Python libraries
Learning Curve	Easier to understand	Requires Python knowledge
Predictability	More predictable	Less predictable
Use Cases	Standard automation	Complex workflows
LLM Support	Multiple LLMs	ChatBrowserUse only

Troubleshooting

Agent Doesn’t Call done()

Make sure your task is clear about when to finish:

task = "Extract data and call done() with the results"

Variables Not Persisting

Variables should persist automatically. If they don’t, check for:

Syntax errors in code
Scope issues with function definitions
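
The scope issue in the list above is ordinary Python behavior: a name assigned inside a function is local to that function and never reaches the shared namespace. The sketch below demonstrates this with a plain dictionary standing in for the agent's namespace; `collect` is a hypothetical function.

```python
namespace: dict = {}

# First cell: `items` is assigned inside a function, so it stays local.
exec(
    "def collect():\n"
    "    items = ['a', 'b']\n"
    "    return items\n"
    "collect()\n",
    namespace,
)
leaked = "items" in namespace
print(leaked)  # prints: False

# Second cell: binding the return value at the top level makes it persist.
exec("items = collect()", namespace)
print(namespace["items"])  # prints: ['a', 'b']
```

If a later cell reports a missing variable, checking whether it was only ever assigned inside a function is a quick first step.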

Browser State Not Updating

The browser state is fetched before each LLM call. If you need a fresh snapshot in the middle of a cell, call get_state() explicitly:

state = await get_state()  # Force state refresh

Token Limit Errors

If you hit token limits:

Reduce max_steps
Use page_extraction_llm with a smaller model
Disable use_vision if screenshots aren’t needed

Related Documentation

Agent Basics - Standard agent with predefined actions
Browser API - Browser configuration options
Tools - Custom tool development
Going to Production - Deploy agents at scale
