
Overview

ChatOllama provides integration with locally running Ollama models, enabling completely private and offline browser automation without sending data to external APIs.

Basic Usage

from browser_use import Agent, ChatOllama
import asyncio

async def main():
    llm = ChatOllama(model='llama3.2')
    agent = Agent(
        task="Find the number 1 post on Show HN",
        llm=llm,
    )
    await agent.run()

if __name__ == "__main__":
    asyncio.run(main())

Prerequisites

  1. Install Ollama: Download from ollama.com
  2. Pull a model: ollama pull llama3.2
  3. Start Ollama: It runs automatically after installation

# Pull recommended models
ollama pull llama3.2
ollama pull llama3.2:70b
ollama pull qwen2.5-coder:32b

# Verify Ollama is running
curl http://localhost:11434

Configuration

Required Parameters

model (str, required)
Ollama model name. Popular options:
  • llama3.2: Fast and capable
  • llama3.2:70b: More powerful
  • qwen2.5-coder:32b: Great for web tasks
  • mistral: Alternative option
  • codellama: Coding focused
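
Any tag installed locally can be passed straight through as the model name. A minimal sketch (it assumes qwen2.5-coder:32b has already been pulled with ollama pull):

from browser_use import ChatOllama

# The model string is simply the local Ollama tag, as shown by `ollama list`
llm = ChatOllama(model='qwen2.5-coder:32b')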

Client Parameters

host (str, default: None)
Ollama server URL. Defaults to http://localhost:11434.

timeout (float, default: None)
Request timeout in seconds.

client_params (dict, default: None)
Additional parameters passed to the Ollama client (see the sketch after this section).

ollama_options (Options, default: None)
Ollama-specific options for model behavior. Common options:
  • temperature: Sampling temperature
  • num_predict: Max tokens to generate
  • top_k: Top-K sampling
  • top_p: Top-P sampling
  • repeat_penalty: Repetition penalty
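
The Advanced Usage sections below show host, timeout, and ollama_options in context; client_params is not demonstrated elsewhere, so here is a minimal sketch. It assumes the dictionary is forwarded to the underlying Ollama client and that a headers entry is accepted there; the header itself is a placeholder, not a documented example:

from browser_use import ChatOllama

llm = ChatOllama(
    model='llama3.2',
    # Assumption: client_params is passed through to the Ollama client,
    # and 'headers' is accepted there (e.g. for a reverse proxy token).
    client_params={'headers': {'Authorization': 'Bearer <token>'}},
)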

Advanced Usage

Custom Ollama Host

from browser_use import Agent, ChatOllama

# Connect to remote Ollama instance
llm = ChatOllama(
    model='llama3.2',
    host='http://192.168.1.100:11434',
)

agent = Agent(task="Your task", llm=llm)

With Ollama Options

from browser_use import Agent, ChatOllama
from ollama import Options

llm = ChatOllama(
    model='llama3.2',
    ollama_options=Options(
        temperature=0.7,
        num_predict=2048,
        top_k=40,
        top_p=0.9,
        repeat_penalty=1.1,
    ),
)

agent = Agent(task="Your task", llm=llm)

Structured Output

from browser_use import Agent, ChatOllama
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    description: str
    url: str

llm = ChatOllama(model='llama3.2')

agent = Agent(
    task="Extract search result",
    llm=llm,
    output_model_schema=SearchResult,
)

result = await agent.run()
print(result.structured_output)  # SearchResult instance

Custom Timeout for Large Models

from browser_use import Agent, ChatOllama

llm = ChatOllama(
    model='llama3.2:70b',
    timeout=300.0,  # 5 minutes for large model
)

agent = Agent(task="Complex task", llm=llm)

Using Dictionary Options

from browser_use import Agent, ChatOllama

llm = ChatOllama(
    model='qwen2.5-coder:32b',
    ollama_options={
        'temperature': 0.2,
        'num_predict': 4096,
        'top_p': 0.95,
    },
)

agent = Agent(task="Your task", llm=llm)

Setup Guide

macOS

# Install Ollama
brew install ollama

# Or download the macOS app from ollama.com

# Start service
ollama serve

# Pull model
ollama pull llama3.2

Linux

# Install
curl -fsSL https://ollama.com/install.sh | sh

# Start service (usually auto-starts)
sudo systemctl start ollama

# Pull model
ollama pull llama3.2

Windows

  1. Download installer from ollama.com
  2. Run installer
  3. Open terminal and run: ollama pull llama3.2

Docker

# Run Ollama in Docker
docker run -d \
  --name ollama \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  ollama/ollama

# Pull model
docker exec ollama ollama pull llama3.2
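
With port 11434 published as above, ChatOllama can reach the containerized server through the default host. A small sketch (adjust host if the container runs on a different machine):

from browser_use import ChatOllama

# Port 11434 is published to localhost by the docker run command above
llm = ChatOllama(model='llama3.2', host='http://localhost:11434')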

Error Handling

from browser_use import Agent, ChatOllama
from browser_use.llm.exceptions import ModelProviderError
import httpx

try:
    llm = ChatOllama(model='llama3.2')
    agent = Agent(task="Your task", llm=llm)
    result = await agent.run()
except ModelProviderError as e:
    print(f"Ollama error: {e.message}")
    print("Make sure Ollama is running: ollama serve")
except httpx.ConnectError:
    print("Cannot connect to Ollama. Is it running?")
    print("Start with: ollama serve")

Properties

provider

Returns the provider name: "ollama"

llm = ChatOllama(model='llama3.2')
print(llm.provider)  # "ollama"

name

Returns the model name.

llm = ChatOllama(model='llama3.2')
print(llm.name)  # "llama3.2"

Methods

get_client()

Returns an OllamaAsyncClient instance.

llm = ChatOllama(model='llama3.2')
client = llm.get_client()
# Use client directly for advanced operations
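
For example, the client can be used to query the Ollama server directly. This sketch assumes the standard ollama AsyncClient API, where list() reports the locally installed models:

import asyncio
from browser_use import ChatOllama

async def show_models():
    llm = ChatOllama(model='llama3.2')
    client = llm.get_client()
    # list() is part of the ollama AsyncClient API, not Browser Use
    models = await client.list()
    print(models)  # installed models as reported by the server

asyncio.run(show_models())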

ainvoke()

Asynchronously invoke the model with messages.

from browser_use.llm.messages import SystemMessage, UserMessage

llm = ChatOllama(model='llama3.2')

messages = [
    SystemMessage(content="You are a helpful assistant"),
    UserMessage(content="What is Browser Use?")
]

response = await llm.ainvoke(messages)
print(response.completion)  # String response

Parameters

  • messages (list[BaseMessage]): List of messages
  • output_format (type[T] | None): Optional Pydantic model for structured output

Returns

ChatInvokeCompletion[T] | ChatInvokeCompletion[str] with:
  • completion: Response content (string or structured output)
  • usage: Currently None for Ollama (not tracked)

Ollama does not currently provide token usage information in responses.
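
Structured output also works at this level. A minimal sketch, assuming output_format is passed as a keyword argument and accepts the same Pydantic models used with output_model_schema on the Agent:

from browser_use import ChatOllama
from browser_use.llm.messages import UserMessage
from pydantic import BaseModel

class CityInfo(BaseModel):
    name: str
    country: str

llm = ChatOllama(model='llama3.2')

# completion is parsed into a CityInfo instance when output_format is set
response = await llm.ainvoke(
    [UserMessage(content="Describe Paris as JSON")],
    output_format=CityInfo,
)
print(response.completion)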

Model Recommendations

For Speed

  • llama3.2 (3B): Fast, good quality
  • qwen2.5-coder (7B): Great for web tasks
  • mistral (7B): Balanced performance

For Quality

  • llama3.2:70b: Best quality, slower
  • qwen2.5-coder:32b: Excellent for browser automation
  • mixtral:8x7b: High quality mixture of experts

For Resource-Constrained

  • llama3.2:3b: Very fast on CPU
  • phi3: Microsoft’s efficient model
  • tinyllama: Minimal resource usage

# Check model sizes
ollama list

# Remove unused models
ollama rm model-name

Performance Tips

  1. GPU Acceleration: Ollama automatically uses GPU if available
  2. Model Size: Smaller models are faster but less capable
  3. num_predict: Limit output tokens for faster responses
  4. Preload Models: Models load faster after first use

# Optimize for speed
llm = ChatOllama(
    model='llama3.2',
    ollama_options={
        'num_predict': 512,  # Limit output length
        'num_ctx': 2048,     # Smaller context window
    },
)

Troubleshooting

Ollama Not Running

# Check if Ollama is running
curl http://localhost:11434

# Start Ollama
ollama serve

# Or on Linux with systemd
sudo systemctl start ollama
sudo systemctl status ollama

Model Not Found

# List installed models
ollama list

# Pull missing model
ollama pull llama3.2

Connection Refused

# Verify correct host
llm = ChatOllama(
    model='llama3.2',
    host='http://localhost:11434',  # Default
)

Slow Performance

# Use smaller model
ollama pull llama3.2:3b

# Check GPU usage
nvidia-smi  # For NVIDIA GPUs

# Reduce context size via ollama_options (num_ctx), as shown in Performance Tips

Benefits of Ollama

  1. Privacy: All data stays on your machine
  2. No API Costs: Free to use
  3. Offline Capable: Works without internet
  4. Fast: Low latency on local hardware
  5. Customizable: Full control over models and parameters

Limitations

  1. No Usage Tracking: Token counts not available
  2. Hardware Dependent: Performance varies by hardware
  3. Model Quality: May not match GPT-4 or Claude for complex tasks
  4. Setup Required: Need to install and manage Ollama