Overview
ChatOllama provides integration with locally running Ollama models, enabling completely private and offline browser automation without sending data to external APIs.
Basic Usage
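A minimal sketch, assuming `ChatOllama` is importable from `browser_use.llm` and Ollama is already serving on the default port:

```python
import asyncio

from browser_use import Agent
from browser_use.llm import ChatOllama  # import path assumed

# Connects to the local Ollama server at http://localhost:11434 by default
llm = ChatOllama(model="llama3.2")

agent = Agent(
    task="Find the top story on Hacker News",
    llm=llm,
)

asyncio.run(agent.run())
```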
Prerequisites
- Install Ollama: Download from ollama.com
- Pull a model: `ollama pull llama3.2`
- Start Ollama: It runs automatically after installation
Configuration
Required Parameters
model (required): The Ollama model name. Popular options:
- llama3.2: Fast and capable
- llama3.1:70b: More powerful
- qwen2.5-coder:32b: Great for web tasks
- mistral: Alternative option
- codellama: Coding focused
Client Parameters
host: Ollama server URL. Defaults to http://localhost:11434.
timeout: Request timeout in seconds.
client_params: Additional parameters for the Ollama client.
options: Ollama-specific options for model behavior. Common options:
- temperature: Sampling temperature
- num_predict: Max tokens to generate
- top_k: Top-K sampling
- top_p: Top-P sampling
- repeat_penalty: Repetition penalty
Advanced Usage
Custom Ollama Host
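A sketch pointing the client at a non-default server through the `host` parameter described above (the URL is a placeholder):

```python
from browser_use.llm import ChatOllama

# Talk to an Ollama server on another machine instead of localhost
llm = ChatOllama(
    model="llama3.2",
    host="http://192.168.1.100:11434",  # placeholder address
)
```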
With Ollama Options
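A sketch assuming `ChatOllama` forwards an `options` argument to the Ollama client; `Options` is the typed helper from the `ollama` Python package:

```python
from browser_use.llm import ChatOllama
from ollama import Options

# The `options` keyword is an assumption; check the ChatOllama signature
llm = ChatOllama(
    model="llama3.2",
    options=Options(
        temperature=0.3,   # lower = more deterministic actions
        num_predict=1024,  # cap generated tokens for faster steps
        top_k=40,
        top_p=0.9,
    ),
)
```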
Structured Output
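A sketch of structured output through `ainvoke`'s `output_format` parameter (documented under Methods below); the `UserMessage` import path is an assumption:

```python
import asyncio

from pydantic import BaseModel

from browser_use.llm import ChatOllama, UserMessage  # UserMessage path assumed

class Product(BaseModel):
    name: str
    price: float

async def main():
    llm = ChatOllama(model="qwen2.5-coder:32b")
    # output_format asks ainvoke to parse the reply into the Pydantic model
    result = await llm.ainvoke(
        [UserMessage(content="Extract the product from: 'Widget Pro costs $49.99'")],
        output_format=Product,
    )
    print(result.completion)  # e.g. Product(name='Widget Pro', price=49.99)

asyncio.run(main())
```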
Custom Timeout for Large Models
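A sketch raising the request timeout for a large model, using the `timeout` parameter from Client Parameters above:

```python
from browser_use.llm import ChatOllama

# 70B-class models can take minutes per step, especially on CPU,
# so give each request more time before it is aborted
llm = ChatOllama(
    model="llama3.1:70b",
    timeout=300,  # seconds
)
```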
Using Dictionary Options
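The same options can be passed as a plain dict instead of the typed helper (again assuming an `options` keyword):

```python
from browser_use.llm import ChatOllama

llm = ChatOllama(
    model="llama3.2",
    options={
        "temperature": 0.7,
        "num_predict": 2048,
        "repeat_penalty": 1.1,
    },
)
```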
Setup Guide
macOS
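Install the desktop app from ollama.com, or use Homebrew:

```bash
brew install ollama           # CLI and server
brew services start ollama    # run the server in the background
ollama pull llama3.2
```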
Linux
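Use the official install script, which also sets up the background service:

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
```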
Windows
- Download installer from ollama.com
- Run installer
- Open terminal and run: `ollama pull llama3.2`
Docker
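Run the official image, persist downloaded models in a volume, and pull a model inside the container:

```bash
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama3.2
```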
Error Handling
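A defensive sketch around the basic usage above; the exact exception types depend on the underlying Ollama client, so this catches broadly:

```python
import asyncio

from browser_use import Agent
from browser_use.llm import ChatOllama

async def main():
    llm = ChatOllama(model="llama3.2")
    agent = Agent(task="Find the top story on Hacker News", llm=llm)
    try:
        await agent.run()
    except Exception as e:
        # Typical failures: server not running (connection refused),
        # model not pulled yet, or a timeout on a large model
        print(f"Ollama call failed: {e}")

asyncio.run(main())
```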
Properties
provider
Returns the provider name: "ollama".
name
Returns the model name.
Methods
get_client()
Returns an OllamaAsyncClient instance.
ainvoke()
Asynchronously invoke the model with messages.
Parameters
- messages (list[BaseMessage]): List of messages
- output_format (type[T] | None): Optional Pydantic model for structured output
Returns
ChatInvokeCompletion[T] | ChatInvokeCompletion[str] with:
- completion: Response content (string or structured output)
- usage: Currently None for Ollama (not tracked)
Ollama does not currently provide token usage information in responses.
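A minimal `ainvoke` sketch (`UserMessage` import path assumed); without `output_format` the completion is a plain string:

```python
import asyncio

from browser_use.llm import ChatOllama, UserMessage  # UserMessage path assumed

async def main():
    llm = ChatOllama(model="llama3.2")
    result = await llm.ainvoke([UserMessage(content="Say hello in one word.")])
    print(result.completion)  # plain string when no output_format is given
    print(result.usage)       # None: Ollama does not report token usage

asyncio.run(main())
```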
Recommended Models
For Speed
- llama3.1 (8B): Fast, good quality
- qwen2.5-coder (7B): Great for web tasks
- mistral (7B): Balanced performance
For Quality
- llama3.1:70b: Best quality, slower
- qwen2.5-coder:32b: Excellent for browser automation
- mixtral:8x7b: High quality mixture of experts
For Resource-Constrained
- llama3.2:3b: Very fast on CPU
- phi3: Microsoft’s efficient model
- tinyllama: Minimal resource usage
Performance Tips
- GPU Acceleration: Ollama automatically uses GPU if available
- Model Size: Smaller models are faster but less capable
- num_predict: Limit output tokens for faster responses
- Preload Models: The first request loads the model into memory; later requests respond much faster
Troubleshooting
Ollama Not Running
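Check whether the server responds, and start it if not:

```bash
curl http://localhost:11434/api/tags   # lists models if the server is up
ollama serve                           # start it in the foreground otherwise
```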
Model Not Found
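List the models already downloaded and pull the missing one:

```bash
ollama list
ollama pull llama3.2
```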
Connection Refused
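Make sure the host ChatOllama points at matches where the server listens; Ollama binds to 127.0.0.1:11434 by default, and the OLLAMA_HOST environment variable changes that:

```bash
# Serve on all interfaces so other machines or containers can connect
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```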
Slow Performance
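Check whether the model actually runs on the GPU; if `ollama ps` reports CPU, switch to a smaller model or cap num_predict:

```bash
ollama ps   # shows loaded models and the GPU/CPU split
```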
Benefits of Ollama
- Privacy: All data stays on your machine
- No API Costs: Free to use
- Offline Capable: Works without internet
- Fast: Low latency on local hardware
- Customizable: Full control over models and parameters
Limitations
- No Usage Tracking: Token counts not available
- Hardware Dependent: Performance varies by hardware
- Model Quality: May not match GPT-4 or Claude for complex tasks
- Setup Required: Need to install and manage Ollama
Related
- ChatBrowserUse - Recommended for production
- ChatOpenAI
- Ollama Documentation