The Agent class is the main entry point for Browser Use. It orchestrates the LLM, browser interactions, and tools to complete tasks autonomously.
Constructor
Parameters
The task description for the agent to complete.
Language model instance. Defaults to ChatBrowserUse() if not provided. See Supported Models.
Browser instance to use. If not provided, a new browser will be created with default settings.
Deprecated: Use browser parameter instead. Alias for backward compatibility.
Registry of tools (actions) the agent can use. If not provided, default tools are loaded. See Tools.
Deprecated: Use tools parameter instead. Alias for backward compatibility.
LLM Configuration
Separate LLM for page content extraction. Use a smaller/faster model for efficiency. Defaults to main llm.
LLM for judging agent trace quality. Defaults to main llm.
Fallback LLM to use if the primary LLM fails.
Vision & Screenshots
Vision mode:
- True: Always include screenshots in LLM context
- False: Never include screenshots, excludes screenshot tool
- 'auto': Include screenshot tool but only use vision when requested
Screenshot detail level for vision models.
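When screenshots are resized to a target (width, height) before being sent to the LLM, any coordinates the model returns refer to the resized image and have to be mapped back to the original viewport. The arithmetic can be sketched as below, assuming simple proportional scaling on each axis. The function name scale_to_viewport is hypothetical and the library's own implementation may differ.

```python
def scale_to_viewport(
    point: tuple[float, float],
    screenshot_size: tuple[int, int],
    viewport_size: tuple[int, int],
) -> tuple[int, int]:
    """Map a point in the resized screenshot back to viewport pixels.

    Assumes independent proportional scaling of the x and y axes.
    """
    x, y = point
    shot_w, shot_h = screenshot_size
    view_w, view_h = viewport_size
    return round(x * view_w / shot_w), round(y * view_h / shot_h)


# A point at the centre of a 1400x850 screenshot of a 1920x1080 viewport
print(scale_to_viewport((700, 425), (1400, 850), (1920, 1080)))  # (960, 540)
```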
Target size (width, height) to resize screenshots before sending to LLM. Coordinates from LLM are automatically scaled back to original viewport size.
Skills Integration
List of skill IDs to enable, or ['*'] for all skills. Skills are pre-built actions from the cloud.
Deprecated: Use skills parameter instead. Alias for backward compatibility.
Pre-configured skill service instance for advanced use cases.
Actions & Behavior
List of actions to execute before starting the main task (without LLM). Format: [{'action_name': {'param': value}}]
Maximum actions the agent can output per step (e.g., for form filling).
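The initial-actions format above, a list of single-key dictionaries mapping an action name to its parameters, can be sketched as follows. The action names and parameters shown (navigate, scroll) are placeholders chosen for illustration, and check_initial_actions is a hypothetical helper, not part of the library. It also assumes each entry carries exactly one action.

```python
from typing import Any

# Hypothetical action names and parameters, for illustration only.
initial_actions: list[dict[str, dict[str, Any]]] = [
    {"navigate": {"url": "https://example.com"}},
    {"scroll": {"down": True}},
]


def check_initial_actions(actions: list) -> list:
    """Raise ValueError unless every entry looks like {action_name: {param: value}}."""
    for i, entry in enumerate(actions):
        if not isinstance(entry, dict) or len(entry) != 1:
            raise ValueError(f"entry {i} must be a dict with exactly one action name")
        name, params = next(iter(entry.items()))
        if not isinstance(name, str) or not isinstance(params, dict):
            raise ValueError(f"entry {i} must map a string name to a dict of parameters")
    return actions


check_initial_actions(initial_actions)
```

Validating the shape up front keeps a malformed entry from surfacing only once the browser is already running.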
Maximum consecutive failures before stopping.
If True, agent attempts one final recovery call after reaching max_failures.
Enable explicit reasoning steps in agent output.
Fast mode that skips evaluation, planning, and thinking. Overrides use_thinking and enable_planning when enabled.
If True, automatically navigate to URLs detected in the task.
Planning
Enable agent planning with step-by-step todo items.
Number of consecutive failures before suggesting plan revision. Set to 0 to disable.
Number of steps without a plan before nudging agent to create one. Set to 0 to disable.
Loop Detection
Enable detection of repetitive action patterns.
Rolling window size for tracking action similarity.
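To illustrate what a rolling window over recent actions means, here is a minimal sketch of one way repetition could be detected. It is not the library's implementation: RepeatDetector, its threshold parameter, and the use of exact matching (rather than a similarity measure) are all assumptions made to keep the example small.

```python
import json
from collections import Counter, deque


class RepeatDetector:
    """Flag an action that recurs too often within the last `window_size` actions."""

    def __init__(self, window_size: int = 10, threshold: int = 3) -> None:
        # deque(maxlen=...) silently drops the oldest entry once the window is full
        self.window: deque = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, action: dict) -> bool:
        """Add an action; return True if it now appears `threshold` or more times."""
        # Serialise with sorted keys so equal dicts produce equal signatures
        signature = json.dumps(action, sort_keys=True)
        self.window.append(signature)
        return Counter(self.window)[signature] >= self.threshold


detector = RepeatDetector(window_size=4, threshold=3)
click = {"click": {"index": 5}}  # hypothetical action payload
print([detector.record(click) for _ in range(3)])  # [False, False, True]
```

A larger window catches slower cycles at the cost of more false positives on tasks that legitimately repeat an action.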
System Messages
Completely replace the default system prompt.
Add additional instructions to the default system prompt.
File & Data Management
Directory path to save conversation history.
Encoding for saved conversations.
List of file paths the agent can access for upload actions.
Path for agent’s file system operations.
Show file information in completion messages.
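The sensitive-data dictionary in this section accepts two shapes: a flat {key: value} mapping, or a {domain: {key: value}} mapping that scopes secrets to a site. The sketch below shows both shapes side by side. The keys, values, and domains are placeholders, and secrets_for is a hypothetical helper that assumes flat entries apply everywhere and scoped entries apply only on an exact domain match, which may be simpler than the library's matching rules.

```python
# Placeholder names and values for illustration; never hard-code real credentials.
flat = {"x_user": "alice", "x_pass": "s3cret"}
scoped = {
    "https://example.com": {"x_user": "alice", "x_pass": "s3cret"},
    "https://bank.example": {"pin": "0000"},
}


def secrets_for(sensitive_data: dict, domain: str) -> dict:
    """Return the key/value pairs that apply to `domain`.

    Assumption: flat entries always apply; nested entries apply only when
    their domain key equals `domain` exactly.
    """
    result: dict = {}
    for key, value in sensitive_data.items():
        if isinstance(value, dict):
            if key == domain:
                result.update(value)
        else:
            result[key] = value
    return result


print(secrets_for(scoped, "https://bank.example"))  # {'pin': '0000'}
```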
Dictionary of sensitive data to handle securely. Format: {key: value} or {domain: {key: value}}.
Output Format
Pydantic model class for structured output validation. See Custom Output.
JSON schema for data extraction. Auto-detected from output_model_schema if not provided.
Visual Output
Generate GIF of agent actions. Set to True or a file path string.
List of HTML attributes to include in DOM analysis.
Performance & Limits
Maximum number of recent steps to keep in LLM memory. None keeps all steps.
Timeout in seconds for LLM calls. Auto-detected based on model.
Timeout in seconds for each agent step.
Compact old messages to reduce prompt size. Set to False to disable or provide MessageCompactionSettings for custom configuration.
Maximum characters for clickable elements in prompt.
Judge & Validation
Enable post-execution judge to validate task completion.
Ground truth answer for judge validation.
Cloud Callbacks
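The step and completion callbacks in this section may be plain functions or coroutine functions, as the None | Awaitable[None] return type in their signatures indicates. A minimal sketch of how a caller can support both styles uniformly follows. invoke_callback and the two example callbacks are hypothetical, and the state and output arguments are typed as Any rather than the library's BrowserStateSummary and AgentOutput so the example runs on its own.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Union

Callback = Callable[..., Union[None, Awaitable[None]]]

seen: list[int] = []  # step numbers observed by the example callbacks


async def invoke_callback(callback: Callback, *args: Any) -> None:
    """Call `callback` and await its result only if it is awaitable."""
    result = callback(*args)
    if inspect.isawaitable(result):
        await result


def on_step_sync(state: Any, output: Any, step: int) -> None:
    seen.append(step)


async def on_step_async(state: Any, output: Any, step: int) -> None:
    await asyncio.sleep(0)  # stand-in for real async work such as a network call
    seen.append(step)


async def demo() -> list[int]:
    seen.clear()
    await invoke_callback(on_step_sync, None, None, 1)
    await invoke_callback(on_step_async, None, None, 2)
    return list(seen)
```

Running asyncio.run(demo()) exercises both callback styles through the same helper.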
Callback function called after each step. Signature: (BrowserStateSummary, AgentOutput, int) -> None | Awaitable[None]
Callback function called when agent completes. Signature: (AgentHistoryList) -> None | Awaitable[None]
Callback to check if agent should stop. Returns True to stop.
Callback to check external agent status. Raises InterruptedError if it returns True.
Advanced Options
Calculate and track API token costs.
Include tool usage examples in system prompt.
Include recent browser events in context.
Sample images to include in prompts for vision models.
Enable demo mode with browser overlay UI.
Custom task ID. Auto-generated if not provided.
Pre-existing agent state for resuming sessions.
Source identifier for telemetry.
Methods
run()
Execute the agent to complete the task.
Maximum number of steps the agent can take.
Complete execution history with results, screenshots, and metadata.
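A minimal sketch of calling run() with a step limit follows. It assumes the keyword names task, llm, and max_steps, which do not appear verbatim next to their descriptions on this page, so check them against the installed version. run_agent is a hypothetical wrapper, and the task string is a placeholder.

```python
import asyncio
from typing import Any


async def run_agent(agent: Any, max_steps: int = 50) -> Any:
    """Run `agent` for at most `max_steps` steps and return its history."""
    # Assumption: run() is a coroutine accepting a `max_steps` keyword
    history = await agent.run(max_steps=max_steps)
    return history


async def main() -> None:
    # Imported lazily so run_agent can be exercised without the library installed
    from browser_use import Agent, ChatBrowserUse

    agent = Agent(task="Find the title of example.com", llm=ChatBrowserUse())
    history = await run_agent(agent, max_steps=20)
    print(history)
```

To execute the sketch against a real browser, call asyncio.run(main()) in an environment where Browser Use is installed and configured.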
step()
Execute a single step of the task.
Optional step information including step number and max steps.
add_new_task()
Add a follow-up task to the agent.
The new task description.
stop()
Stop the agent execution gracefully.
kill()
Force-stop the agent and clean up resources.
Properties
state
Current agent state including step counter, failures, and internal state.
history
Complete history of agent actions and results.
browser_session
The browser session instance being used.
tools
The tools registry containing all available actions.
settings
Agent configuration settings.