Interaction actions allow the agent to manipulate page elements like a human user.

click

Click on an element by its index from the browser state.
index
int
required
Element index from browser_state. Must be ≥ 1. The element at index 0 represents the entire page and cannot be clicked.
coordinate_x
int
Horizontal coordinate relative to viewport left edge (optional, for coordinate-based clicking)
coordinate_y
int
Vertical coordinate relative to viewport top edge (optional, for coordinate-based clicking)

Example

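import asyncio

from browser_use import Agent, Browser, ChatBrowserUse


async def main():
    # The agent chooses the click action itself while working through the task.
    agent = Agent(
        task="Go to example.com and click the first link",
        llm=ChatBrowserUse(),
        browser=Browser(),
    )
    await agent.run()


asyncio.run(main())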
The agent automatically detects new tabs opened by clicks and provides information about them.
Implementation: browser_use.tools.service:566 (index-based), browser_use.tools.service:521 (coordinate-based)
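
To make the parameter rules for click concrete, here is a minimal sketch of how a caller might decide between index-based and coordinate-based clicking. ClickParams and mode() are hypothetical names for illustration, not the library's own model. The sketch also assumes that a coordinate pair can stand in for the index, which is only a guess, since the parameter list above marks index as required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClickParams:
    """Hypothetical stand-in for the click parameters (not browser_use's model)."""

    index: Optional[int] = None
    coordinate_x: Optional[int] = None
    coordinate_y: Optional[int] = None

    def mode(self) -> str:
        """Return 'index' or 'coordinate', or raise ValueError if neither applies."""
        if self.index is not None:
            # Index 0 stands for the whole page, so it is not a valid click target.
            if self.index < 1:
                raise ValueError("index must be >= 1; index 0 is the entire page")
            return "index"
        # Assumption: both coordinates are needed for a coordinate-based click.
        if self.coordinate_x is not None and self.coordinate_y is not None:
            return "coordinate"
        raise ValueError("provide an index or both coordinates")
```

For instance, ClickParams(index=3).mode() selects the index path, while ClickParams(index=0).mode() raises ValueError.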

input

Type text into an input field.
index
int
required
Element index from browser_state pointing to an input field
text
string
required
The text to type into the field
clear
bool
default:"True"
Whether to clear the field before typing. Set to False to append text.

Example

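import asyncio

from browser_use import Agent, Browser, ChatBrowserUse


async def main():
    # Typing the query into the search box is handled by the input action.
    agent = Agent(
        task="Go to google.com and search for 'browser use'",
        llm=ChatBrowserUse(),
        browser=Browser(),
    )
    await agent.run()


asyncio.run(main())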
For autocomplete fields, wait for suggestions to appear before selecting one. The action adds a brief delay to accommodate JavaScript-driven autocomplete.
Implementation: browser_use.tools.service:639
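
The effect of the clear flag can be modeled in a few lines. The helper below is a hypothetical illustration, not part of browser_use. It assumes that appended text lands at the end of the existing value.

```python
def apply_input(current_value: str, text: str, clear: bool = True) -> str:
    """Model a field's value after the input action (hypothetical helper).

    clear=True replaces the existing value; clear=False appends to it.
    Assumes appended text is placed at the end of the current value.
    """
    return text if clear else current_value + text
```

With a field that already contains "foo", typing "bar" leaves "bar" by default and "foobar" when clear is False.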

scroll

Scroll the page or a specific element up or down.
down
bool
default:"True"
required
Direction to scroll. True = scroll down, False = scroll up
pages
float
default:"1.0"
Number of pages to scroll.
  • 0.5 = half page
  • 1.0 = full page
  • 10.0 = scroll to bottom/top
Range: 0.5 to 10.0
index
int
Optional element index to scroll within a specific element (e.g., dropdowns, custom scrollable containers)

Example

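import asyncio

from browser_use import Agent, Browser, ChatBrowserUse


async def main():
    # The agent issues scroll actions to move down the page.
    agent = Agent(
        task="Go to hackernews and scroll down 3 pages",
        llm=ChatBrowserUse(),
        browser=Browser(),
    )
    await agent.run()


asyncio.run(main())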
Viewport height is automatically detected for accurate scrolling. Multi-page scrolls execute sequentially to ensure each completes.
Implementation: browser_use.tools.service:1241
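
As a rough illustration of how pages, direction, and viewport height could combine, the sketch below splits a scroll into sequential per-page pixel deltas. scroll_steps is a hypothetical helper, and the default viewport height of 800 pixels is an assumed value, not one taken from the library. It treats 10.0 as ten ordinary pages and does not model any special "scroll to bottom/top" handling.

```python
from typing import List


def scroll_steps(pages: float = 1.0, viewport_height: int = 800, down: bool = True) -> List[int]:
    """Split a scroll into sequential pixel deltas, one per page (hypothetical)."""
    if not 0.5 <= pages <= 10.0:
        raise ValueError("pages must be between 0.5 and 10.0")
    sign = 1 if down else -1          # positive = down, negative = up
    full_pages = int(pages)           # whole pages are scrolled one at a time
    remainder = pages - full_pages    # leftover fraction of a page, if any
    steps = [sign * viewport_height] * full_pages
    if remainder > 0:
        steps.append(sign * round(remainder * viewport_height))
    return steps
```

Under these assumptions, scroll_steps(2.5) yields two full-page steps followed by one half-page step, and passing down=False negates each delta.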

send_keys

Send special keyboard keys or shortcuts to the page.
keys
string
required
Keys to send. Can be:
  • Special keys: Escape, Enter, Tab, PageDown, PageUp, ArrowDown, ArrowUp, ArrowLeft, ArrowRight
  • Key combinations: Control+c, Control+v, Control+a
  • Or any other keyboard input

Example

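import asyncio

from browser_use import Agent, Browser, ChatBrowserUse


async def main():
    # Key presses such as Tab and Enter are sent through send_keys.
    agent = Agent(
        task="Press Tab three times then Enter",
        llm=ChatBrowserUse(),
        browser=Browser(),
    )
    await agent.run()


asyncio.run(main())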
Use send_keys for keyboard navigation when buttons can’t be clicked or for form submission shortcuts.
Implementation: browser_use.tools.service:1346
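
The keys strings listed above follow a modifier+key pattern. A minimal sketch of splitting such a string is shown below. parse_keys is a hypothetical helper and says nothing about how browser_use parses keys internally. It does not handle a literal "+" as the final key.

```python
from typing import List, Tuple


def parse_keys(keys: str) -> Tuple[List[str], str]:
    """Split 'Control+c' into (['Control'], 'c') (hypothetical helper).

    Everything before the last '+' is treated as a modifier;
    the final segment is the key itself.
    """
    parts = keys.split("+")
    return parts[:-1], parts[-1]
```

A single key such as Enter comes back with an empty modifier list, while Control+c yields one modifier and the key c.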