
Extraction actions retrieve information from web pages using LLM-powered extraction or visual snapshots.

extract

Use an LLM to extract structured or free-text data from the current page’s markdown content.
  • query (string, required): Description of what data to extract from the page.
  • extract_links (bool, default: False): Set to True if the query requires URLs/links, False to save tokens.
  • start_from_char (int, default: 0): Character position to start extraction from. Use this for long pages when a previous extraction was truncated. Note: this is a character offset in the markdown content, NOT an element index from browser_state.
  • output_schema (dict, optional): JSON Schema dictionary. When provided, extraction returns validated JSON matching this schema instead of free text. See the Structured Extraction Example below.

Free-text Extraction Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to quotes.toscrape.com and extract the first 3 quotes with their authors",
    llm=ChatBrowserUse(),
    browser=Browser()
)

history = await agent.run()
print(history.final_result())

Structured Extraction Example

from browser_use import Agent, Browser, ChatBrowserUse
from pydantic import BaseModel

class Quote(BaseModel):
    text: str
    author: str

class QuotesOutput(BaseModel):
    quotes: list[Quote]

agent = Agent(
    task="Extract the first 5 quotes from quotes.toscrape.com",
    llm=ChatBrowserUse(),
    browser=Browser(),
    output_model_schema=QuotesOutput
)

history = await agent.run()
structured_data = history.structured_output
print(structured_data.quotes)
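The output_schema parameter of the extract action takes a plain JSON Schema dictionary rather than a Pydantic class. For reference, a hand-written schema roughly equivalent to the QuotesOutput model above might look like this (an illustrative sketch):

```python
# A hand-written JSON Schema dict equivalent to the QuotesOutput model above.
# This is the kind of value the extract action's output_schema parameter accepts.
quotes_schema = {
    "type": "object",
    "properties": {
        "quotes": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "author": {"type": "string"},
                },
                "required": ["text", "author"],
            },
        }
    },
    "required": ["quotes"],
}
```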
When to use extract:
  • You’re on the right page
  • You know what data to extract
  • You haven’t called extract before on the same page for the same query
Limitations:
  • Cannot extract interactive elements (use browser_state for that)
  • Large content may be truncated (use start_from_char to continue)
The page content is converted to markdown before extraction; this filters out advertising and noise, but also removes some interactive elements.
Implementation: browser_use.tools.service:951

screenshot

Take a screenshot of the current viewport.
  • file_name (string, optional): Filename to save the screenshot to. If provided, the screenshot is saved to a file and the path is returned. If omitted, the screenshot is included in the next browser_state observation. Supported format: PNG (the .png extension is added automatically).
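The note above says a .png extension is appended automatically when missing. A one-line sketch of that normalization (the helper name here is made up):

```python
def normalize_screenshot_name(file_name: str) -> str:
    """Ensure a .png extension, mirroring the documented behavior (illustrative)."""
    return file_name if file_name.lower().endswith(".png") else file_name + ".png"
```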

Save to File Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to example.com and save a screenshot as 'homepage.png'",
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()

Include in Observation Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to example.com and take a screenshot to verify the page loaded",
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Screenshots are useful for:
  • Visual confirmation of page state
  • Debugging issues
  • Creating documentation
  • Verifying forms were filled correctly
Implementation: browser_use.tools.service:1387

search_page

Search page text for a pattern instantly with zero LLM cost (like grep).
  • pattern (string, required): Text or regex pattern to search for in page content.
  • regex (bool, default: False): Treat the pattern as a regex (default: literal text match).
  • case_sensitive (bool, default: False): Make the search case-sensitive (default: case-insensitive).
  • context_chars (int, default: 150): Characters of surrounding context per match.
  • css_scope (string, optional): CSS selector to limit the search scope (e.g., "div#main").
  • max_results (int, default: 25): Maximum matches to return.

Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task='Go to wikipedia.org and search for "quantum" on the page',
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Zero LLM cost - This action executes JavaScript directly in the browser for instant results.
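The options above amount to grep with surrounding context. A pure-Python sketch of the same semantics (an illustration only, not the library's actual in-browser implementation):

```python
import re

def search_text(content: str, pattern: str, regex: bool = False,
                case_sensitive: bool = False, context_chars: int = 150,
                max_results: int = 25) -> list[str]:
    """Grep-like search returning each match with surrounding context."""
    if not regex:
        pattern = re.escape(pattern)            # literal match by default
    flags = 0 if case_sensitive else re.IGNORECASE
    results = []
    for m in re.finditer(pattern, content, flags):
        start = max(0, m.start() - context_chars)
        end = min(len(content), m.end() + context_chars)
        results.append(content[start:end])      # match plus context window
        if len(results) >= max_results:
            break
    return results

hits = search_text("Quantum physics and the quantum realm", "quantum")  # 2 matches
```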
Implementation: browser_use.tools.service:1169

find_elements

Query DOM elements by CSS selector instantly with zero LLM cost.
  • selector (string, required): CSS selector to query elements (e.g., "table tr", "a.link", "div.product").
  • attributes (list[string], optional): Specific attributes to extract (e.g., ["href", "src", "class"]). If not set, returns tag and text only.
  • max_results (int, default: 50): Maximum elements to return.
  • include_text (bool, default: True): Include the text content of each element.

Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task='Go to example.com and find all links with their href attributes',
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Use find_elements to:
  • Explore page structure
  • Count items
  • Get links/attributes
  • Verify elements exist
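Conceptually, find_elements with selector "a" and attributes ["href"] returns each anchor's href. A stdlib sketch of that idea (the real action evaluates the CSS selector in the browser; this parser-based version is only illustrative):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href attributes from <a> tags, mimicking what
    find_elements(selector="a", attributes=["href"]) conceptually returns."""
    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

parser = LinkCollector()
parser.feed('<p><a href="/docs">Docs</a> <a href="/blog">Blog</a></p>')
```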
Implementation: browser_use.tools.service:1206