
Extraction actions retrieve information from web pages using LLM-powered extraction or visual snapshots.

extract

Use an LLM to extract structured or free-text data from the current page’s markdown content.
  • query (string, required): Description of what data to extract from the page.
  • extract_links (bool, default: False): Set to True if the query requires URLs/links, False to save tokens.
  • start_from_char (int, default: 0): Character position to start extraction from. Use this for long pages when a previous extraction was truncated. Note: this is a character offset in the markdown content, NOT an element index from browser_state.
  • output_schema (dict, optional): JSON Schema dictionary. When provided, extraction returns validated JSON matching this schema instead of free text. See the Structured Extraction Example below.

Free-text Extraction Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to quotes.toscrape.com and extract the first 3 quotes with their authors",
    llm=ChatBrowserUse(),
    browser=Browser()
)

history = await agent.run()
print(history.final_result())

Structured Extraction Example

from browser_use import Agent, Browser, ChatBrowserUse
from pydantic import BaseModel

class Quote(BaseModel):
    text: str
    author: str

class QuotesOutput(BaseModel):
    quotes: list[Quote]

agent = Agent(
    task="Extract the first 5 quotes from quotes.toscrape.com",
    llm=ChatBrowserUse(),
    browser=Browser(),
    output_model_schema=QuotesOutput
)

history = await agent.run()
structured_data = history.structured_output
print(structured_data.quotes)
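The output_schema parameter of the extract action takes a plain JSON Schema dictionary rather than a Pydantic class. For reference, a hand-written schema roughly equivalent to the QuotesOutput model above might look like this (an illustrative sketch):

```python
# A hand-written JSON Schema dict equivalent to the QuotesOutput model above.
# This is the kind of value the extract action's output_schema parameter accepts.
quotes_schema = {
    "type": "object",
    "properties": {
        "quotes": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "author": {"type": "string"},
                },
                "required": ["text", "author"],
            },
        }
    },
    "required": ["quotes"],
}
```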
When to use extract:
  • You’re on the right page
  • You know what data to extract
  • You haven’t called extract before on the same page for the same query
Limitations:
  • Cannot extract interactive elements (use browser_state for that)
  • Large content may be truncated (use start_from_char to continue)
The page content is converted to markdown before extraction; this filters out advertising and noise, but also removes some interactive elements.
Implementation: browser_use.tools.service:951

screenshot

Take a screenshot of the current viewport.
  • file_name (string, optional): Filename to save the screenshot to. If provided, the screenshot is saved to a file and the path is returned. If omitted, the screenshot is included in the next browser_state observation. Supported format: PNG (the .png extension is added automatically).
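The note above says a .png extension is appended automatically when missing. A one-line sketch of that normalization (the helper name here is made up):

```python
def normalize_screenshot_name(file_name: str) -> str:
    """Ensure a .png extension, mirroring the documented behavior (illustrative)."""
    return file_name if file_name.lower().endswith(".png") else file_name + ".png"
```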

Save to File Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to example.com and save a screenshot as 'homepage.png'",
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()

Include in Observation Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task="Go to example.com and take a screenshot to verify the page loaded",
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Screenshots are useful for:
  • Visual confirmation of page state
  • Debugging issues
  • Creating documentation
  • Verifying forms were filled correctly
Implementation: browser_use.tools.service:1387

search_page

Search page text for a pattern instantly with zero LLM cost (like grep).
  • pattern (string, required): Text or regex pattern to search for in page content.
  • regex (bool, default: False): Treat the pattern as a regex (default: literal text match).
  • case_sensitive (bool, default: False): Make the search case-sensitive (default: case-insensitive).
  • context_chars (int, default: 150): Characters of surrounding context per match.
  • css_scope (string, optional): CSS selector to limit the search scope (e.g., "div#main").
  • max_results (int, default: 25): Maximum matches to return.

Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task='Go to wikipedia.org and search for "quantum" on the page',
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Zero LLM cost - This action executes JavaScript directly in the browser for instant results.
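The options above amount to grep with surrounding context. A pure-Python sketch of the same semantics (an illustration only, not the library's actual in-browser implementation):

```python
import re

def search_text(content: str, pattern: str, regex: bool = False,
                case_sensitive: bool = False, context_chars: int = 150,
                max_results: int = 25) -> list[str]:
    """Grep-like search returning each match with surrounding context."""
    if not regex:
        pattern = re.escape(pattern)            # literal match by default
    flags = 0 if case_sensitive else re.IGNORECASE
    results = []
    for m in re.finditer(pattern, content, flags):
        start = max(0, m.start() - context_chars)
        end = min(len(content), m.end() + context_chars)
        results.append(content[start:end])      # match plus context window
        if len(results) >= max_results:
            break
    return results

hits = search_text("Quantum physics and the quantum realm", "quantum")  # 2 matches
```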
Implementation: browser_use.tools.service:1169

find_elements

Query DOM elements by CSS selector instantly with zero LLM cost.
  • selector (string, required): CSS selector to query elements (e.g., "table tr", "a.link", "div.product").
  • attributes (list[string], optional): Specific attributes to extract (e.g., ["href", "src", "class"]). If not set, returns tag and text only.
  • max_results (int, default: 50): Maximum elements to return.
  • include_text (bool, default: True): Include the text content of each element.

Example

from browser_use import Agent, Browser, ChatBrowserUse

agent = Agent(
    task='Go to example.com and find all links with their href attributes',
    llm=ChatBrowserUse(),
    browser=Browser()
)

await agent.run()
Use find_elements to:
  • Explore page structure
  • Count items
  • Get links/attributes
  • Verify elements exist
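Conceptually, find_elements with selector "a" and attributes ["href"] returns each anchor's href. A stdlib sketch of that idea (the real action evaluates the CSS selector in the browser; this parser-based version is only illustrative):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href attributes from <a> tags, mimicking what
    find_elements(selector="a", attributes=["href"]) conceptually returns."""
    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

parser = LinkCollector()
parser.feed('<p><a href="/docs">Docs</a> <a href="/blog">Blog</a></p>')
```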
Implementation: browser_use.tools.service:1206