· Dave Brewster (dave@augustdata.ai) · logicunit · 1 min read
Browser Logic Unit: Configuring an agent to scrape a web page
Learn how to use web scraping in an agent.
The Browser tool is a LogicUnit that can be registered with the APU to scrape a web page. It is useful for extracting data from a web page that is not available through an API, and it is typically used in conjunction with the Search LogicUnit.
The full specification of Browser is:
The summarizer field specifies the summarizer to use when extracting data from the web page. It defaults to BeautifulSoup, a popular Python library for parsing HTML and XML documents. The tool uses it to extract the relevant text from the web page in a form suitable for passing to an LLM for processing.
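To make the extraction step concrete, here is a minimal sketch of how BeautifulSoup can reduce a page to LLM-ready text. It is an illustration rather than the Browser tool's actual implementation: the function name extract_text, the sample page, and the choice to drop script, style, and noscript elements are all assumptions made for this example.

```python
from bs4 import BeautifulSoup


def extract_text(html: str) -> str:
    """Return the readable text of an HTML document (hypothetical helper)."""
    # "html.parser" is Python's built-in parser, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")

    # Assumption: code and styling are noise for an LLM, so remove them.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()

    # Join the remaining text nodes with single spaces, trimming whitespace.
    return soup.get_text(separator=" ", strip=True)


# Made-up sample page used only for this illustration.
sample = """
<html><head><title>Pricing</title><style>p { color: red; }</style></head>
<body><h1>Plans</h1><p>Basic: $5 per month</p><script>track();</script></body></html>
"""

print(extract_text(sample))
```

The result keeps headings and paragraph text while leaving out markup, styling, and script source, which is usually what a prompt needs.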
Note that this tool scrapes only static content. A future tool will use a headless browser to execute JavaScript and extract the dynamic contents of a web page, among other things.
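The following sketch shows what "static only" means in practice. It uses Python's standard-library html.parser rather than the Browser tool itself, and the class VisibleText, the function static_text, and the sample page are hypothetical names introduced for this example. The page fills in a price with JavaScript, but a static scrape only ever sees the placeholder present in the original markup.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def static_text(html: str) -> str:
    """Return the text found in the markup as served, with no JS executed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Made-up page: the real price would only appear after the script runs.
page = """
<body>
  <h1>Stock quote</h1>
  <div id="price">Loading...</div>
  <script>document.getElementById("price").textContent = "42.17";</script>
</body>
"""

print(static_text(page))
```

Because nothing runs the script, the extracted text contains the "Loading..." placeholder and never the value the script would have written. Pages that render their content this way need a headless browser.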
This tool is typically used in conjunction with the Search LogicUnit.