Firecrawl
Arcade.dev LLM tools for web scraping tasks via Firecrawl
3.0.1
Firecrawl is an Arcade.dev toolkit for web scraping. It calls the Firecrawl API to crawl, map, and scrape websites, so developers can extract data without writing their own crawler.
Capabilities
- Perform asynchronous and synchronous website crawls
- Retrieve crawl status and data for ongoing or recent tasks
- Map an entire website from a single URL
- Scrape specific URLs and return data in various formats
- Cancel asynchronous crawl jobs that are in progress
OAuth
No OAuth authentication is required. The toolkit uses an API key to access the Firecrawl API.
Secrets
- API Key: FIRECRAWL_API_KEY, used for authenticating requests to the Firecrawl API.
Available tools (6)
| Tool name | Description | Secrets |
|---|---|---|
| Firecrawl.CancelCrawl | Cancel an asynchronous crawl job that is in progress using the Firecrawl API. | 1 |
| Firecrawl.CrawlWebsite | Crawl a website using Firecrawl. If the crawl is asynchronous, then returns the crawl ID. If the crawl is synchronous, then returns the crawl data. | 1 |
| Firecrawl.GetCrawlData | Get the data of a Firecrawl 'crawl' that is either in progress or recently completed. | 1 |
| Firecrawl.GetCrawlStatus | Get the status of a Firecrawl 'crawl' that is either in progress or recently completed. | 1 |
| Firecrawl.MapWebsite | Map a website from a single URL to a map of the entire website. | 1 |
| Firecrawl.ScrapeUrl | Scrape a URL using Firecrawl and return the data in specified formats. | 1 |
Firecrawl.CancelCrawl
Cancel an asynchronous crawl job that is in progress using the Firecrawl API.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| crawl_id | string | Required | The ID of the asynchronous crawl job to cancel |
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Cancellation status information
Firecrawl.CrawlWebsite
Crawl a website using Firecrawl. If the crawl is asynchronous, then returns the crawl ID. If the crawl is synchronous, then returns the crawl data.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| url | string | Required | URL to crawl |
| exclude_paths | array<string> | Optional | URL patterns to exclude from the crawl |
| include_paths | array<string> | Optional | URL patterns to include in the crawl |
| max_depth | integer | Optional | Maximum depth to crawl relative to the entered URL |
| ignore_sitemap | boolean | Optional | Ignore the website sitemap when crawling |
| limit | integer | Optional | Limit the number of pages to crawl |
| allow_backward_links | boolean | Optional | Enable navigation to previously linked pages and enable crawling sublinks that are not children of the 'url' input parameter. |
| allow_external_links | boolean | Optional | Allow following links to external websites |
| webhook | string | Optional | The URL to send a POST request to when the crawl is started, updated and completed. |
| async_crawl | boolean | Optional | Run the crawl asynchronously |
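As a rough illustration of how these parameters fit together, the sketch below assembles an input payload for Firecrawl.CrawlWebsite as a plain Python dictionary. The helper name `build_crawl_input` is hypothetical and not part of the toolkit, and the path pattern in the usage lines is only illustrative. The keys follow the parameter table above; how the payload is sent (for example through an Arcade client) is deliberately left out.

```python
from typing import Any, Optional


def build_crawl_input(
    url: str,
    *,
    include_paths: Optional[list[str]] = None,
    exclude_paths: Optional[list[str]] = None,
    max_depth: Optional[int] = None,
    limit: Optional[int] = None,
    async_crawl: Optional[bool] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble an input dict for Firecrawl.CrawlWebsite."""
    if not url:
        raise ValueError("url is required")
    optional = {
        "include_paths": include_paths,
        "exclude_paths": exclude_paths,
        "max_depth": max_depth,
        "limit": limit,
        "async_crawl": async_crawl,
    }
    # Send only the options the caller actually set; everything else is
    # left to whatever defaults the tool applies.
    payload: dict[str, Any] = {"url": url}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Illustrative values only; the pattern syntax is an assumption.
crawl_input = build_crawl_input(
    "https://example.com",
    include_paths=["/docs/*"],
    limit=25,
    async_crawl=True,
)
print(crawl_input)
```

Omitting unset options keeps the payload small and avoids overriding defaults by accident; with `async_crawl` set, the tool description says the result is a crawl ID rather than the crawl data.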
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Crawl status and data
Firecrawl.GetCrawlData
Get the data of a Firecrawl 'crawl' that is either in progress or recently completed.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| crawl_id | string | Required | The ID of the crawl job |
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Crawl data information
Firecrawl.GetCrawlStatus
Get the status of a Firecrawl 'crawl' that is either in progress or recently completed.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| crawl_id | string | Required | The ID of the crawl job |
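Since an asynchronous crawl returns only a crawl ID, a caller would usually check Firecrawl.GetCrawlStatus until the job finishes. The sketch below shows one way to do that. It rests on several assumptions: `execute` is a hypothetical stand-in for whatever client runs Arcade tools, the response is assumed to be a dictionary with a `status` key, and the terminal status strings are illustrative guesses rather than documented values.

```python
import time
from typing import Any, Callable

# Assumption: illustrative status strings; check the real tool output.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_crawl(
    execute: Callable[[str, dict[str, Any]], dict[str, Any]],
    crawl_id: str,
    *,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> dict[str, Any]:
    """Poll Firecrawl.GetCrawlStatus until a terminal status or max_polls."""
    for attempt in range(max_polls):
        result = execute("Firecrawl.GetCrawlStatus", {"crawl_id": crawl_id})
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Sleep between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"crawl {crawl_id} did not finish after {max_polls} polls")
```

Passing the executor in as a callable keeps the loop independent of any particular client library and makes it easy to exercise with a fake in tests. Once the status is terminal, Firecrawl.GetCrawlData can be called with the same `crawl_id`.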
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Crawl status information
Firecrawl.MapWebsite
Map a website from a single URL to a map of the entire website.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| url | string | Required | The base URL to start crawling from |
| search | string | Optional | Search query to use for mapping |
| ignore_sitemap | boolean | Optional | Ignore the website sitemap when crawling |
| include_subdomains | boolean | Optional | Include subdomains of the website |
| limit | integer | Optional | Maximum number of links to return |
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Website map data
Firecrawl.ScrapeUrl
Scrape a URL using Firecrawl and return the data in specified formats.
Parameters
| Parameter | Type | Req. | Description |
|---|---|---|---|
| url | string | Required | URL to scrape |
| formats | array<string> | Optional | Formats to retrieve. Defaults to ['markdown']. Allowed values: markdown, html, rawHtml, links, screenshot, screenshot@fullPage |
| only_main_content | boolean | Optional | Only return the main content of the page excluding headers, navs, footers, etc. |
| include_tags | array<string> | Optional | List of tags to include in the output |
| exclude_tags | array<string> | Optional | List of tags to exclude from the output |
| wait_for | integer | Optional | Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load. |
| timeout | integer | Optional | Timeout in milliseconds for the request |
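Because `formats` accepts only a fixed set of names, it can help to check them before calling the tool. The following sketch builds a Firecrawl.ScrapeUrl input and rejects unknown format names on the client side. `build_scrape_input` and `ALLOWED_FORMATS` are hypothetical names introduced here; the format names are the ones that appear in the `formats` parameter description, and the fallback to `["markdown"]` mirrors the documented default.

```python
from typing import Any, Optional

# Format names taken from the `formats` parameter description.
ALLOWED_FORMATS = (
    "markdown",
    "html",
    "rawHtml",
    "links",
    "screenshot",
    "screenshot@fullPage",
)


def build_scrape_input(
    url: str,
    formats: Optional[list[str]] = None,
    *,
    only_main_content: Optional[bool] = None,
    wait_for: Optional[int] = None,
    timeout: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble an input dict for Firecrawl.ScrapeUrl."""
    if not url:
        raise ValueError("url is required")
    chosen = list(formats) if formats is not None else ["markdown"]
    unknown = [name for name in chosen if name not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    payload: dict[str, Any] = {"url": url, "formats": chosen}
    optional = {
        "only_main_content": only_main_content,
        "wait_for": wait_for,  # delay in milliseconds
        "timeout": timeout,  # request timeout in milliseconds
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Format names are case-sensitive in this sketch (`rawHtml`, not `rawhtml`), which matches how they are spelled in the parameter description.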
Requirements
Secrets: FIRECRAWL_API_KEY
Output
json: Scraped data in specified formats