Firecrawl Web Scraping

Name: OpenClaw
Availability: Free
Author: OpenClaw Chinese Community

The Firecrawl tool provides intelligent web scraping.

Features

Intelligent content extraction
Link discovery
Screenshots
JavaScript rendering
Sitemap support

Configuration

yaml

tools:
  firecrawl:
    enabled: true
    api_key: "${FIRECRAWL_API_KEY}"
    url: "https://api.firecrawl.dev"
    timeout: 30000
    options:
      only_main_content: true
      screenshot: true
      javascript: true
      cache: true
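
The `api_key` value above uses a `${FIRECRAWL_API_KEY}` placeholder. The source does not say how OpenClaw expands it, so the following is only a minimal sketch of one plausible approach: substituting `${NAME}` tokens from an environment map. The helper name `expandEnv` and the `Env` type are hypothetical and not part of any documented API.

```typescript
// Hypothetical helper, not part of OpenClaw or Firecrawl.
// Expands ${NAME} placeholders in a config string from a supplied environment map.
type Env = Record<string, string | undefined>;

function expandEnv(value: string, env: Env): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string): string => {
      const resolved = env[name];
      if (resolved === undefined) {
        // Fail early instead of sending an empty API key to the service.
        throw new Error(`Missing environment variable: ${name}`);
      }
      return resolved;
    }
  );
}
```

Under Node.js, a call such as `expandEnv(config.api_key, process.env)` would return the key, or throw if `FIRECRAWL_API_KEY` is not set.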

Usage Examples

typescript

// Scrape the page content
const result = await tool.firecrawl.scrape({
  url: 'https://example.com/article',
  onlyMainContent: true
});

console.log('Title:', result.title);
console.log('Content:', result.content);

// Capture a screenshot of the page
const screenshot = await tool.firecrawl.screenshot({
  url: 'https://example.com',
  format: 'png',
  fullPage: false
});

// Discover links
const links = await tool.firecrawl.discover({
  url: 'https://example.com',
  depth: 2
});
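
Network scrapes can fail transiently, and the source does not say whether `scrape` throws on failure or returns `success: false`. The sketch below assumes either may happen and retries a fixed number of times. `ScrapeFn`, `PageResult`, and `scrapeWithRetry` are hypothetical names introduced for illustration; the real call is passed in as a function so the helper does not depend on the `tool` object.

```typescript
// Hypothetical minimal shapes mirroring the scrape call shown above.
interface PageResult {
  success: boolean;
  title: string;
  content: string;
}
type ScrapeFn = (args: { url: string; onlyMainContent?: boolean }) => Promise<PageResult>;

// Calls `scrape` up to `attempts` times and returns the first successful result.
async function scrapeWithRetry(
  scrape: ScrapeFn,
  url: string,
  attempts = 3
): Promise<PageResult> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      const result = await scrape({ url, onlyMainContent: true });
      if (result.success) {
        return result;
      }
      lastError = new Error(`Scrape reported failure for ${url}`);
    } catch (err) {
      lastError = err; // remember the error and try again
    }
  }
  throw lastError ?? new Error("No scrape attempts were made");
}
```

It could be used as `await scrapeWithRetry((args) => tool.firecrawl.scrape(args), 'https://example.com/article')`, assuming `tool.firecrawl.scrape` accepts and returns the shapes shown.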

Output Format

typescript

interface ScrapeResult {
  success: boolean;
  title: string;
  description: string;
  keywords: string[];
  language: string;
  content: string;       // Markdown format
  links: Link[];
  screenshot?: string;   // Base64-encoded image
  metadata: {
    pageSize: number;
    crawlTime: number;
    jsEnabled: boolean;
  };
}
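
To show how the fields above might be consumed, here is a small sketch that builds a one-line summary from a result. The interface is repeated so the block is self-contained. The `Link` type is not defined in the source, so the shape used here (`url` plus optional `text`) is an assumption, and `summarize` is a hypothetical helper.

```typescript
// Assumed shape: the source references Link without defining it.
interface Link {
  url: string;
  text?: string;
}

// Copied from the interface above.
interface ScrapeResult {
  success: boolean;
  title: string;
  description: string;
  keywords: string[];
  language: string;
  content: string; // Markdown format
  links: Link[];
  screenshot?: string; // Base64-encoded image
  metadata: {
    pageSize: number;
    crawlTime: number;
    jsEnabled: boolean;
  };
}

// Hypothetical helper: produces a short, log-friendly description of a result.
function summarize(result: ScrapeResult): string {
  if (!result.success) {
    return "scrape failed";
  }
  const shot = result.screenshot ? "with screenshot" : "no screenshot";
  return `${result.title} (${result.language}): ${result.content.length} chars, ${result.links.length} links, ${shot}`;
}
```

Checking `success` first and treating `screenshot` as optional follows the interface as written; the units of `pageSize` and `crawlTime` are not stated in the source, so the sketch leaves them out.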

Batch Scraping

typescript

// Scrape multiple URLs
const results = await tool.firecrawl.batchScrape({
  urls: [
    'https://example.com/page1',
    'https://example.com/page2',
    'https://example.com/page3'
  ],
  options: {
    onlyMainContent: true
  }
});
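
For long URL lists it may be useful to split the input into smaller groups before calling `batchScrape`. The source states no batch size limit, so the chunk size here is purely an assumption, and `chunk` is a hypothetical helper rather than part of the tool.

```typescript
// Hypothetical helper: splits a list into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

Each group could then be passed as the `urls` array of a separate `batchScrape` call, for example by looping over `chunk(allUrls, 10)` and awaiting each call in turn.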
