brightdata

@danielmiessler/brightdata

danielmiessler

5,085

813 forks

Updated 1/18/2026

BrightData: Progressive URL scraping with four-tier fallback - WebFetch, Curl, Browser Automation, Bright Data MCP. USE WHEN scrape URL, fetch URL, web scraping, bot detection, can't access site.

Installation

$ skills install @danielmiessler/brightdata

Details

Repository: danielmiessler/Personal_AI_Infrastructure

Path: Packs/pai-brightdata-skill/src/skills/BrightData/SKILL.md

Branch: main

Scoped Name: @danielmiessler/brightdata

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

skills list

Skill Instructions

name: BrightData
version: 2.3.0
description: Progressive URL scraping with four-tier fallback - WebFetch, Curl, Browser Automation, Bright Data MCP. USE WHEN scrape URL, fetch URL, web scraping, bot detection, can't access site.

BrightData - Progressive URL Scraping

Intelligent URL content retrieval with automatic fallback strategy.

Activation Triggers

Direct Scraping Requests

"scrape this URL", "scrape [URL]", "scrape this page"
"fetch this URL", "fetch [URL]", "fetch this page"
"pull content from [URL]", "get content from [URL]"
"retrieve [URL]", "download this page content"

Access Issues

"can't access this site", "site is blocking me"
"bot detection", "CAPTCHA", "403 error"
"this URL won't load"

Explicit Tier Requests

"use Bright Data to fetch [URL]" - Skip to Tier 4
"use browser to scrape [URL]" - Skip to Tier 3

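The explicit tier requests above amount to choosing a starting tier from the wording of the request. A minimal sketch of that mapping follows; the function name `starting_tier` and the regular expressions are illustrative assumptions, not the skill's actual matching logic.

```python
import re


def starting_tier(request: str) -> int:
    """Return the tier to start from for a user request (hypothetical helper).

    Tier 4 when the user asks for Bright Data, Tier 3 when the user asks
    for the browser, Tier 1 (normal escalation) otherwise.
    """
    text = request.lower()
    if re.search(r"\buse bright ?data\b", text):
        return 4
    if re.search(r"\buse (the )?browser\b", text):
        return 3
    return 1


print(starting_tier("Use Bright Data to fetch https://any-site.com"))
print(starting_tier("use browser to scrape https://spa-app.com"))
print(starting_tier("Scrape https://example.com"))
```

Any request that does not name a tier falls through to Tier 1, so the default escalation order is preserved.
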
Core Capability

Four-Tier Progressive Escalation:

START
  |
Tier 1 (WebFetch) --> Success? --> Return content
  |
  No
  |
Tier 2 (Curl) --> Success? --> Return content
  |
  No
  |
Tier 3 (Browser) --> Success? --> Return content
  |
  No
  |
Tier 4 (Bright Data) --> Success? --> Return content
  |
  No
  |
Report failure + alternatives
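
The escalation in the diagram above can be sketched as a loop over an ordered list of fetchers. This is a sketch under stated assumptions, not the skill's implementation: `progressive_scrape`, `ScrapeResult`, and the stub fetchers are hypothetical names, and each real tier (WebFetch, curl, Playwright, Bright Data MCP) is assumed to be wrapped as a function that returns markdown or `None`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# A fetcher takes a URL and returns markdown, or None when its tier fails.
Fetcher = Callable[[str], Optional[str]]


@dataclass
class ScrapeResult:
    tier: Optional[int]      # 1-based tier that succeeded, None if all failed
    content: Optional[str]   # markdown content, None on total failure
    attempts: List[str]      # names of the tiers tried, in order


def progressive_scrape(
    url: str,
    tiers: Sequence[Tuple[str, Fetcher]],
    start_tier: int = 1,
) -> ScrapeResult:
    """Try each tier in order from start_tier and stop at the first success."""
    attempts: List[str] = []
    for number, (name, fetch) in enumerate(tiers, start=1):
        if number < start_tier:
            continue  # explicit tier request: skip the earlier tiers
        attempts.append(name)
        try:
            content = fetch(url)
        except Exception:
            content = None  # treat any error as a failed tier and escalate
        if content:
            return ScrapeResult(number, content, attempts)
    # Every tier failed: the caller reports failure and suggests alternatives.
    return ScrapeResult(None, None, attempts)


# Stub fetchers standing in for the real tools (assumptions for illustration).
def blocked(url: str) -> Optional[str]:
    return None


def ok(url: str) -> Optional[str]:
    return f"# Content of {url}"


tiers = [("WebFetch", blocked), ("Curl", blocked), ("Browser", ok), ("Bright Data", ok)]
result = progressive_scrape("https://spa-app.com", tiers)
print(result.tier, result.attempts)
```

Passing `start_tier=4` models a direct Bright Data request: the first three tiers are never attempted.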

Tier Details

Tier 1: WebFetch (Built-in)

Tool: Claude Code's WebFetch
Speed: ~2-5 seconds
Cost: Free
Works for: Public sites, no bot detection

Tier 2: Curl with Chrome Headers

Tool: Bash curl with comprehensive browser headers
Speed: ~3-7 seconds
Cost: Free
Works for: Sites with basic user-agent filtering
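
As an illustration of Tier 2, the sketch below assembles a curl command line with browser-like headers. The header values, the timeout, and the helper name `build_curl_command` are assumptions; the headers the skill actually sends are not shown in this document. The function only builds the argument list and does not run curl.

```python
import shlex
from typing import List

# Assumed header set imitating a desktop Chrome request (illustrative only).
CHROME_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_curl_command(url: str, timeout_seconds: int = 7) -> List[str]:
    """Return a curl argv that follows redirects and sends browser-like headers."""
    argv = [
        "curl",
        "--silent",      # no progress meter
        "--location",    # follow redirects
        "--compressed",  # request and decode compressed responses
        "--max-time", str(timeout_seconds),
    ]
    for name, value in CHROME_HEADERS.items():
        argv += ["-H", f"{name}: {value}"]
    argv.append(url)
    return argv


# Print a shell-safe version of the command for inspection.
print(shlex.join(build_curl_command("https://example.com")))
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later handed to `subprocess.run`.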

Tier 3: Browser Automation (Playwright)

Tool: Browser skill's Playwright automation
Speed: ~10-20 seconds
Cost: Free
Works for: JavaScript SPAs, dynamic content

Tier 4: Bright Data MCP

Tool: mcp__Brightdata__scrape_as_markdown
Speed: ~5-15 seconds
Cost: Bright Data credits
Works for: CAPTCHA, advanced bot detection, residential IPs
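
The speed figures above give a rough bound on total latency when a request escalates. The sketch below sums the quoted ranges, assuming (this is not stated in the source) that a failed attempt takes about as long as the speed quoted for its tier; `TIERS` and `cumulative_latency` are hypothetical names.

```python
from typing import Tuple

# (name, min seconds, max seconds, uses paid credits), from the tier details.
TIERS = [
    ("WebFetch", 2, 5, False),
    ("Curl", 3, 7, False),
    ("Browser", 10, 20, False),
    ("Bright Data", 5, 15, True),
]


def cumulative_latency(up_to_tier: int) -> Tuple[int, int]:
    """Return (min, max) seconds spent if tiers 1..up_to_tier are all tried."""
    tried = TIERS[:up_to_tier]
    return sum(t[1] for t in tried), sum(t[2] for t in tried)


print(cumulative_latency(3))  # (15, 32): success only at the browser tier
print(cumulative_latency(4))  # (20, 47): full escalation to Bright Data
```

Under that assumption a full escalation costs roughly 20 to 47 seconds, and only the final tier consumes Bright Data credits.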

Workflow Routing

Default workflow for all URL scraping:

Route to: Workflows/FourTierScrape.md
Output: URL content in markdown format

Examples

Example 1: Simple Site (Tier 1 Success)

User: "Scrape https://example.com"

Process:

Attempt Tier 1 (WebFetch)
Success in 3 seconds
Return markdown content

Example 2: JavaScript Site (Tier 3 Success)

User: "Fetch https://spa-app.com"

Process:

Tier 1 fails (blocked)
Tier 2 fails (JavaScript required)
Tier 3 succeeds with Playwright
Return markdown content

Example 3: Protected Site (Tier 4 Success)

User: "Can't access https://protected-site.com"

Process:

Tier 1 fails (403)
Tier 2 fails (bot detection)
Tier 3 fails (CAPTCHA)
Tier 4 succeeds with Bright Data
Return markdown content

Example 4: Direct Tier Request

User: "Use Bright Data to fetch https://any-site.com"

Process:

User explicitly requested Bright Data
Skip directly to Tier 4
Return markdown content

Integration Points

WebFetch Tool - Built-in Claude Code tool
Bash Tool - For curl commands
Browser Skill - For Playwright automation (requires pai-browser-skill)
Bright Data MCP - For professional scraping

