jackmcpickle/url-reader

URL Reader

A URL proxy service that runs on Cloudflare Workers. It masquerades as Googlebot to get past simple blockers and renders pages as readable text.

Features

  • Bot Masquerading: Uses the Googlebot user-agent to bypass simple blockers
  • Browser Fallback: Falls back automatically to Cloudflare Browser Rendering (Puppeteer) when a request is blocked (403) or redirected to a paywall or cookie page
  • Content Stripping: Removes tracking scripts, analytics, cookie banners
  • Reader Styles: Injects clean reading styles
  • Caching: Configurable response caching

Usage

GET /proxy?url=<encoded_url>

Parameters

Param       Description
url         URL to fetch (required, URL-encoded)
cache_ttl   Cache duration in seconds (default: 3600)
nocache=1   Bypass cache, fetch fresh
purge=1     Delete cached entry and refetch
header_*    Custom headers (e.g. header_Accept=text/html)
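
As an illustration of how these parameters might be read inside the Worker, here is a minimal sketch. The `ProxyOptions` shape, the `parseProxyParams` name, and the fallback behaviour for an invalid `cache_ttl` are assumptions for this example, not the project's actual code.

```typescript
// Hypothetical shape for the parsed options; field names are illustrative.
interface ProxyOptions {
  target: URL;
  cacheTtl: number;
  noCache: boolean;
  purge: boolean;
  headers: Record<string, string>;
}

const DEFAULT_CACHE_TTL = 3600; // documented default, in seconds
const HEADER_PREFIX = "header_";

function parseProxyParams(requestUrl: string): ProxyOptions {
  const params = new URL(requestUrl).searchParams;

  const rawTarget = params.get("url");
  if (!rawTarget) {
    throw new Error("Missing required 'url' parameter");
  }
  // searchParams.get() has already percent-decoded the value.
  const target = new URL(rawTarget);

  // Assumption: fall back to the default when cache_ttl is absent
  // or does not parse as a non-negative number.
  const ttlRaw = params.get("cache_ttl");
  const ttlParsed = ttlRaw === null ? NaN : Number.parseInt(ttlRaw, 10);
  const cacheTtl =
    Number.isFinite(ttlParsed) && ttlParsed >= 0 ? ttlParsed : DEFAULT_CACHE_TTL;

  // Collect header_* params, e.g. header_Accept=text/html -> { Accept: "text/html" }.
  const headers: Record<string, string> = {};
  params.forEach((value, key) => {
    if (key.startsWith(HEADER_PREFIX) && key.length > HEADER_PREFIX.length) {
      headers[key.slice(HEADER_PREFIX.length)] = value;
    }
  });

  return {
    target,
    cacheTtl,
    noCache: params.get("nocache") === "1",
    purge: params.get("purge") === "1",
    headers,
  };
}
```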

Authentication

HTTP Basic Auth is required, sent via the Authorization header.

Example

curl -u user:pass "https://<your-worker>.workers.dev/proxy?url=https%3A%2F%2Fexample.com"
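
The same request can be built from TypeScript. The sketch below is a hypothetical client helper (`buildProxyRequest` is not part of this project) that URL-encodes the target and builds the Basic Auth header. The worker origin in the usage note is a placeholder.

```typescript
// Hypothetical client-side helper; not part of the url-reader codebase.
function buildProxyRequest(
  workerOrigin: string,
  targetUrl: string,
  user: string,
  pass: string
): { url: string; headers: Record<string, string> } {
  const endpoint = new URL("/proxy", workerOrigin);
  // searchParams.set() percent-encodes the value, satisfying the
  // "URL-encoded" requirement on the url parameter.
  endpoint.searchParams.set("url", targetUrl);

  // Basic Auth is base64("user:pass"). btoa() only accepts Latin-1 input,
  // so credentials with other characters would need extra encoding.
  const token = btoa(`${user}:${pass}`);

  return {
    url: endpoint.toString(),
    headers: { Authorization: `Basic ${token}` },
  };
}
```

Pass the result to fetch, for example `fetch(req.url, { headers: req.headers })` where `req = buildProxyRequest("https://my-worker.example.workers.dev", "https://example.com", "user", "pass")`.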

How It Works

  1. A request arrives with the target URL
  2. The Worker performs a regular fetch with bot user-agent headers
  3. If a 403 or a paywall redirect is detected, it falls back to browser rendering
  4. Tracking scripts and ads are stripped from the content
  5. Reader styles are injected
  6. The response is cached and returned
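
Steps 4 and 5 can be pictured with a simplified, string-based sketch. This is an assumption-heavy illustration: the tracker hints, the CSS, and both function names are made up for the example, and regex-based HTML editing is crude. The actual Worker may well use a streaming HTML rewriter instead.

```typescript
// Illustrative hints only; the project's real block list is not documented here.
const TRACKER_HINTS = ["googletagmanager.com", "google-analytics.com"];

// Placeholder reader stylesheet, assumed for this example.
const READER_CSS =
  "body{max-width:42rem;margin:2rem auto;line-height:1.6;font-family:Georgia,serif}";

function stripTrackingScripts(html: string): string {
  // Visit every <script>...</script> element and drop those that
  // mention one of the tracker hints; keep all other scripts.
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, (tag) =>
    TRACKER_HINTS.some((hint) => tag.includes(hint)) ? "" : tag
  );
}

function injectReaderStyles(html: string): string {
  const styleTag = `<style>${READER_CSS}</style>`;
  // Insert just before </head> when present, otherwise prepend.
  return /<\/head>/i.test(html)
    ? html.replace(/<\/head>/i, () => `${styleTag}</head>`)
    : styleTag + html;
}
```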

Browser Fallback

Triggered automatically when:

  • Response status is 403 (blocked)
  • Redirect to paywall patterns (/nocookies, /subscribe, /login, etc.)

Uses Cloudflare Browser Rendering to execute JavaScript and bypass client-side blocks.
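
The trigger conditions above could be expressed roughly as follows. This is a sketch under stated assumptions: `needsBrowserFallback` is a hypothetical name, the pattern list contains only the three paths named above (the "etc." is left open), and matching on the final URL's path is one possible way to detect a paywall redirect.

```typescript
// Only the patterns listed in this README; the real list may be longer.
const PAYWALL_PATH_PATTERNS = ["/nocookies", "/subscribe", "/login"];

function needsBrowserFallback(
  status: number,
  requestedUrl: string,
  finalUrl: string
): boolean {
  // A 403 means the plain fetch was blocked outright.
  if (status === 403) return true;

  // No redirect happened, so there is no paywall redirect to detect.
  if (requestedUrl === finalUrl) return false;

  // Assumption: a redirect counts as a paywall redirect when the
  // final URL's path contains one of the known patterns.
  const path = new URL(finalUrl).pathname.toLowerCase();
  return PAYWALL_PATH_PATTERNS.some((pattern) => path.includes(pattern));
}
```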

Development

pnpm install
pnpm dev

Deploy

pnpm deploy

Regenerate Types

pnpm cf-typegen
