Scrappy

Bulk job-board scraper with 100+ sites, email enrichment, deterministic quality scoring, and multi-format exports.

TypeScript, Python, Web Scraping, Email Enrichment, Data Engineering, Parquet
Timeline: 2025 - Present
Role: Creator
Team: Solo
Status: in-progress

Overview

Scrappy is a high-throughput job-board scraper that covers 100+ sites, with deterministic quality scoring, email enrichment, and multi-format exports. It is built for bulk, scheduled runs, with per-site rate limiting, proxy pools, and resume support.

Key Features

100+ Job Boards

Scrapes listings from more than 100 job boards concurrently, with per-site rate limiting and proxy rotation.
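As a rough illustration of per-site rate limiting and proxy rotation, here is a minimal Python sketch. It is not Scrappy's actual code: the class names `SiteRateLimiter` and `ProxyPool`, the single shared interval, and the round-robin policy are all assumptions made for the example.

```python
import time
from itertools import cycle
from urllib.parse import urlparse


class SiteRateLimiter:
    """Enforces a minimum delay between requests to the same host (hypothetical helper)."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def wait(self, url):
        """Block until the URL's host may be hit again; return the delay applied."""
        host = urlparse(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following request one interval after this one starts.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval_s
        return delay


class ProxyPool:
    """Round-robin rotation over a fixed list of proxy URLs (assumed policy)."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy pool is empty")
        self._cycle = cycle(proxies)

    def next(self):
        return next(self._cycle)
```

Because the delay is tracked per host, a slow board does not hold up requests to the others. The injectable `clock` and `sleep` arguments are only there to make the limiter testable without real waiting.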

Email Enrichment

Automatically enriches job listings with recruiter contact information using multiple enrichment strategies.
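The page does not name the enrichment strategies, so the following Python sketch only illustrates the general "try several strategies in order" idea. Both strategies (`from_description`, `from_directory`) and the `contact_email` field are hypothetical and chosen purely for the example.

```python
import re

# Deliberately simple pattern; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def from_description(listing):
    """Hypothetical strategy: pull the first email address out of the listing text."""
    match = EMAIL_RE.search(listing.get("description", ""))
    return match.group(0) if match else None


def from_directory(directory):
    """Hypothetical strategy: look the company up in a caller-supplied contact table."""
    def strategy(listing):
        return directory.get(listing.get("company", "").lower())
    return strategy


def enrich(listing, strategies):
    """Return a copy of the listing with the first contact any strategy finds."""
    enriched = dict(listing)
    for strategy in strategies:
        email = strategy(listing)
        if email:
            enriched["contact_email"] = email
            break
    return enriched
```

Ordering the strategies from cheapest to most expensive means later ones only run when earlier ones find nothing.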

Quality Scoring

A deterministic scoring algorithm ranks listings by relevance, freshness, and completeness, so the same input always produces the same ranking.
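To make the idea concrete, here is a minimal Python sketch of a deterministic score built from the three named factors. The weights, the required-field list, the 30-day freshness window, and the keyword-based relevance measure are assumptions for illustration, not Scrappy's actual formula. Passing the reference time in as `as_of`, rather than reading the clock, is what keeps the result reproducible.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = ("title", "company", "location", "description", "url")  # assumed
WEIGHTS = {"relevance": 0.5, "freshness": 0.3, "completeness": 0.2}  # assumed


def relevance(listing, keywords):
    """Fraction of keywords that appear in the title or description."""
    if not keywords:
        return 0.0
    text = f"{listing.get('title', '')} {listing.get('description', '')}".lower()
    return sum(1 for k in keywords if k.lower() in text) / len(keywords)


def freshness(listing, as_of, max_age_days=30):
    """1.0 for a listing posted at as_of, falling linearly to 0.0 at max_age_days."""
    age_days = (as_of - listing["posted_at"]).total_seconds() / 86400
    return min(1.0, max(0.0, 1.0 - age_days / max_age_days))


def completeness(listing):
    """Fraction of the required fields that are present and non-empty."""
    return sum(1 for f in REQUIRED_FIELDS if listing.get(f)) / len(REQUIRED_FIELDS)


def quality_score(listing, keywords, as_of):
    """Weighted sum of the three factors, rounded so ties compare cleanly."""
    parts = {
        "relevance": relevance(listing, keywords),
        "freshness": freshness(listing, as_of),
        "completeness": completeness(listing),
    }
    return round(sum(WEIGHTS[name] * parts[name] for name in WEIGHTS), 4)


def rank(listings, keywords, as_of):
    """Sort best-first; the URL tie-breaker keeps the order stable across runs."""
    return sorted(
        listings,
        key=lambda l: (-quality_score(l, keywords, as_of), l.get("url", "")),
    )
```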

Multi-Format Export

Exports results to CSV, JSONL, XLSX, and Parquet, ready for analysis or pipeline consumption.
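A minimal Python sketch of the export step might dispatch on format name as below. Only CSV and JSONL are shown because they need nothing beyond the standard library; XLSX and Parquet writers would typically rely on third-party packages and are left out. The function names and the `EXPORTERS` table are hypothetical.

```python
import csv
import io
import json


def to_csv(rows, fieldnames):
    """Serialize dict rows to CSV text; unknown keys are dropped, missing ones left blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_jsonl(rows, fieldnames=None):
    """Serialize dict rows to JSON Lines, one object per line, keys sorted for stable output."""
    return "".join(
        json.dumps(row, sort_keys=True, ensure_ascii=False) + "\n" for row in rows
    )


# Hypothetical registry mapping a format name to its serializer.
EXPORTERS = {"csv": to_csv, "jsonl": to_jsonl}


def export(rows, fmt, fieldnames):
    """Look up the serializer for fmt and run it."""
    try:
        exporter = EXPORTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return exporter(rows, fieldnames)
```

A registry like this keeps each format isolated, so adding another writer does not touch the existing ones.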

Resume Support

Scrappy can resume interrupted scraping sessions, so long-running jobs survive connection drops.
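One common way to get this behavior is a checkpoint file that records which URLs have already been processed. The Python sketch below assumes that approach; the `Checkpoint` class, the JSON file layout, and the `run` loop are hypothetical and not taken from Scrappy itself.

```python
import json
import os


class Checkpoint:
    """Persists the set of completed URLs so a later run can skip them."""

    def __init__(self, path):
        self.path = path
        self.done = set()
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.done = set(json.load(f))

    def mark_done(self, url):
        self.done.add(url)
        # Write to a temp file, then rename, so a crash never leaves a half-written file.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(sorted(self.done), f)
        os.replace(tmp, self.path)


def run(urls, checkpoint, fetch):
    """Fetch every URL not yet in the checkpoint; return the ones processed this run."""
    processed = []
    for url in urls:
        if url in checkpoint.done:
            continue
        fetch(url)  # may raise; the URL is only recorded after it succeeds
        checkpoint.mark_done(url)
        processed.append(url)
    return processed
```

If `fetch` raises partway through, everything already marked stays on disk, and the next call to `run` with a fresh `Checkpoint` on the same path picks up at the first unfinished URL.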

Architecture

Scrappy uses a modular architecture with:

Tech Stack