
FAEA: High-Fidelity Autonomous Extraction Agent

Overview

FAEA is a hybrid extraction system designed to defeat advanced bot mitigation (Cloudflare, Akamai, etc.) using a "Headless-Plus" architecture. It combines full-browser fidelity (Camoufox/Playwright) for authentication with high-speed clients (curl_cffi) for data extraction.

Features

  • Bifurcated Execution: Browser for Auth, Curl for Extraction.
  • TLS Fingerprint Alignment: Browser and Extractor both mimic Chrome/124.
  • Evasion:
    • GhostCursor: Human-like mouse movements (Bezier curves, Fitts's Law).
    • EntropyScheduler: Jittered request timing (Gaussian + Phase Drift).
    • Mobile Proxy Rotation: Sticky session management.
  • Production Ready:
    • Docker Swarm/Compose scaling.
    • Redis-backed persistent task queues.
    • Prometheus/Grafana monitoring.
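The jittered timing attributed to EntropyScheduler above can be illustrated with a minimal sketch. This is not the project's implementation: the function name `jittered_delays`, every parameter, and the reading of "phase drift" as a slow sinusoidal shift of the mean delay are assumptions made for illustration.

```python
import math
import random


def jittered_delays(n, base=2.0, sigma=0.5, drift_amp=0.5, drift_period=50,
                    min_delay=0.2, seed=None):
    """Yield n delays in seconds: Gaussian noise around a slowly drifting mean.

    All names and defaults here are hypothetical, not taken from FAEA.
    """
    rng = random.Random(seed)
    for i in range(n):
        # Assumed interpretation of "phase drift": the mean delay follows a
        # slow sine wave instead of staying fixed.
        mean = base + drift_amp * math.sin(2 * math.pi * i / drift_period)
        # Gaussian jitter around that mean, clamped so no delay drops
        # below min_delay.
        yield max(min_delay, rng.gauss(mean, sigma))
```

A caller would sleep for each yielded value between requests, for example `for d in jittered_delays(10): time.sleep(d)`. Passing a `seed` makes the sequence reproducible, which helps in unit tests.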

Getting Started

Prerequisites

  • Docker & Docker Compose
  • Redis (optional; a Redis service is already included in the Compose stack)

Quick Start (Dev)

docker-compose up --build

Production Usage

1. Scaling the Cluster

The infrastructure is designed to scale horizontally.

# Scale to 5 Browsers and 20 Extractors
docker-compose up -d --scale camoufox-pool=5 --scale curl-pool=20

2. Monitoring

Access the dashboards:

  • Grafana: http://localhost:3000 (default credentials: admin/admin)
  • Prometheus: http://localhost:9090
  • Tracked metrics: authentication success rate, session duration, extraction throughput.

3. Task Dispatch Configuration

Tasks are dispatched via the Redis list task_queue. Each entry is a JSON payload in the following format:

{
  "type": "auth",
  "url": "https://example.com/login",
  "session_id": "sess_123"
}
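
As a rough sketch of how a producer might enqueue this payload from Python: the helper names `build_task` and `enqueue_task` are hypothetical, and the use of RPUSH (rather than LPUSH) is an assumption, since the README does not say which end of the list the workers read from. The client only needs an `rpush(name, value)` method, which redis-py's `Redis` class provides.

```python
import json


def build_task(task_type, url, session_id):
    """Return a JSON string in the payload format shown above."""
    return json.dumps({"type": task_type, "url": url, "session_id": session_id})


def enqueue_task(client, payload, queue="task_queue"):
    """Append one payload to the Redis list and return the new list length.

    RPUSH is an assumption; switch to LPUSH if the workers pop from the
    other end of the list.
    """
    return client.rpush(queue, payload)


# Usage with redis-py (host and port are assumptions):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   enqueue_task(client, build_task("auth", "https://example.com/login", "sess_123"))
```

Keeping payload construction separate from the push makes the format easy to test without a running Redis instance.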

Architecture

  • src/browser/: Camoufox (Firefox/Chrome) manager for auth.
  • src/extractor/: Curl Client for high-speed extraction.
  • src/core/: Shared logic (Session, Scheduler, Recovery).
  • src/orchestrator/: Worker loops and task management.
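
To make the orchestrator's role concrete, here is a minimal sketch of routing a queue payload by its "type" field. It does not reflect the actual code in src/orchestrator/: the names `handle_auth`, `HANDLERS`, and `process_payload` are hypothetical, and the handler body is a placeholder.

```python
import json


def handle_auth(task):
    # Placeholder only: a real handler would presumably drive the browser
    # manager in src/browser/ and persist the resulting session.
    return f"auth:{task['session_id']}"


# Only the "auth" type appears in the README; other types would be added here.
HANDLERS = {"auth": handle_auth}


def process_payload(raw, handlers=HANDLERS):
    """Decode one queue payload and route it by its "type" field."""
    task = json.loads(raw)
    handler = handlers.get(task.get("type"))
    if handler is None:
        raise ValueError(f"unknown task type: {task.get('type')!r}")
    return handler(task)
```

A worker loop would call `process_payload` on each item popped from task_queue. Rejecting unknown types explicitly keeps malformed tasks from being silently dropped.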

Testing

Run unit tests:

./venv/bin/pytest tests/unit/