FAEA: High-Fidelity Autonomous Extraction Agent
Overview
FAEA is a hybrid extraction system designed to defeat advanced bot mitigation (Cloudflare, Akamai, etc.) using a "Headless-Plus" architecture. It combines full-browser fidelity (Camoufox/Playwright) for authentication with high-speed clients (curl_cffi) for data extraction.
Features
- Bifurcated Execution: Browser for Auth, Curl for Extraction.
- TLS Fingerprint Alignment: Browser and Extractor both mimic Chrome/124.
- Evasion:
- GhostCursor: Human-like mouse movements (Bezier curves, Fitts's Law).
- EntropyScheduler: Jittered request timing (Gaussian + Phase Drift).
- Mobile Proxy Rotation: Sticky session management.
- Production Ready:
- Docker Swarm/Compose scaling.
- Redis-backed persistent task queues.
- Prometheus/Grafana monitoring.
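The jittered timing listed under EntropyScheduler can be illustrated with a short sketch. This is not the project's implementation: the class name `JitteredScheduler`, its parameters, and the choice of a sinusoidal term to stand in for "phase drift" are all assumptions made for illustration.

```python
import math
import random


class JitteredScheduler:
    """Hypothetical sketch: Gaussian jitter plus a slowly drifting phase term."""

    def __init__(self, base_delay=2.0, sigma=0.5, drift_amplitude=0.75,
                 phase_step=0.3, min_delay=0.25, seed=None):
        self.base_delay = base_delay            # mean delay in seconds
        self.sigma = sigma                      # std. dev. of the Gaussian noise
        self.drift_amplitude = drift_amplitude  # size of the slow drift
        self.phase_step = phase_step            # nominal phase advance per call
        self.min_delay = min_delay              # floor so delays stay positive
        self._rng = random.Random(seed)
        self._phase = self._rng.uniform(0.0, 2.0 * math.pi)

    def next_delay(self):
        # Gaussian noise around the base delay.
        noise = self._rng.gauss(0.0, self.sigma)
        # Slow sinusoidal drift so the local mean wanders over time.
        drift = self.drift_amplitude * math.sin(self._phase)
        # Advance the phase by a randomized step so the period is not fixed.
        step = self._rng.uniform(0.5, 1.5) * self.phase_step
        self._phase = (self._phase + step) % (2.0 * math.pi)
        # Clamp to the floor and return the delay in seconds.
        return max(self.min_delay, self.base_delay + noise + drift)
```

A caller would typically sleep for `scheduler.next_delay()` seconds between requests. Passing a `seed` makes the sequence reproducible, which is convenient in unit tests.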
Getting Started
Prerequisites
- Docker & Docker Compose
- Redis (optional, included in compose)
Quick Start (Dev)
docker-compose up --build
Production Usage
1. Scaling the Cluster
The infrastructure is designed to scale horizontally.
# Scale to 5 Browsers and 20 Extractors
docker-compose up -d --scale camoufox-pool=5 --scale curl-pool=20
2. Monitoring
Access the dashboards:
- Grafana: http://localhost:3000 (Default creds: admin/admin)
- Prometheus: http://localhost:9090
- Metrics: Authentication Success Rate, Session Duration, Extraction Throughput.
3. Task Dispatch Configuration
Tasks are dispatched by pushing JSON payloads onto the Redis list named task_queue.
Payload format:
{
  "type": "auth",
  "url": "https://example.com/login",
  "session_id": "sess_123"
}
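A payload like the one above could be enqueued from Python roughly as follows. This is a sketch under assumptions: the helper names `build_auth_task` and `dispatch_task` are hypothetical, and whether the workers expect `RPUSH` or `LPUSH` is not stated in this README, so `rpush` is used only as an example.

```python
import json

QUEUE_NAME = "task_queue"  # list name taken from this README


def build_auth_task(url, session_id):
    """Build a payload with the three fields shown in the README."""
    return {"type": "auth", "url": url, "session_id": session_id}


def dispatch_task(client, task, queue=QUEUE_NAME):
    """Serialize the task as JSON and append it to the Redis list.

    `client` is any object with a redis-py style `rpush(name, value)` method.
    Assumption: the push side (RPUSH here) matches how the workers pop.
    """
    return client.rpush(queue, json.dumps(task))


# Example usage with redis-py (requires a reachable Redis server):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   dispatch_task(client, build_auth_task("https://example.com/login", "sess_123"))
```

Taking the client as a parameter keeps the helper easy to test with a stub instead of a live Redis instance.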
Architecture
- src/browser/: Camoufox (Firefox/Chrome) manager for auth.
- src/extractor/: Curl Client for high-speed extraction.
- src/core/: Shared logic (Session, Scheduler, Recovery).
- src/orchestrator/: Worker loops and task management.
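The worker loop in src/orchestrator/ is not shown in this README. The sketch below only illustrates the general shape such a loop might take: pop one JSON task from the Redis list and route it by its "type" field. The function name `process_one`, the `handlers` mapping, and the use of `LPOP` are assumptions; "auth" is the only task type documented here.

```python
import json


def process_one(client, handlers, queue="task_queue"):
    """Pop one task from the list and route it by its "type" field.

    `client` is any object with a redis-py style `lpop(name)` method.
    Returns True if a task was handled, False if the queue was empty.
    """
    raw = client.lpop(queue)  # returns None when the list is empty
    if raw is None:
        return False
    task = json.loads(raw)    # json.loads accepts both str and bytes
    handler = handlers.get(task.get("type"))
    if handler is None:
        raise ValueError(f"no handler for task type {task.get('type')!r}")
    handler(task)
    return True
```

A real worker would call this in a loop, probably with a blocking pop and error recovery. Routing through a dictionary keeps each task type's logic in its own function.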
Testing
Run unit tests:
./venv/bin/pytest tests/unit/