FAEA/implementation_plan.md

Phase 4: Deployment & Optimization Implementation Plan

Goal Description

Transition the system from a functional prototype to a scalable, production-ready extraction grid. This involves:

  1. Scaling: Configuring Docker Compose for high concurrency (5 Browsers, 20 Extractors).
  2. Resilience: Implementing persistent task queues and auto-recovery logic.
  3. Observability: Integrating Prometheus metrics for monitoring health and success rates.

User Review Required

Note

  • Monitoring: We will add prometheus and grafana containers to docker-compose.yml to support the metrics collected by src/core/monitoring.py.
  • Task Loops: We will introduce a new entry point, src/orchestrator/worker.py, to act as the persistent long-running process consuming from Redis.

Proposed Changes

Infrastructure

[UPDATE] docker-compose.yml

  • Services:
    • camoufox: Scale to 5 replicas. Set shm_size: 2gb. Limit CPU/Mem.
    • extractor: Scale to 20 replicas. Limit resources.
    • prometheus: Add service for metrics collection.
    • grafana: Add service for visualization.
    • redis: Optimize config.

Core Tier (Orchestration & Monitoring)

[NEW] src/core/monitoring.py

  • Class: MetricsCollector
  • Metrics:
    • auth_attempts (Counter)
    • session_duration (Histogram)
    • extraction_throughput (Counter)
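
The shape of MetricsCollector could look roughly like the dependency-free sketch below. It is an in-memory stand-in, not the planned implementation: in practice the Counter and Histogram types would more likely come from a Prometheus client library. The method names, the outcome label, the _total suffixes, and the bucket boundaries are all assumptions made for illustration.

```python
import bisect
import threading
from collections import defaultdict


class MetricsCollector:
    """In-memory stand-in for the three metrics named in the plan."""

    # Assumed bucket upper bounds in seconds; the plan does not specify any.
    DURATION_BUCKETS = (1, 5, 30, 60, 300, 900)

    def __init__(self):
        self._lock = threading.Lock()
        self.auth_attempts = defaultdict(int)   # counter, keyed by outcome
        self.extraction_throughput = 0          # counter
        # One slot per finite bucket plus a final slot for +Inf.
        self._duration_counts = [0] * (len(self.DURATION_BUCKETS) + 1)
        self._duration_sum = 0.0

    def record_auth_attempt(self, outcome):
        with self._lock:
            self.auth_attempts[outcome] += 1

    def record_extraction(self, count=1):
        with self._lock:
            self.extraction_throughput += count

    def observe_session_duration(self, seconds):
        # bisect_left returns the first bucket whose upper bound is >= seconds.
        index = bisect.bisect_left(self.DURATION_BUCKETS, seconds)
        with self._lock:
            self._duration_counts[index] += 1
            self._duration_sum += seconds

    def render(self):
        """Return all metrics in the Prometheus text exposition format."""
        with self._lock:
            lines = ["# TYPE auth_attempts_total counter"]
            for outcome, count in sorted(self.auth_attempts.items()):
                lines.append(f'auth_attempts_total{{outcome="{outcome}"}} {count}')
            lines.append("# TYPE extraction_throughput_total counter")
            lines.append(f"extraction_throughput_total {self.extraction_throughput}")
            lines.append("# TYPE session_duration histogram")
            cumulative = 0  # histogram buckets are cumulative
            for bound, count in zip(self.DURATION_BUCKETS, self._duration_counts):
                cumulative += count
                lines.append(f'session_duration_bucket{{le="{bound}"}} {cumulative}')
            total = sum(self._duration_counts)
            lines.append(f'session_duration_bucket{{le="+Inf"}} {total}')
            lines.append(f"session_duration_sum {self._duration_sum}")
            lines.append(f"session_duration_count {total}")
        return "\n".join(lines) + "\n"
```

Keeping all writes behind one lock keeps the sketch safe if a worker records metrics from several threads; a client library would handle that internally.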

[NEW] src/orchestrator/worker.py

  • Class: TaskWorker
  • Features:
    • Infinite loop consuming tasks from Redis lists (BLPOP).
    • Dispatch logic: auth -> CamoufoxManager, extract -> CurlClient.
    • Integration with SessionRecoveryManager for handling failures.
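
A minimal sketch of the consume-and-dispatch loop, under several assumptions: tasks are JSON objects with a "type" field, the queue name faea:tasks is hypothetical, the handlers are plain callables standing in for CamoufoxManager and CurlClient, and recovery is a callable standing in for SessionRecoveryManager. The client only needs a blpop(keys, timeout) method that returns a (key, value) pair, or None on timeout, which is how the redis-py client behaves. FakeRedis exists only so the sketch runs without a server.

```python
import json


class TaskWorker:
    """Consumes JSON tasks from a Redis list and dispatches them by type."""

    def __init__(self, redis_client, handlers, recovery=None, queue="faea:tasks"):
        self.redis = redis_client   # anything exposing blpop(keys, timeout)
        self.handlers = handlers    # e.g. {"auth": ..., "extract": ...}
        self.recovery = recovery    # optional callable(task, exc)
        self.queue = queue          # hypothetical queue name
        self.running = True

    def run_once(self, timeout=5):
        """Wait up to `timeout` seconds; return True if a task was handled."""
        item = self.redis.blpop([self.queue], timeout=timeout)
        if item is None:            # BLPOP timed out, nothing to do
            return False
        _key, raw = item
        task = json.loads(raw)
        try:
            # An unknown task type raises KeyError and is treated as a failure.
            self.handlers[task["type"]](task)
        except Exception as exc:
            if self.recovery is None:
                raise
            self.recovery(task, exc)
        return True

    def run_forever(self):
        # The long-running loop; set `running` to False to stop it.
        while self.running:
            self.run_once()


class FakeRedis:
    """Test double that mimics the return shape of BLPOP."""

    def __init__(self, items):
        self.items = list(items)

    def blpop(self, keys, timeout=0):
        return (keys[0], self.items.pop(0)) if self.items else None


if __name__ == "__main__":
    handled = []
    worker = TaskWorker(
        FakeRedis([b'{"type": "auth"}', b'{"type": "extract"}']),
        {"auth": handled.append, "extract": handled.append},
    )
    while worker.run_once():
        pass
    print(handled)
```

One design point worth weighing: BLPOP removes the task from the list before it is processed, so a worker that crashes mid-task loses it. If that matters for the resilience goal, moving the task to a per-worker processing list (for example with BLMOVE) is a common alternative.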

[NEW] src/core/recovery.py

  • Class: SessionRecoveryManager
  • Features:
    • Handle cf_clearance_expired, ip_reputation_drop, etc.
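
One way to structure this is a registry that maps each failure reason to a recovery strategy and caps the number of attempts per session. The sketch below assumes that shape; the register/recover/reset method names, the max_attempts limit, and the example strategies (re-authenticating, rotating a proxy) are illustrative guesses rather than decisions taken in this plan.

```python
class SessionRecoveryManager:
    """Maps failure reasons to recovery strategies, with an attempt limit."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self._strategies = {}   # reason -> callable(session_id)
        self._attempts = {}     # (session_id, reason) -> attempts so far

    def register(self, reason, strategy):
        self._strategies[reason] = strategy

    def recover(self, session_id, reason):
        """Run the strategy for `reason`; return False when giving up."""
        strategy = self._strategies.get(reason)
        if strategy is None:
            return False        # no known way to recover from this reason
        key = (session_id, reason)
        self._attempts[key] = self._attempts.get(key, 0) + 1
        if self._attempts[key] > self.max_attempts:
            return False        # retry budget exhausted for this session
        strategy(session_id)
        return True

    def reset(self, session_id):
        """Forget attempt counts once a session is healthy again."""
        for key in [k for k in self._attempts if k[0] == session_id]:
            del self._attempts[key]


if __name__ == "__main__":
    actions = []
    manager = SessionRecoveryManager(max_attempts=2)
    # Hypothetical strategies; real ones would call into the browser tier.
    manager.register("cf_clearance_expired", lambda sid: actions.append(("reauth", sid)))
    manager.register("ip_reputation_drop", lambda sid: actions.append(("rotate_proxy", sid)))
    manager.recover("session-1", "cf_clearance_expired")
    print(actions)
```

Returning a boolean lets the caller decide what to do when recovery is abandoned, such as requeueing the task or marking the session dead.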

Documentation

[UPDATE] README.md

  • Add "Production Usage" section.
  • Document how to scale and monitor.

Verification Plan

Automated Tests

  • Integration: Verify Worker picks up task from Redis.
  • Metrics: Verify that the /metrics endpoint is exposed and that Prometheus scrapes it successfully.
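
The metrics check could be automated along these lines. This is a sketch using only the standard library: fetch_metrics and missing_metrics are hypothetical helper names, and the prefix match assumes the exposed sample names begin with the metric names listed in this plan (possibly followed by suffixes such as _total or _bucket).

```python
import urllib.request

# Metric names taken from the plan; exposed names may carry suffixes.
EXPECTED = ("auth_attempts", "session_duration", "extraction_throughput")


def fetch_metrics(url, timeout=5):
    """Return the body of a /metrics endpoint as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def missing_metrics(exposition_text, expected=EXPECTED):
    """Return the expected metric names that have no sample line."""
    sample_names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line is "name{labels} value" or "name value".
        sample_names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return [
        name for name in expected
        if not any(sample.startswith(name) for sample in sample_names)
    ]
```

An integration test would call fetch_metrics against the running worker's /metrics URL and assert that missing_metrics returns an empty list.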

Manual Verification

  • Run docker-compose up --scale camoufox=5 --scale extractor=20 and confirm that the stack remains stable at full scale.
  • Check Grafana dashboard for metric data flow.