Phase 4: Deployment & Optimization Implementation Plan
Goal Description
Transition the system from a functional prototype to a scalable, production-ready extraction grid. This involves:
- Scaling: Configuring Docker Compose for high concurrency (5 Browsers, 20 Extractors).
- Resilience: Implementing persistent task queues and auto-recovery logic.
- Observability: Integrating Prometheus metrics for monitoring health and success rates.
User Review Required
Note
Monitoring: We will add `prometheus` and `grafana` containers to `docker-compose.yml` to support the metrics collected by `src/core/monitoring.py`.
Task Loops: We will introduce a new entry point, `src/orchestrator/worker.py`, to act as the persistent long-running process consuming from Redis.
Proposed Changes
Infrastructure
[UPDATE] docker-compose.yml
- Services:
  - `camoufox`: Scale to 5 replicas. Set `shm_size: 2gb`. Limit CPU/Mem.
  - `extractor`: Scale to 20 replicas. Limit resources.
  - `prometheus`: Add service for metrics collection.
  - `grafana`: Add service for visualization.
  - `redis`: Optimize config.
Core Tier (Orchestration & Monitoring)
[NEW] src/core/monitoring.py
- Class: `MetricsCollector`
- Metrics:
  - `auth_attempts` (Counter)
  - `session_duration` (Histogram)
  - `extraction_throughput` (Counter)
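The three metrics listed above could be sketched as follows. This is a dependency-free illustration, not the planned implementation: the bucket boundaries, the method names (`inc`, `observe_session_duration`, `render`), and the choice to hand-roll the Prometheus text format are all assumptions. A real `MetricsCollector` would more likely wrap a Prometheus client library.

```python
import bisect
import threading
from typing import Dict, List, Sequence


class MetricsCollector:
    """Illustrative stand-in: two counters and one histogram, rendered as text."""

    # Histogram bucket upper bounds in seconds (assumed values).
    DEFAULT_BUCKETS: Sequence[float] = (1.0, 10.0, 60.0, 600.0)

    def __init__(self, buckets: Sequence[float] = DEFAULT_BUCKETS) -> None:
        self._lock = threading.Lock()
        self._counters: Dict[str, float] = {
            "auth_attempts": 0.0,
            "extraction_throughput": 0.0,
        }
        self._bounds: List[float] = sorted(buckets)
        # One slot per finite bucket, plus a final slot for +Inf.
        self._bucket_counts: List[int] = [0] * (len(self._bounds) + 1)
        self._duration_sum = 0.0

    def inc(self, name: str, amount: float = 1.0) -> None:
        """Increase a known counter; unknown names raise KeyError."""
        if amount < 0:
            raise ValueError("counters can only increase")
        with self._lock:
            self._counters[name] += amount

    def observe_session_duration(self, seconds: float) -> None:
        """Record one session duration in the first bucket whose bound >= value."""
        with self._lock:
            index = bisect.bisect_left(self._bounds, seconds)
            self._bucket_counts[index] += 1
            self._duration_sum += seconds

    def render(self) -> str:
        """Return the metrics in Prometheus text exposition format."""
        with self._lock:
            lines: List[str] = []
            for name, value in self._counters.items():
                lines.append(f"# TYPE {name} counter")
                lines.append(f"{name} {value}")
            lines.append("# TYPE session_duration histogram")
            cumulative = 0  # histogram buckets are cumulative
            for bound, count in zip(self._bounds, self._bucket_counts):
                cumulative += count
                lines.append(f'session_duration_bucket{{le="{bound}"}} {cumulative}')
            cumulative += self._bucket_counts[-1]
            lines.append(f'session_duration_bucket{{le="+Inf"}} {cumulative}')
            lines.append(f"session_duration_sum {self._duration_sum}")
            lines.append(f"session_duration_count {cumulative}")
            return "\n".join(lines) + "\n"


metrics = MetricsCollector()
metrics.inc("auth_attempts")
metrics.inc("auth_attempts")
metrics.observe_session_duration(10.0)
metrics.observe_session_duration(1000.0)
text = metrics.render()
```

The lock is there because several worker threads may update the same collector. Serving `render()` over HTTP at `/metrics` is left out of the sketch.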
[NEW] src/orchestrator/worker.py
- Class: `TaskWorker`
- Features:
  - Infinite loop consuming tasks from Redis lists (`BLPOP`).
  - Dispatch logic: `auth` -> `CamoufoxManager`, `extract` -> `CurlClient`.
  - Integration with `SessionRecoveryManager` for handling failures.
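A minimal sketch of the consume-and-dispatch loop described above. Several things here are assumptions made for illustration: the queue name `tasks`, the JSON task payload with a `type` field, the `on_failure` callback standing in for `SessionRecoveryManager`, and plain callables standing in for `CamoufoxManager` and `CurlClient`, whose real interfaces the plan does not specify. The client is injected so that any object with a Redis-style `blpop(keys, timeout)` method works, including the in-memory `FakeQueue` used in the demo.

```python
import json
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Optional, Tuple

Task = Dict[str, Any]
Handler = Callable[[Task], None]


class TaskWorker:
    """Pops JSON tasks from a list-backed queue and dispatches them by type."""

    def __init__(
        self,
        client: Any,  # anything exposing blpop(keys, timeout), e.g. a Redis client
        handlers: Dict[str, Handler],
        queue: str = "tasks",  # hypothetical queue name
        on_failure: Optional[Callable[[Task, Exception], None]] = None,
    ) -> None:
        self.client = client
        self.handlers = handlers
        self.queue = queue
        self.on_failure = on_failure

    def run_once(self, timeout: int = 5) -> bool:
        """Process at most one task; return False if nothing arrived in time."""
        item = self.client.blpop([self.queue], timeout=timeout)
        if item is None:
            return False
        _key, raw = item
        task: Task = json.loads(raw)
        try:
            # A missing or unknown "type" raises KeyError and is reported below.
            self.handlers[task["type"]](task)
        except Exception as exc:
            if self.on_failure is not None:
                self.on_failure(task, exc)
        return True

    def run_forever(self) -> None:
        """The persistent loop: block on the queue, handle a task, repeat."""
        while True:
            self.run_once()


class FakeQueue:
    """In-memory stand-in for Redis, used only in this demo."""

    def __init__(self) -> None:
        self.items: Deque[Tuple[str, str]] = deque()

    def rpush(self, key: str, value: str) -> None:
        self.items.append((key, value))

    def blpop(self, keys: List[str], timeout: int = 0) -> Optional[Tuple[str, str]]:
        return self.items.popleft() if self.items else None


seen: List[Tuple[str, str]] = []
failures: List[str] = []

queue = FakeQueue()
for payload in ({"type": "auth", "id": "job-1"},
                {"type": "extract", "id": "job-2"},
                {"type": "bogus", "id": "job-3"}):
    queue.rpush("tasks", json.dumps(payload))

worker = TaskWorker(
    queue,
    handlers={
        "auth": lambda t: seen.append(("browser", t["id"])),     # CamoufoxManager stand-in
        "extract": lambda t: seen.append(("http", t["id"])),     # CurlClient stand-in
    },
    on_failure=lambda task, exc: failures.append(task["id"]),
)
while worker.run_once(timeout=1):
    pass
```

Splitting `run_once` from `run_forever` keeps the loop testable: an integration test can push one task and call `run_once` without having to stop an infinite loop.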
[NEW] src/core/recovery.py
- Class: `SessionRecoveryManager`
- Features:
  - Handle `cf_clearance_expired`, `ip_reputation_drop`, etc.
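One way the failure handling above might be organized is a lookup from failure reason to recovery action, with a cap on attempts per session. The plan names only the failure reasons, so everything else in this sketch is hypothetical: the action names (`reauthenticate`, `rotate_proxy`, `retry`, `abandon`), the `max_attempts` limit, and the method names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SessionRecoveryManager:
    """Illustrative policy: pick a recovery action for a failed session."""

    max_attempts: int = 3  # assumed cap before giving up on a session
    # Failure reason -> recovery action. The action names are hypothetical.
    strategies: Dict[str, str] = field(default_factory=lambda: {
        "cf_clearance_expired": "reauthenticate",
        "ip_reputation_drop": "rotate_proxy",
    })
    _attempts: Dict[str, int] = field(default_factory=dict, init=False, repr=False)

    def next_action(self, session_id: str, reason: str) -> str:
        """Count this failure and return the action the caller should take."""
        attempts = self._attempts.get(session_id, 0) + 1
        self._attempts[session_id] = attempts
        if attempts > self.max_attempts:
            return "abandon"
        # Reasons without a dedicated strategy fall back to a plain retry.
        return self.strategies.get(reason, "retry")

    def mark_recovered(self, session_id: str) -> None:
        """Reset the failure count once a session works again."""
        self._attempts.pop(session_id, None)


manager = SessionRecoveryManager(max_attempts=2)
first = manager.next_action("s1", "cf_clearance_expired")
second = manager.next_action("s1", "ip_reputation_drop")
third = manager.next_action("s1", "cf_clearance_expired")
manager.mark_recovered("s1")
after_reset = manager.next_action("s1", "some_other_failure")
```

Returning an action name rather than performing the recovery keeps the policy separate from the browser and proxy code that would carry it out.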
Documentation
[UPDATE] README.md
- Add "Production Usage" section.
- Document how to scale and monitor.
Verification Plan
Automated Tests
- Integration: Verify Worker picks up task from Redis.
- Metrics: Verify that the `/metrics` endpoint is exposed and being scraped.
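The exposure half of the metrics check could be automated along these lines. This is a sketch under assumptions: the helper names are invented, the URL passed to `fetch_metrics` depends on where the worker serves metrics, and a metric counts as present if a sample with its name, or its name plus a suffix such as `_bucket` or `_count`, appears in the response.

```python
import urllib.request
from typing import Iterable, List, Set


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw text served at a /metrics URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition: str) -> Set[str]:
    """Collect sample names from Prometheus text format, skipping comments."""
    names: Set[str] = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Sample lines look like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(exposition: str, expected: Iterable[str]) -> List[str]:
    """Return the expected metrics that have no sample in the exposition."""
    present = metric_names(exposition)
    return sorted(
        name for name in expected
        if not any(p == name or p.startswith(name + "_") for p in present)
    )


sample = (
    "# TYPE auth_attempts counter\n"
    "auth_attempts 2.0\n"
    "# TYPE session_duration histogram\n"
    'session_duration_bucket{le="+Inf"} 1\n'
    "session_duration_sum 3.5\n"
    "session_duration_count 1\n"
)
absent = missing_metrics(
    sample, ["auth_attempts", "session_duration", "extraction_throughput"]
)
```

In a test, `missing_metrics(fetch_metrics(url), [...])` should return an empty list. Confirming that Prometheus is actually scraping the endpoint would need a separate check against the Prometheus server, which this sketch does not cover.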
Manual Verification
- Run `docker-compose up --scale camoufox=5 --scale extractor=20` to verify stability.
- Check Grafana dashboard for metric data flow.