# Phase 4: Deployment & Optimization Implementation Plan

## Goal Description

Transition the system from a functional prototype to a scalable, production-ready extraction grid. This involves:

1. **Scaling**: Configuring Docker Compose for high concurrency (5 browser replicas, 20 extractor replicas).
2. **Resilience**: Implementing persistent task queues and auto-recovery logic.
3. **Observability**: Integrating Prometheus metrics for monitoring health and success rates.

## User Review Required

> [!NOTE]
> **Monitoring**: We will add `prometheus` and `grafana` containers to `docker-compose.yml` to scrape and visualize the metrics collected by `src/core/monitoring.py`.
>
> **Task Loops**: We will introduce a new entry point, `src/orchestrator/worker.py`, as a persistent, long-running process that consumes tasks from Redis.

## Proposed Changes

### Infrastructure

#### [UPDATE] [docker-compose.yml](file:///home/kasm-user/workspace/FAEA/docker-compose.yml)

- **Services**:
  - `camoufox`: Scale to 5 replicas. Set `shm_size: 2gb`. Set CPU and memory limits.
  - `extractor`: Scale to 20 replicas. Set CPU and memory limits.
  - `prometheus`: Add service for metrics collection.
  - `grafana`: Add service for visualization.
  - `redis`: Tune the configuration to support the persistent task queue.

### Core Tier (Orchestration & Monitoring)

#### [NEW] [src/core/monitoring.py](file:///home/kasm-user/workspace/FAEA/src/core/monitoring.py)

- **Class**: `MetricsCollector`
- **Metrics**:
  - `auth_attempts` (Counter)
  - `session_duration` (Histogram)
  - `extraction_throughput` (Counter)
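
To make the metric types above concrete, here is a minimal sketch of what `MetricsCollector` might track. It is a stdlib-only stand-in that renders the Prometheus text exposition format; a real implementation would more likely wrap a Prometheus client library. The method names (`inc`, `observe_session_duration`, `render`) and the bucket boundaries are assumptions, not part of the plan.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds; the plan does not specify them.
DEFAULT_BUCKETS = (1.0, 10.0, 60.0, 600.0)


class MetricsCollector:
    """In-memory stand-in for the three metrics named in the plan."""

    COUNTERS = ("auth_attempts", "extraction_throughput")

    def __init__(self, buckets=DEFAULT_BUCKETS):
        self._counters = {name: 0.0 for name in self.COUNTERS}
        self._buckets = tuple(sorted(buckets))
        # One slot per finite bucket plus a final slot for +Inf.
        self._bucket_counts = [0] * (len(self._buckets) + 1)
        self._duration_sum = 0.0
        self._duration_count = 0

    def inc(self, name, amount=1.0):
        """Increment a counter; counters may only go up."""
        if name not in self._counters:
            raise KeyError(f"unknown counter: {name}")
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self._counters[name] += amount

    def observe_session_duration(self, seconds):
        """Record one session duration in the histogram."""
        # bisect_left finds the first bucket whose upper bound is >= seconds.
        self._bucket_counts[bisect_left(self._buckets, seconds)] += 1
        self._duration_sum += seconds
        self._duration_count += 1

    def render(self):
        """Return all metrics in Prometheus text exposition format."""
        lines = []
        for name, value in self._counters.items():
            lines.append(f"# TYPE {name} counter")
            lines.append(f"{name} {value}")
        lines.append("# TYPE session_duration histogram")
        cumulative = 0  # histogram buckets are cumulative
        for bound, count in zip(self._buckets, self._bucket_counts):
            cumulative += count
            lines.append(f'session_duration_bucket{{le="{bound}"}} {cumulative}')
        lines.append(f'session_duration_bucket{{le="+Inf"}} {self._duration_count}')
        lines.append(f"session_duration_sum {self._duration_sum}")
        lines.append(f"session_duration_count {self._duration_count}")
        return "\n".join(lines) + "\n"
```

Counters fit `auth_attempts` and `extraction_throughput` because both only ever increase, and success rates can be derived from their rate of change at query time. A histogram fits `session_duration` because bucketed observations allow percentile estimates across all replicas.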

#### [NEW] [src/orchestrator/worker.py](file:///home/kasm-user/workspace/FAEA/src/orchestrator/worker.py)

- **Class**: `TaskWorker`
- **Features**:
  - Infinite loop that consumes tasks from Redis lists via blocking pop (`BLPOP`).
  - Dispatch logic: `auth` tasks go to `CamoufoxManager`, `extract` tasks go to `CurlClient`.
  - Integration with `SessionRecoveryManager` for handling failures.
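
The loop and dispatch described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: the queue object is anything with a redis-py style `blpop(keys, timeout)` method, tasks are assumed to be JSON objects with a `type` field, and the `tasks` key name, the handler callables, and the `recovery` callback signature are all hypothetical.

```python
import json


class TaskWorker:
    """Consumes JSON tasks from a Redis list and routes them by type."""

    def __init__(self, queue, handlers, recovery=None, queue_key="tasks"):
        # `queue` is any object exposing a redis-py style blpop(keys, timeout).
        self.queue = queue
        # Maps a task type ("auth", "extract") to a callable taking the task dict.
        self.handlers = handlers
        # Optional callable(task, exc) invoked when handling a task fails.
        self.recovery = recovery
        self.queue_key = queue_key  # hypothetical key name
        self.running = True

    def run_once(self, timeout=5):
        """Block for up to `timeout` seconds; return True if a task was consumed."""
        item = self.queue.blpop([self.queue_key], timeout=timeout)
        if item is None:  # BLPOP timed out with nothing queued
            return False
        _key, raw = item
        task = json.loads(raw)
        try:
            handler = self.handlers.get(task.get("type"))
            if handler is None:
                raise ValueError(f"unknown task type: {task.get('type')!r}")
            handler(task)
        except Exception as exc:
            if self.recovery is None:
                raise
            # Hand the failure to the recovery hook instead of crashing the loop.
            self.recovery(task, exc)
        return True

    def run_forever(self):
        """The long-running loop; set `running = False` to stop it."""
        while self.running:
            self.run_once()
```

In this sketch the `auth` handler would wrap `CamoufoxManager` and the `extract` handler would wrap `CurlClient`. Using a finite `BLPOP` timeout rather than blocking indefinitely lets the loop re-check `running` periodically, which makes a clean shutdown possible.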

#### [NEW] [src/core/recovery.py](file:///home/kasm-user/workspace/FAEA/src/core/recovery.py)

- **Class**: `SessionRecoveryManager`
- **Features**:
  - Handle failure conditions such as `cf_clearance_expired` and `ip_reputation_drop`.
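
One way to structure this is a registry that maps each failure name to a recovery strategy, with a retry cap so a session that cannot be repaired is eventually abandoned. The sketch below assumes that shape; the `register` and `recover` methods, the `max_attempts` limit, and the idea of per-session attempt tracking are illustrative assumptions rather than details from the plan.

```python
class SessionRecoveryManager:
    """Maps failure names to recovery strategies, with a per-session retry cap."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts  # hypothetical limit
        self._strategies = {}
        self._attempts = {}

    def register(self, failure, strategy):
        """Associate a failure name with a callable taking the session id."""
        self._strategies[failure] = strategy

    def recover(self, session_id, failure):
        """Run the strategy for `failure`; return False if none applies or the cap is hit."""
        strategy = self._strategies.get(failure)
        if strategy is None:
            return False
        key = (session_id, failure)
        attempts = self._attempts.get(key, 0)
        if attempts >= self.max_attempts:
            return False
        self._attempts[key] = attempts + 1
        strategy(session_id)
        return True

    def reset(self, session_id, failure):
        """Clear the attempt count, for example after a confirmed successful recovery."""
        self._attempts.pop((session_id, failure), None)
```

For example, `cf_clearance_expired` might be registered with a strategy that enqueues a fresh `auth` task, and `ip_reputation_drop` with one that switches the session to a different proxy. Both strategies are hypothetical here; the plan names the failure conditions but not the responses.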

### Documentation

#### [UPDATE] [README.md](file:///home/kasm-user/workspace/FAEA/README.md)

- Add "Production Usage" section.
- Document how to scale and monitor.

## Verification Plan

### Automated Tests

- **Integration**: Verify that the worker picks up a task from Redis.
- **Metrics**: Verify that the `/metrics` endpoint is exposed and that Prometheus scrapes it.
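
The metrics check could be automated with a small helper that fetches the endpoint and confirms the expected metric families are declared. The sketch below assumes the exposed names match the plan exactly; some client libraries append suffixes such as `_total` to counters, in which case the expected set would need adjusting. The function names and the example URL are hypothetical.

```python
from urllib.request import urlopen

# Metric names taken from the plan; exposed names may differ by client library.
EXPECTED_METRICS = {"auth_attempts", "session_duration", "extraction_throughput"}


def metric_families(exposition_text):
    """Return the metric names declared by `# TYPE` lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


def missing_metrics(exposition_text, expected=EXPECTED_METRICS):
    """Return the expected metric names that the endpoint did not declare."""
    return set(expected) - metric_families(exposition_text)


def fetch_metrics(url, timeout=5):
    """Fetch the raw text served by a /metrics endpoint."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A test could then assert that `missing_metrics(fetch_metrics("http://localhost:8000/metrics"))` is empty, with the host and port replaced by wherever the worker actually serves its metrics. This confirms exposure only; whether Prometheus is scraping successfully is better checked from the Prometheus side.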

### Manual Verification

- Run `docker-compose up --scale camoufox=5 --scale extractor=20` and verify that the stack remains stable.
- Check the Grafana dashboard to confirm that metric data is flowing.