Updated implementation_plan.md and walkthrough.md for approval before kicking off Phase 4

Luciabrightcode 2025-12-23 12:18:11 +08:00
parent 4f76105e3d
commit 2ff4d593f8
6 changed files with 125 additions and 27 deletions


@@ -1,29 +1,58 @@
# Phase 3: Evasion & Resilience Implementation Plan (COMPLETED)
# Phase 4: Deployment & Optimization Implementation Plan
## Goal Description
Implement the "Human" behavior layer to defeat behavioral biometrics and temporal analysis.
Transition the system from a functional prototype to a scalable, production-ready extraction grid. This involves:
1. **Scaling**: Configuring Docker Compose for high concurrency (5 Browsers, 20 Extractors).
2. **Resilience**: Implementing persistent task queues and auto-recovery logic.
3. **Observability**: Integrating Prometheus metrics for monitoring health and success rates.
## Completed Changes
## User Review Required
> [!NOTE]
> **Monitoring**: We will add `prometheus` and `grafana` containers to `docker-compose.yml` to support the metrics collected by `src/core/monitoring.py`.
> **Task Loops**: We will introduce a new entry point `src/orchestrator/worker.py` to act as the persistent long-running process consuming from Redis.
### Browser Tier (Human Mimesis)
- **GhostCursorEngine** (`src/browser/ghost_cursor.py`):
  - Implemented composite cubic Bezier curves.
  - Implemented Fitts's Law velocity profiles.
  - Added random micro-movements for human drift simulation.
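For context, the math behind those three bullets can be sketched in a few lines. This is illustrative only, with made-up constants and helper names, not the actual `GhostCursorEngine` internals:

```python
import math
import random

def bezier_point(t: float, p0, p1, p2, p3):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)

def fitts_duration(start, end, target_width: float = 30.0,
                   a: float = 0.1, b: float = 0.15) -> float:
    """Fitts's Law movement time: MT = a + b * log2(D / W + 1).
    a and b are device-dependent constants; the values here are illustrative."""
    distance = math.dist(start, end)
    return a + b * math.log2(distance / target_width + 1)

def with_drift(point, magnitude: float = 1.5):
    """Random micro-movement: a small uniform offset simulating hand tremor."""
    return (point[0] + random.uniform(-magnitude, magnitude),
            point[1] + random.uniform(-magnitude, magnitude))
```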
## Proposed Changes
### Core Tier (Temporal & Network Entropy)
- **EntropyScheduler** (`src/core/scheduler.py`):
  - Implemented Gaussian noise injection ($\sigma=5.0$); see the jitter sketch after this list.
  - Implemented phase shift rotation to prevent harmonic detection.
- **MobileProxyRotator** (`src/core/proxy.py`):
  - Implemented sticky session logic.
  - Implemented cooldown management.
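A minimal sketch of the timing idea, using the $\sigma=5.0$ above; the function name and the phase-rotation policy are assumptions, not the `EntropyScheduler` API:

```python
import random

def next_delay(base_seconds: float, sigma: float = 5.0,
               phase_offset: float = 0.0) -> float:
    """Base interval + Gaussian noise (sigma=5.0 per the plan) + a phase
    offset, so request timestamps never settle into a clean harmonic."""
    return max(0.0, base_seconds + random.gauss(0.0, sigma) + phase_offset)

# Rotating the phase offset every few cycles defeats harmonic detection:
phase = random.uniform(0.0, 30.0)   # re-drawn periodically (illustrative)
delay = next_delay(base_seconds=60.0, phase_offset=phase)
```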
### Infrastructure
#### [UPDATE] [docker-compose.yml](file:///home/kasm-user/workspace/FAEA/docker-compose.yml)
- **Services**:
  - `camoufox`: Scale to 5 replicas. Set `shm_size: 2gb`. Limit CPU/Mem.
  - `extractor`: Scale to 20 replicas. Limit resources.
  - `prometheus`: Add service for metrics collection.
  - `grafana`: Add service for visualization.
  - `redis`: Optimize config.
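As a rough sketch of those entries (image names, tags, and resource numbers are placeholders; assumes Compose v2, where `deploy.replicas` is honored by `docker compose up`):

```yaml
services:
  camoufox:
    image: faea/camoufox:latest    # placeholder image name
    shm_size: "2gb"
    deploy:
      replicas: 5
      resources:
        limits: { cpus: "1.0", memory: "2G" }
  extractor:
    image: faea/extractor:latest   # placeholder image name
    deploy:
      replicas: 20
      resources:
        limits: { cpus: "0.5", memory: "512M" }
  prometheus:
    image: prom/prometheus
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana
    ports: ["3000:3000"]
  redis:
    image: redis:7-alpine
    command: ["redis-server", "--appendonly", "yes"]  # persist queued tasks
```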
### Remediation: TLS Fingerprint Alignment
- **Tuned** `src/browser/manager.py`: Updated to trigger `Chrome/124`.
- **Tuned** `src/extractor/client.py`: Updated to use `chrome124` impersonation to verify consistency.
- **Verified**: Static alignment achieved. Dynamic verification (`tests/manual/verify_tls.py`) confirms the logic but faced prompt-specific network blocks.
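Assuming `curl_cffi` (whose `AsyncSession(impersonate=...)` call appears in the `client.py` hunk below), a one-off consistency check could look like this; the fingerprint-echo URL is one common third-party choice:

```python
import asyncio
from curl_cffi.requests import AsyncSession

async def check_tls_fingerprint() -> None:
    # 'chrome124' must match the pinned Camoufox build so the browser tier
    # and the extractor tier present the same TLS Client Hello.
    async with AsyncSession(impersonate="chrome124") as session:
        # tls.peet.ws echoes the fingerprint the server observed
        resp = await session.get("https://tls.peet.ws/api/all")
        data = resp.json()
        print(data.get("tls", {}).get("ja3"))

asyncio.run(check_tls_fingerprint())
```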
### Core Tier (Orchestration & Monitoring)
#### [NEW] [src/core/monitoring.py](file:///home/kasm-user/workspace/FAEA/src/core/monitoring.py)
- **Class**: `MetricsCollector`
- **Metrics**:
  - `auth_attempts` (Counter)
  - `session_duration` (Histogram)
  - `extraction_throughput` (Counter)
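A sketch of how `MetricsCollector` might wrap `prometheus_client`; the label names, metric suffixes, and port are assumptions:

```python
from prometheus_client import Counter, Histogram, start_http_server

class MetricsCollector:
    """Illustrative sketch, not the actual src/core/monitoring.py."""

    def __init__(self, port: int = 9100) -> None:
        self.auth_attempts = Counter(
            "auth_attempts_total", "Authentication attempts", ["result"])
        self.session_duration = Histogram(
            "session_duration_seconds", "Browser session lifetime in seconds")
        self.extraction_throughput = Counter(
            "extraction_throughput_total", "Records successfully extracted")
        start_http_server(port)  # exposes /metrics for Prometheus to scrape

# Usage: MetricsCollector().auth_attempts.labels(result="success").inc()
```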
## Verification Status
- **Functional**: Components implemented and unit-testable.
- **TLS**: Aligned to Chrome 124 standard.
#### [NEW] [src/orchestrator/worker.py](file:///home/kasm-user/workspace/FAEA/src/orchestrator/worker.py)
- **Class**: `TaskWorker`
- **Features**:
  - Infinite loop consuming tasks from Redis lists (`BLPOP`).
  - Dispatch logic: `auth` -> `CamoufoxManager`, `extract` -> `CurlClient`.
  - Integration with `SessionRecoveryManager` for handling failures.
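A minimal consume-loop sketch, assuming `redis-py`'s asyncio client and a JSON task payload; the queue name and anything not listed in the bullets above is assumed:

```python
import asyncio
import json

import redis.asyncio as redis

async def run_worker(queue: str = "faea:tasks") -> None:
    client = redis.Redis(host="redis", port=6379)
    while True:
        # BLPOP blocks until a task arrives; the timeout keeps the loop
        # responsive to shutdown signals.
        item = await client.blpop(queue, timeout=5)
        if item is None:
            continue
        _key, raw = item
        task = json.loads(raw)
        if task.get("type") == "auth":
            ...  # hand off to CamoufoxManager
        elif task.get("type") == "extract":
            ...  # hand off to CurlClient
        # failures would route through SessionRecoveryManager

if __name__ == "__main__":
    asyncio.run(run_worker())
```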
#### [NEW] [src/core/recovery.py](file:///home/kasm-user/workspace/FAEA/src/core/recovery.py)
- **Class**: `SessionRecoveryManager`
- **Features**:
  - Handle `cf_clearance_expired`, `ip_reputation_drop`, etc.
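One plausible shape for this class, mapping failure signatures to recovery actions; all method names below are hypothetical:

```python
from typing import Callable, Dict

class SessionRecoveryManager:
    """Sketch only: routes known failure modes to recovery actions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], None]] = {
            "cf_clearance_expired": self._reauthenticate,
            "ip_reputation_drop": self._rotate_proxy,
        }

    def recover(self, failure: str) -> None:
        # Unknown failures fall back to a simple requeue
        self._handlers.get(failure, self._requeue_task)()

    def _reauthenticate(self) -> None:
        ...  # re-run browser auth to mint a fresh clearance cookie

    def _rotate_proxy(self) -> None:
        ...  # request a new sticky session from MobileProxyRotator

    def _requeue_task(self) -> None:
        ...  # push the task back onto the Redis queue
```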
### Documentation
#### [UPDATE] [README.md](file:///home/kasm-user/workspace/FAEA/README.md)
- Add "Production Usage" section.
- Document how to scale and monitor.
## Verification Plan
### Automated Tests
- **Integration**: Verify the worker picks up a task from Redis (see the sketch after this list).
- **Metrics**: Verify `/metrics` endpoint is exposed and scraping.
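The integration check could be sketched as follows, assuming `pytest-asyncio`, a reachable Redis, and the worker from the sketch above running separately; the queue name and timing are illustrative:

```python
import asyncio
import json

import pytest
import redis.asyncio as redis

@pytest.mark.asyncio
async def test_worker_consumes_task():
    client = redis.Redis(host="localhost", port=6379)
    await client.lpush(
        "faea:tasks", json.dumps({"type": "extract", "url": "https://example.com"}))
    # Give the separately running worker time to drain the queue,
    # then confirm the task is gone.
    await asyncio.sleep(2)
    assert await client.llen("faea:tasks") == 0
```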
### Manual Verification
- `docker-compose up --scale camoufox=5 --scale extractor=20` to verify stability.
- Check Grafana dashboard for metric data flow.

Binary file not shown.


@@ -29,7 +29,8 @@ class CurlClient:
         logger.info("Initializing CurlClient...")
-        # impersonate argument controls TLS Client Hello
-        # 'chrome120' matches our hardcoded Camoufox build in this MVP
+        # impersonate argument controls TLS Client Hello
+        # 'chrome124' matches our hardcoded Camoufox build in this MVP
         self.session = AsyncSession(impersonate=self.session_state.tls_fingerprint)
         # 1. Inject Cookies


@@ -0,0 +1,59 @@
import pytest
import math

from src.browser.ghost_cursor import GhostCursorEngine


def test_bezier_curve_generation():
    engine = GhostCursorEngine()
    start = (0, 0)
    end = (100, 100)

    # Test control point generation
    c1, c2 = engine._generate_bezier_controls(start, end)

    # Basic bounds check: Control points should be somewhat between start and end
    # but can overshoot.
    # Just ensure they are tuples of floats
    assert isinstance(c1, tuple)
    assert len(c1) == 2
    assert isinstance(c2, tuple)
    assert len(c2) == 2


def test_bezier_point_calculation():
    engine = GhostCursorEngine()
    p0 = (0, 0)
    p1 = (10, 20)
    p2 = (80, 90)
    p3 = (100, 100)

    # t=0 should be start
    res_0 = engine._bezier_point(0, p0, p1, p2, p3)
    assert math.isclose(res_0[0], 0)
    assert math.isclose(res_0[1], 0)

    # t=1 should be end
    res_1 = engine._bezier_point(1, p0, p1, p2, p3)
    assert math.isclose(res_1[0], 100)
    assert math.isclose(res_1[1], 100)

    # t=0.5 should be somewhere in between
    res_mid = engine._bezier_point(0.5, p0, p1, p2, p3)
    assert 0 < res_mid[0] < 100
    assert 0 < res_mid[1] < 100


def test_waypoints_generation():
    engine = GhostCursorEngine()
    start = (0, 0)
    end = (300, 300)
    count = 3

    waypoints = engine._generate_waypoints(start, end, count)

    assert len(waypoints) == count + 1  # +1 for the end point
    assert waypoints[0] == start
    assert waypoints[-1] == end

    # Check intermediate points exist
    for i in range(1, count):
        assert waypoints[i] != start
        assert waypoints[i] != end


@@ -95,8 +95,17 @@ tests/unit/test_session_core.py .. [100%]
- **EntropyScheduler**: Implemented (`src/core/scheduler.py`).
- **MobileProxyRotator**: Implemented (`src/core/proxy.py`).
## 4. Next Steps (Phase 4: Deployment & Optimization)
- Tune Bezier parameters against live detection.
- Implement persistent Redis task queues.
- Scale Proxy Rotator for high concurrency.
## Phase 4: Deployment & Optimization Walkthrough (Planned)
### 1. Goals
- Scale infrastructure (5x Browser, 20x Extractor).
- Implement persistent task workers with Redis.
- Implement Monitoring (Prometheus/Grafana).
- Implement auto-recovery logic.
### 2. Next Steps
- Update `docker-compose.yml`.
- Implement `src/orchestrator/worker.py`.
- Implement `src/core/monitoring.py`.