diff --git a/implementation_plan.md b/implementation_plan.md
index 3ea6240..934f24c 100644
--- a/implementation_plan.md
+++ b/implementation_plan.md
@@ -1,29 +1,58 @@
-# Phase 3: Evasion & Resilience Implementation Plan (COMPLETED)
+# Phase 4: Deployment & Optimization Implementation Plan
 
 ## Goal Description
-Implement the "Human" behavior layer to defeat behavioral biometrics and temporal analysis.
+Transition the system from a functional prototype to a scalable, production-ready extraction grid. This involves:
+1. **Scaling**: Configuring Docker Compose for high concurrency (5 Browsers, 20 Extractors).
+2. **Resilience**: Implementing persistent task queues and auto-recovery logic.
+3. **Observability**: Integrating Prometheus metrics for monitoring health and success rates.
 
-## Completed Changes
+## User Review Required
+> [!NOTE]
+> **Monitoring**: We will add `prometheus` and `grafana` containers to `docker-compose.yml` to support the metrics collected by `src/core/monitoring.py`.
+> **Task Loops**: We will introduce a new entry point `src/orchestrator/worker.py` to act as the persistent long-running process consuming from Redis.
 
-### Browser Tier (Human Mimesis)
-- **GhostCursorEngine** (`src/browser/ghost_cursor.py`):
-  - Implemented composite cubic Bezier curves.
-  - Implemented Fitts's Law velocity profiles.
-  - Added random micro-movements for human drift simulation.
+## Proposed Changes
 
-### Core Tier (Temporal & Network Entropy)
-- **EntropyScheduler** (`src/core/scheduler.py`):
-  - Implemented Gaussian noise injection ($\sigma=5.0$).
-  - Implemented Phase shift rotation to prevent harmonic detection.
-- **MobileProxyRotator** (`src/core/proxy.py`):
-  - Implemented Sticky Session logic.
-  - Implemented Cooldown management.
+### Infrastructure
+#### [UPDATE] [docker-compose.yml](file:///home/kasm-user/workspace/FAEA/docker-compose.yml)
+- **Services**:
+  - `camoufox`: Scale to 5 replicas. Set `shm_size: 2gb`. Limit CPU/memory.
+  - `extractor`: Scale to 20 replicas. Limit resources.
+  - `prometheus`: Add service for metrics collection.
+  - `grafana`: Add service for visualization.
+  - `redis`: Optimize config.
 
-### Remediation: TLS Fingerprint Alignment
-- **Tuned** `src/browser/manager.py`: Updated to trigger `Chrome/124`.
-- **Tuned** `src/extractor/client.py`: Updated to use `chrome124` impersonation verify consistency.
-- **Verified**: Static alignment achieved. Dynamic verification (`tests/manual/verify_tls.py`) confirms logic but faced prompt-specific network blocks.
+### Core Tier (Orchestration & Monitoring)
+#### [NEW] [src/core/monitoring.py](file:///home/kasm-user/workspace/FAEA/src/core/monitoring.py)
+- **Class**: `MetricsCollector`
+- **Metrics**:
+  - `auth_attempts` (Counter)
+  - `session_duration` (Histogram)
+  - `extraction_throughput` (Counter)
 
-## Verification Status
-- **Functional**: Components implemented and unit-testable.
-- **TLS**: Aligned to Chrome 124 standard.
+#### [NEW] [src/orchestrator/worker.py](file:///home/kasm-user/workspace/FAEA/src/orchestrator/worker.py)
+- **Class**: `TaskWorker`
+- **Features**:
+  - Infinite loop consuming tasks from Redis lists (`BLPOP`).
+  - Dispatch logic: `auth` -> `CamoufoxManager`, `extract` -> `CurlClient`.
+  - Integration with `SessionRecoveryManager` for handling failures.
+
+#### [NEW] [src/core/recovery.py](file:///home/kasm-user/workspace/FAEA/src/core/recovery.py)
+- **Class**: `SessionRecoveryManager`
+- **Features**:
+  - Handle `cf_clearance_expired`, `ip_reputation_drop`, etc.
+
+### Documentation
+#### [UPDATE] [README.md](file:///home/kasm-user/workspace/FAEA/README.md)
+- Add "Production Usage" section.
+- Document how to scale and monitor.
+
+## Verification Plan
+
+### Automated Tests
+- **Integration**: Verify the worker picks up a task from Redis.
+- **Metrics**: Verify the `/metrics` endpoint is exposed and being scraped by Prometheus.
+
+### Manual Verification
+- `docker-compose up --scale camoufox=5 --scale extractor=20` to verify stability.
+- Check Grafana dashboard for metric data flow.
diff --git a/src/browser/__pycache__/ghost_cursor.cpython-310.pyc b/src/browser/__pycache__/ghost_cursor.cpython-310.pyc
new file mode 100644
index 0000000..76e7221
Binary files /dev/null and b/src/browser/__pycache__/ghost_cursor.cpython-310.pyc differ
diff --git a/src/extractor/client.py b/src/extractor/client.py
index d4cee4a..63f0121 100644
--- a/src/extractor/client.py
+++ b/src/extractor/client.py
@@ -29,7 +29,7 @@ class CurlClient:
         logger.info("Initializing CurlClient...")
 
         # impersonate argument controls TLS Client Hello
-        # 'chrome120' matches our hardcoded Camoufox build in this MVP
+        # 'chrome124' matches our hardcoded Camoufox build in this MVP
         self.session = AsyncSession(impersonate=self.session_state.tls_fingerprint)
 
         # 1. Inject Cookies
diff --git a/tests/unit/__pycache__/test_ghost_cursor.cpython-310-pytest-9.0.2.pyc b/tests/unit/__pycache__/test_ghost_cursor.cpython-310-pytest-9.0.2.pyc
new file mode 100644
index 0000000..389069f
Binary files /dev/null and b/tests/unit/__pycache__/test_ghost_cursor.cpython-310-pytest-9.0.2.pyc differ
diff --git a/tests/unit/test_ghost_cursor.py b/tests/unit/test_ghost_cursor.py
new file mode 100644
index 0000000..cae0fa7
--- /dev/null
+++ b/tests/unit/test_ghost_cursor.py
@@ -0,0 +1,59 @@
+import pytest
+import math
+from src.browser.ghost_cursor import GhostCursorEngine
+
+def test_bezier_curve_generation():
+    engine = GhostCursorEngine()
+    start = (0, 0)
+    end = (100, 100)
+
+    # Test control point generation
+    c1, c2 = engine._generate_bezier_controls(start, end)
+
+    # Basic bounds check: Control points should be somewhat between start and end
+    # but can overshoot.
+    # Just ensure they are tuples of floats
+    assert isinstance(c1, tuple)
+    assert len(c1) == 2
+    assert isinstance(c2, tuple)
+    assert len(c2) == 2
+
+def test_bezier_point_calculation():
+    engine = GhostCursorEngine()
+    p0 = (0, 0)
+    p1 = (10, 20)
+    p2 = (80, 90)
+    p3 = (100, 100)
+
+    # t=0 should be start
+    res_0 = engine._bezier_point(0, p0, p1, p2, p3)
+    assert math.isclose(res_0[0], 0)
+    assert math.isclose(res_0[1], 0)
+
+    # t=1 should be end
+    res_1 = engine._bezier_point(1, p0, p1, p2, p3)
+    assert math.isclose(res_1[0], 100)
+    assert math.isclose(res_1[1], 100)
+
+    # t=0.5 should be somewhere in between
+    res_mid = engine._bezier_point(0.5, p0, p1, p2, p3)
+    assert 0 < res_mid[0] < 100
+    assert 0 < res_mid[1] < 100
+
+def test_waypoints_generation():
+    engine = GhostCursorEngine()
+    start = (0, 0)
+    end = (300, 300)
+    count = 3
+
+    waypoints = engine._generate_waypoints(start, end, count)
+
+    assert len(waypoints) == count + 1  # +1 for the end point
+    assert waypoints[0] == start
+    assert waypoints[-1] == end
+
+    # Check intermediate points exist
+    for i in range(1, count):
+        assert waypoints[i] != start
+        assert waypoints[i] != end
+
diff --git a/walkthrough.md b/walkthrough.md
index 22b4b49..d3befc5 100644
--- a/walkthrough.md
+++ b/walkthrough.md
@@ -95,8 +95,17 @@ tests/unit/test_session_core.py .. [100%]
 - **EntropyScheduler**: Implemented (`src/core/scheduler.py`).
 - **MobileProxyRotator**: Implemented (`src/core/proxy.py`).
 
-## 4. Next Steps (Phase 4: Deployment & Optimization)
-- Tune Bezier parameters against live detection.
-- Implement persistent Redis task queues.
-- Scale Proxy Rotator for high concurrency.
+## Phase 4: Deployment & Optimization Walkthrough (Planned)
+
+### 1. Goals
+- Scale infrastructure (5x Browser, 20x Extractor).
+- Implement persistent task workers with Redis.
+- Implement Monitoring (Prometheus/Grafana).
+- Implement auto-recovery logic.
+
+### 2. Next Steps
+- Update `docker-compose.yml`.
+- Implement `src/orchestrator/worker.py`.
+- Implement `src/core/monitoring.py`.
+
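For reference, the BLPOP consume-and-dispatch loop the plan proposes for `src/orchestrator/worker.py` could look roughly like this. It is a minimal sketch, not the implementation: the `tasks` queue name and the handler wiring are illustrative assumptions, and `client` stands in for a `redis.Redis` connection (any object exposing `blpop` works).

```python
import json


class TaskWorker:
    """Persistent worker: pops tasks from a Redis list and dispatches by type."""

    QUEUE = "tasks"  # hypothetical Redis list name

    def __init__(self, client, handlers):
        self.client = client
        # handlers maps task type to a callable, e.g.
        # {"auth": camoufox_manager.run, "extract": curl_client.run}
        self.handlers = handlers

    def dispatch(self, raw):
        """Decode a JSON task payload and route it to the matching handler."""
        task = json.loads(raw)
        handler = self.handlers.get(task["type"])
        if handler is None:
            raise ValueError(f"unknown task type: {task['type']}")
        return handler(task)

    def run_forever(self):
        """Infinite consume loop; BLPOP blocks until a task arrives (timeout=0)."""
        while True:
            _queue, raw = self.client.blpop(self.QUEUE, timeout=0)
            try:
                self.dispatch(raw)
            except Exception:
                # Per the plan, failures would be classified and handled by
                # SessionRecoveryManager (e.g. cf_clearance_expired) rather
                # than silently swallowed as in this sketch.
                pass
```

In the real worker the handler map would route `auth` to `CamoufoxManager` and `extract` to `CurlClient`, with `SessionRecoveryManager` wired into the exception path.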