# Phase 1: Foundation (Headless-Plus) Walkthrough
## 1. Directory Structure Created
Scaffolded the following structure for FAEA:
```
/home/kasm-user/workspace/FAEA/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── src/
│   ├── core/
│   │   ├── session.py            # SessionState Class (Implemented)
│   │   └── handover.py           # HandoverValidator (Implemented)
│   ├── browser/
│   │   └── Dockerfile            # Camoufox Scaffolding
│   ├── extractor/
│   │   └── Dockerfile            # Curl Scaffolding
│   └── infra/
│       └── storage.py            # RedisStorage (Implemented)
└── tests/
    └── unit/
        └── test_session_core.py  # Unit Verification
```
## 2. Infrastructure Scaffolding
Created `docker-compose.yml` defining services:
- **Orchestrator**: Python controller.
- **Redis**: Shared state store.
- **Camoufox**: Browser tier.
- **Curl-Extractor**: Network tier.
## 3. Verification Results
### session.msgpack Serialization
Verified that `SessionState` correctly serializes to msgpack with HMAC signature and deserializes back.
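
The round trip described above can be sketched roughly as follows. This is a minimal illustration, not the actual `src/core/session.py`: the field names, the `to_signed_bytes`/`from_signed_bytes` method names, the SHA-256 digest, and the signature-prefix layout are all assumptions. It falls back to JSON when `msgpack` is not installed so the sketch stays runnable.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, field

try:
    import msgpack  # third-party; the wire format named in this walkthrough

    def _pack(obj):
        return msgpack.packb(obj, use_bin_type=True)

    def _unpack(raw):
        return msgpack.unpackb(raw, raw=False)
except ImportError:  # fallback so the sketch runs without msgpack installed
    def _pack(obj):
        return json.dumps(obj, sort_keys=True).encode()

    def _unpack(raw):
        return json.loads(raw.decode())

SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


@dataclass
class SessionState:
    # Hypothetical fields; the real class may carry more state.
    user_agent: str
    cookies: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)

    def to_signed_bytes(self, key: bytes) -> bytes:
        """Serialize the state and prepend an HMAC over the payload."""
        payload = _pack(asdict(self))
        signature = hmac.new(key, payload, hashlib.sha256).digest()
        return signature + payload

    @classmethod
    def from_signed_bytes(cls, blob: bytes, key: bytes) -> "SessionState":
        """Verify the HMAC before deserializing; reject tampered blobs."""
        signature, payload = blob[:SIG_LEN], blob[SIG_LEN:]
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(signature, expected):
            raise ValueError("HMAC verification failed")
        return cls(**_unpack(payload))
```

Verifying the signature before unpacking means a modified or truncated blob is rejected without ever reaching the deserializer.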
### Handover Protocol
Verified `HandoverValidator` logic for:
- User-Agent vs TLS Fingerprint consistency.
- `sec-ch-ua` header derivation from User-Agent.
**Test Output:**
```
tests/unit/test_session_core.py .. [100%]
2 passed in 0.06s
```
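
The two Handover Protocol checks might look something like the sketch below. The function names are hypothetical and not taken from `src/core/handover.py`, and the brand list in `derive_sec_ch_ua` is an assumed fixed layout: real Chrome builds vary the GREASE brand string and the ordering by version.

```python
import re
from typing import Optional

_CHROME_UA = re.compile(r"Chrome/(\d+)\.")


def chrome_major(user_agent: str) -> Optional[int]:
    """Return the Chrome major version found in a User-Agent, if any."""
    match = _CHROME_UA.search(user_agent)
    return int(match.group(1)) if match else None


def derive_sec_ch_ua(user_agent: str) -> Optional[str]:
    """Build a sec-ch-ua value for a Chrome User-Agent (assumed brand list)."""
    major = chrome_major(user_agent)
    if major is None:
        return None  # no Chrome token found, so no header is derived
    return (
        f'"Chromium";v="{major}", '
        f'"Google Chrome";v="{major}", '
        f'"Not-A.Brand";v="99"'
    )


def tls_target_matches(user_agent: str, impersonate: str) -> bool:
    """Check that a target such as 'chrome124' agrees with the UA's major version."""
    target = re.fullmatch(r"chrome(\d+)", impersonate)
    major = chrome_major(user_agent)
    return bool(target) and major is not None and int(target.group(1)) == major
```

A validator built this way would reject a handover in which the User-Agent claims one Chrome version while the extractor impersonates another.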
## Phase 2: Core Components (Headless-Plus) Walkthrough
### 1. Implementation
- **Browser Tier**: Implemented `CamoufoxManager` in `src/browser/manager.py`.
  - Features: `__aenter__`/`__aexit__` for memory safety, session state extraction.
- **Extractor Tier**: Implemented `CurlClient` in `src/extractor/client.py`.
  - Features: `chrome120` impersonation, session consumption (Cookies/Headers).
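
The `__aenter__`/`__aexit__` pattern named above can be illustrated with a small stand-in. `ManagedBrowser` and `_FakeBrowser` are hypothetical names; the sketch launches no real browser and only shows how teardown is guaranteed even when the body of the `async with` block raises.

```python
import asyncio


class _FakeBrowser:
    """Hypothetical stand-in for a real browser handle."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class ManagedBrowser:
    """Async context manager that always releases its browser."""

    async def __aenter__(self):
        self.browser = _FakeBrowser()  # a real manager would launch Camoufox here
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.browser.close()  # runs on success and on failure
        return False  # do not swallow exceptions from the block

    async def extract_session(self) -> dict:
        # Placeholder values; a real manager would read these from the page context.
        return {"user_agent": "example-agent", "cookies": {}}


async def demo() -> dict:
    async with ManagedBrowser() as manager:
        return await manager.extract_session()
```

Calling `asyncio.run(demo())` returns the placeholder session dictionary, and the browser handle is closed by the time the call returns.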
### 2. Verification Results
#### Automated E2E Test (`tests/e2e/test_handover.py`)
- **Status**: PASSED.
- **Scope**: Verified that `CurlClient` successfully consumes `SessionState` extracted from `CamoufoxManager` and matches the User-Agent against a local mock server.
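
The shape of such a handover test can be sketched with the standard library alone. Note the substitution: `urllib` stands in for `curl_cffi` here, so this checks header and cookie replay only, not TLS impersonation. `EchoHandler` and `replay_session` are hypothetical names, and the session dictionary layout is an assumption.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Local mock server that echoes back the headers it received."""

    def do_GET(self):
        body = json.dumps({
            "user_agent": self.headers.get("User-Agent"),
            "cookie": self.headers.get("Cookie"),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass


def replay_session(session: dict) -> dict:
    """Send the extracted session to a throwaway local server and return its echo."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        cookie = "; ".join(f"{k}={v}" for k, v in session["cookies"].items())
        request = urllib.request.Request(
            f"http://127.0.0.1:{server.server_port}/",
            headers={"User-Agent": session["user_agent"], "Cookie": cookie},
        )
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()
```

The test then asserts that the echoed `user_agent` equals the one extracted from the browser tier.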
#### Manual TLS Verification (`tests/manual/verify_tls.py`)
- **Status**: FAILED (Expected Risk).
- **Finding**: Detected a JA3 mismatch between Camoufox (Firefox-based) and `CurlClient` (`curl_cffi`).
  - Camoufox JA3: `9a9695ad9941a88944c373caf9333b57`
  - CurlClient JA3: `3b0d0e7fc411345ff1917b0325186e26`
- **Implication**: Header consistency is achieved, but the two tiers do not yet present the same TLS fingerprint. Closing the gap requires either tuning the `curl_cffi` impersonation target or matching the browser build more closely in Phase 3.
### 3. Next Steps
- Address TLS Mismatch (Phase 3).
- Implement persistent Redis-backed worker loops.
## Phase 3: Evasion & Resilience Walkthrough
### 1. Goals
- **GhostCursorEngine**: Implement human-like mouse trajectories using Bezier curves and Fitts's Law.
- **EntropyScheduler**: Implement jittered request scheduling with Gaussian noise and phase drift.
- **ProxyRotator**: Implement sticky session management for mobile proxies.
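
One way to read the EntropyScheduler goal is sketched below. "Phase drift" is interpreted here as a bounded random walk added to the base interval, which is an assumption about the intended design; `jittered_delays` and all of its parameters are hypothetical names.

```python
import random
from typing import Iterator, Optional


def jittered_delays(base: float, sigma: float, drift_step: float,
                    max_drift: float = 1.0, min_delay: float = 0.05,
                    rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield request delays: base interval + slow phase drift + Gaussian noise."""
    if rng is None:
        rng = random.Random()
    phase = 0.0
    while True:
        # Random-walk drift, clamped so the schedule cannot wander off indefinitely.
        phase += rng.uniform(-drift_step, drift_step)
        phase = max(-max_drift, min(max_drift, phase))
        delay = base + phase + rng.gauss(0.0, sigma)
        yield max(min_delay, delay)  # never schedule a non-positive wait
```

A caller would pull the next value with `next(...)` and sleep for that long before each request. Passing a seeded `random.Random` makes the sequence reproducible in tests.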
### 2. Verification Results
#### Remediation: TLS Fingerprint Alignment
- **Status**: PARTIAL.
- **Verification**: `tests/manual/verify_tls.py` timed out due to network blocks on the test endpoint.
- **Action Taken**: Updated `CamoufoxManager` to send a `Chrome/124` User-Agent and set the `CurlClient` TLS impersonation target to `chrome124`, so both tiers declare the same browser version. Because the verification run timed out, the JA3 match itself remains unconfirmed.
#### Human Mimesis Verification
- **Test**: `tests/unit/test_ghost_cursor.py`
- **Status**: PASSED (3/3 tests).
- **Scope**: Verified Bezier curve generation, control point logic, and waypoint interpolation.
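
The pieces under test can be sketched as below: a cubic Bezier evaluated at evenly spaced parameters, control points offset perpendicular to the straight line, and a Fitts's Law duration in its Shannon form, MT = a + b * log2(D/W + 1). The function names, the `spread` parameter, and the constants `a` and `b` are illustrative assumptions, not values from `src/browser/ghost_cursor.py`.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fitts_duration(distance: float, target_width: float,
                   a: float = 0.1, b: float = 0.15) -> float:
    """Movement time in seconds from Fitts's Law (Shannon formulation)."""
    return a + b * math.log2(distance / target_width + 1)


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)


def mouse_path(start: Point, end: Point, steps: int = 20, spread: float = 0.25,
               rng: Optional[random.Random] = None) -> List[Point]:
    """Waypoints along a curved path from start to end."""
    if rng is None:
        rng = random.Random()
    dx, dy = end[0] - start[0], end[1] - start[1]
    dist = math.hypot(dx, dy)
    # Unit vector perpendicular to the straight line between the endpoints.
    nx, ny = (-dy / dist, dx / dist) if dist else (0.0, 0.0)

    def control(fraction: float) -> Point:
        offset = rng.uniform(-spread, spread) * dist
        return (start[0] + dx * fraction + nx * offset,
                start[1] + dy * fraction + ny * offset)

    c1, c2 = control(1 / 3), control(2 / 3)
    return [cubic_bezier(start, c1, c2, end, i / steps) for i in range(steps + 1)]
```

The curve always starts and ends exactly on the requested points; only the interior waypoints vary between runs.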
#### Implementation Status
- **GhostCursorEngine**: Implemented (`src/browser/ghost_cursor.py`).
- **EntropyScheduler**: Implemented (`src/core/scheduler.py`).
- **MobileProxyRotator**: Implemented (`src/core/proxy.py`).
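
Sticky session management can be sketched as a mapping from session ID to proxy that is honored until a time-to-live expires. The class name, the TTL default, and the round-robin selection are assumptions for illustration; the actual `src/core/proxy.py` may choose proxies differently.

```python
import itertools
import time
from typing import Callable, Dict, Sequence, Tuple


class StickyProxyRotator:
    """Keep each session on one proxy until its TTL expires, then rotate."""

    def __init__(self, proxies: Sequence[str], ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._assigned: Dict[str, Tuple[str, float]] = {}  # id -> (proxy, expiry)

    def proxy_for(self, session_id: str) -> str:
        now = self._clock()
        entry = self._assigned.get(session_id)
        if entry is not None and entry[1] > now:
            return entry[0]  # still sticky
        proxy = next(self._cycle)  # expired or new session: take the next proxy
        self._assigned[session_id] = (proxy, now + self._ttl)
        return proxy
```

Keeping a session pinned to one exit IP for its lifetime avoids the mid-session address changes that a per-request rotation would cause.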
## Phase 4: Deployment & Optimization Walkthrough (Planned)
### 1. Goals
- Scale infrastructure (5x Browser, 20x Extractor).
- Implement persistent task workers with Redis.
- Implement Monitoring (Prometheus/Grafana).
- Implement auto-recovery logic.
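
As a rough sketch of how a task worker with auto-recovery might be structured, the loop below retries failed tasks up to a limit and sets the rest aside. It uses the standard-library `queue.Queue` as a stand-in for a Redis list, and `run_worker`, `max_retries`, and the `attempts` field are hypothetical. Unlike a persistent worker, it exits when the queue is empty so that it terminates when run as an example.

```python
import queue
from typing import Callable, List


def run_worker(tasks: "queue.Queue", handler: Callable[[dict], None],
               max_retries: int = 3) -> List[dict]:
    """Drain the queue, retrying failed tasks; return tasks that exhausted retries."""
    dead_letter: List[dict] = []
    while True:
        try:
            task = tasks.get(timeout=0.1)  # rough analogue of a blocking pop
        except queue.Empty:
            break  # a persistent worker would keep polling instead
        try:
            handler(task)
        except Exception:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] < max_retries:
                tasks.put(task)  # requeue for another attempt
            else:
                dead_letter.append(task)  # give up and record the failure
    return dead_letter
```

The returned dead-letter list is a natural place to attach the monitoring counters planned for this phase.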
### 2. Next Steps
- Update `docker-compose.yml`.
- Implement `src/orchestrator/worker.py`.
- Implement `src/core/monitoring.py`.