# Phase 2: Core Components (Headless-Plus) Implementation Plan

## Goal Description
Implement the core logic for the "Headless-Plus" architecture:
1. **Browser Tier**: `CamoufoxManager` handles browser instantiation, profile injection, and state extraction.
2. **Extractor Tier**: `CurlClient` consumes the shared state and executes high-speed requests with matching fingerprints.

## User Review Required
> [!IMPORTANT]
> **Mocking Strategy**: A live Cloudflare-protected target may not be readily available for automated testing, so I will implement a **Mock Target** using a local `http.server` or `FastAPI` app that logs headers/TLS info to verify fingerprints.

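A minimal sketch of the Mock Target, assuming the standard-library `http.server` route. The names `HeaderLoggingHandler` and `start_mock_target` are hypothetical and not part of the plan. A plain-HTTP server like this one only sees headers and cookies; it cannot observe the TLS handshake, so TLS details would need a different setup.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderLoggingHandler(BaseHTTPRequestHandler):
    """Records the headers of every GET request and echoes them back as JSON."""

    # Shared across requests; good enough for a single-test sketch.
    captured = []

    def do_GET(self):
        # Keep header names exactly as received so casing can be inspected.
        headers = {name: value for name, value in self.headers.items()}
        type(self).captured.append({"path": self.path, "headers": headers})
        body = json.dumps(headers).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request stderr logging.
        pass


def start_mock_target(port=0):
    """Start the mock on 127.0.0.1 in a daemon thread; port=0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HeaderLoggingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

A test can read the chosen port from `server.server_address[1]`, point both tiers at it, and then inspect `HeaderLoggingHandler.captured`.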
## Proposed Changes

### Browser Tier
#### [NEW] [src/browser/manager.py](file:///home/kasm-user/workspace/FAEA/src/browser/manager.py)
- **Class**: `CamoufoxManager`
- **Responsibilities**:
    - Launch Camoufox (via Playwright) with specific `user_agent` and `viewport`.
    - `initialize()`: Set up browser context.
    - `extract_session_state()`: Gather cookies, storage, and fingerprint info into `SessionState`.
- **Safety**: Implement `__aenter__` and `__aexit__` for aggressively reclaiming memory (close context/page).

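The responsibilities above could be sketched roughly as follows. This is not the final implementation: the fields of `SessionState` are assumptions (the plan names the type but not its shape), and `launch_browser` is a hypothetical injected coroutine that returns a Playwright-style browser object, which keeps the sketch independent of how Camoufox is actually launched. The calls on the browser, context, and page objects (`new_context`, `new_page`, `cookies`, `evaluate`, `close`) follow Playwright's async API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Optional


@dataclass
class SessionState:
    """Hypothetical shape; the plan does not specify these fields."""
    cookies: list = field(default_factory=list)
    local_storage: dict = field(default_factory=dict)
    user_agent: str = ""
    tls_fingerprint: str = ""


class CamoufoxManager:
    def __init__(self, launch_browser: Callable[[], Awaitable[Any]],
                 user_agent: Optional[str] = None,
                 viewport: Optional[dict] = None,
                 tls_fingerprint: str = ""):
        # launch_browser is an assumption of this sketch, not a Camoufox API.
        self._launch_browser = launch_browser
        self._tls_fingerprint = tls_fingerprint
        # Only forward the options that were explicitly provided.
        self._context_options = {
            key: value
            for key, value in (("user_agent", user_agent), ("viewport", viewport))
            if value is not None
        }
        self._browser = self._context = self.page = None

    async def initialize(self) -> None:
        self._browser = await self._launch_browser()
        self._context = await self._browser.new_context(**self._context_options)
        self.page = await self._context.new_page()

    async def extract_session_state(self) -> SessionState:
        cookies = await self._context.cookies()
        user_agent = await self.page.evaluate("() => navigator.userAgent")
        storage = await self.page.evaluate(
            "() => Object.fromEntries(Object.entries(localStorage))"
        )
        return SessionState(cookies=cookies, local_storage=storage,
                            user_agent=user_agent,
                            tls_fingerprint=self._tls_fingerprint)

    async def close(self) -> None:
        # Close in reverse order of creation; one failure must not block the rest.
        for resource in (self.page, self._context, self._browser):
            if resource is not None:
                try:
                    await resource.close()
                except Exception:
                    pass
        self.page = self._context = self._browser = None

    async def __aenter__(self) -> "CamoufoxManager":
        try:
            await self.initialize()
        except BaseException:
            await self.close()  # do not leak a half-initialized browser
            raise
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        await self.close()
        return False  # never swallow the caller's exception
```

Used as `async with CamoufoxManager(launch) as manager: state = await manager.extract_session_state()`, the page, context, and browser are closed even when the body raises.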
### Extractor Tier
#### [NEW] [src/extractor/client.py](file:///home/kasm-user/workspace/FAEA/src/extractor/client.py)
- **Class**: `CurlClient`
- **Responsibilities**:
    - Initialize with `SessionState`.
    - Configure `curl_cffi` session to match `SessionState.tls_fingerprint`.
    - `fetch(url)`: Execute requests using the shared state.

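One possible shape for the extractor client, under several assumptions: `SessionState` has the hypothetical fields shown here, `tls_fingerprint` holds a string that `curl_cffi` accepts as an `impersonate` target, and cookies arrive as Playwright-style dictionaries with `name` and `value` keys. The helper `build_request_kwargs` is an illustrative name, not part of the plan.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical shape; the plan does not specify these fields."""
    cookies: list = field(default_factory=list)
    user_agent: str = ""
    tls_fingerprint: str = ""


def build_request_kwargs(state: SessionState) -> dict:
    """Translate the shared state into keyword arguments for a GET request."""
    # Flattening to name -> value drops domain/path scoping, which is
    # acceptable only while the client talks to a single site.
    cookies = {cookie["name"]: cookie["value"] for cookie in state.cookies}
    headers = {"User-Agent": state.user_agent} if state.user_agent else {}
    return {"headers": headers, "cookies": cookies}


class CurlClient:
    def __init__(self, state: SessionState):
        self._state = state
        self._session = None

    def _get_session(self):
        if self._session is None:
            # Imported lazily so the pure helpers stay usable without curl_cffi.
            from curl_cffi import requests
            # Assumption: tls_fingerprint is a valid curl_cffi impersonate target.
            self._session = requests.Session(
                impersonate=self._state.tls_fingerprint
            )
        return self._session

    def fetch(self, url: str):
        """Execute a GET request carrying the browser's cookies and User-Agent."""
        return self._get_session().get(url, **build_request_kwargs(self._state))
```

Keeping the state-to-kwargs translation in a pure function makes it testable without network access or a `curl_cffi` install.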
### Testing Infrastructure
#### [NEW] [tests/e2e/test_handover.py](file:///home/kasm-user/workspace/FAEA/tests/e2e/test_handover.py)
- **Automated Verification**: The automated test will likely use a local mock for header and cookie verification.
- **Manual JA3 Verification**: A separate script, `tests/manual/verify_tls.py`, will hit an external service (e.g., `https://tls.peet.ws/api/all`) to print and compare the JA3 hashes from Camoufox and `CurlClient`. This addresses the "High Risk" feedback by acknowledging that true TLS verification depends on an external service.

## Verification Plan

### Automated Tests
1. **Mock Server Test**:
    - Start a local server that captures headers.
    - Run the E2E script.
    - Assert that the Browser and Client requests send identical (or sufficiently similar) headers.

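The "identical (or sufficiently similar)" assertion could be made concrete with a small header diff. This is a sketch under assumptions: the set of ignored headers in `DEFAULT_IGNORED` is a guess at what legitimately varies between the two tiers, and the function names are hypothetical.

```python
# Headers assumed to vary legitimately between requests (an assumption;
# adjust once real captures from both tiers are available).
DEFAULT_IGNORED = frozenset({"connection", "content-length", "date"})


def normalize_headers(headers: dict, ignore=DEFAULT_IGNORED) -> dict:
    """Lower-case header names, trim values, and drop ignored headers."""
    return {
        name.lower(): value.strip()
        for name, value in headers.items()
        if name.lower() not in ignore
    }


def diff_headers(browser_headers: dict, client_headers: dict,
                 ignore=DEFAULT_IGNORED) -> dict:
    """Return {name: (browser_value, client_value)} for every mismatch."""
    browser = normalize_headers(browser_headers, ignore)
    client = normalize_headers(client_headers, ignore)
    return {
        name: (browser.get(name), client.get(name))
        for name in sorted(set(browser) | set(client))
        if browser.get(name) != client.get(name)
    }
```

An E2E test can then assert `diff_headers(browser_request, client_request) == {}`, and a failure shows exactly which headers diverged. Note that lower-casing names hides differences in header casing, so a stricter check would be needed if casing is part of the fingerprint.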
### Manual Verification
- Run `docker-compose up` and execute a manual script inside the orchestrator container to trigger the flow.