Phase 2: Core Components (Headless-Plus) Implementation Plan

Goal Description

Implement the core logic for the "Headless-Plus" architecture:

  1. Browser Tier: CamoufoxManager to handle browser instantiation, profile injection, and state extraction.
  2. Extractor Tier: CurlClient to consume the shared session state and execute high-speed requests whose fingerprints match the browser's.

User Review Required

Important

Mocking Strategy: A live Cloudflare-protected target may not be readily available for automated testing, so I will implement a Mock Target (a local http.server or FastAPI app) that logs request headers and TLS info, so that fingerprints can be verified.
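
A minimal sketch of what such a Mock Target could look like, using only the standard library. The names HeaderLoggingHandler and start_mock_target are hypothetical, not part of the plan. This version speaks plain HTTP, so it can record headers and cookies but cannot observe the TLS handshake.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HeaderLoggingHandler(BaseHTTPRequestHandler):
    """Records the headers of each GET request and echoes them back as JSON."""

    # Shared across all requests; adequate for a single test run.
    captured = []

    def do_GET(self):
        # Normalise header names to lowercase so later comparisons are case-insensitive.
        headers = {name.lower(): value for name, value in self.headers.items()}
        self.captured.append({"path": self.path, "headers": headers})

        body = json.dumps(headers).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def start_mock_target(host="127.0.0.1", port=0):
    """Start the mock server in a daemon thread; port=0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), HeaderLoggingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A test would call start_mock_target(), read the chosen port from server.server_address[1], point both tiers at it, and inspect HeaderLoggingHandler.captured afterwards.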

Proposed Changes

Browser Tier

[NEW] src/browser/manager.py

  • Class: CamoufoxManager
  • Responsibilities:
    • Launch Camoufox (via Playwright) with a specific user_agent and viewport.
    • initialize(): Set up browser context.
    • extract_session_state(): Gather cookies, storage, and fingerprint info into SessionState.
    • Safety: Implement __aenter__ and __aexit__ so that the page and context are always closed on exit, even after an error, and their memory is reclaimed promptly.
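
The responsibilities above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: the SessionState fields other than tls_fingerprint are guesses, and launch_browser is a hypothetical async callable that returns a Playwright-style browser (in the real module it would wrap the Camoufox launch with the chosen user_agent and viewport). The calls made on the context and page (new_context, new_page, cookies, evaluate, close) follow the Playwright async API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SessionState:
    # Field names are assumptions; only tls_fingerprint is named in the plan.
    cookies: list[dict[str, Any]] = field(default_factory=list)
    local_storage: dict[str, str] = field(default_factory=dict)
    user_agent: str = ""
    tls_fingerprint: str = ""


class CamoufoxManager:
    def __init__(self, launch_browser):
        # launch_browser: hypothetical async callable returning a Playwright-style Browser.
        self._launch_browser = launch_browser
        self._browser = None
        self._context = None
        self._page = None

    async def initialize(self):
        """Set up the browser, a fresh context, and one page."""
        self._browser = await self._launch_browser()
        self._context = await self._browser.new_context()
        self._page = await self._context.new_page()

    async def extract_session_state(self):
        """Gather cookies, localStorage, and the user agent into a SessionState."""
        cookies = await self._context.cookies()
        user_agent = await self._page.evaluate("() => navigator.userAgent")
        local_storage = await self._page.evaluate(
            "() => Object.fromEntries(Object.entries(localStorage))"
        )
        # tls_fingerprint is left empty: the plan does not say how it is obtained.
        return SessionState(
            cookies=cookies, local_storage=local_storage, user_agent=user_agent
        )

    async def __aenter__(self):
        await self.initialize()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Close innermost resources first; keep going if one close() fails
        # so that a single error cannot leak the remaining resources.
        for resource in (self._page, self._context, self._browser):
            if resource is not None:
                try:
                    await resource.close()
                except Exception:
                    pass
        self._page = self._context = self._browser = None
```

Used as `async with CamoufoxManager(launch) as mgr: state = await mgr.extract_session_state()`, the cleanup runs whether or not the body raises.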

Extractor Tier

[NEW] src/extractor/client.py

  • Class: CurlClient
  • Responsibilities:
    • Initialize with SessionState.
    • Configure curl_cffi session to match SessionState.tls_fingerprint.
    • fetch(url): Execute requests using the shared state.
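
One possible shape for this client, as a sketch only. It assumes that SessionState.tls_fingerprint holds the name of a curl_cffi impersonation target (the available names depend on the installed curl_cffi version) and that cookies arrive as Playwright-style dicts with "name" and "value" keys. The SessionState stand-in and the build_session_kwargs helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    # Minimal stand-in; field names other than tls_fingerprint are assumptions.
    cookies: list = field(default_factory=list)
    user_agent: str = ""
    tls_fingerprint: str = ""  # assumed: a curl_cffi impersonation target name


def build_session_kwargs(state):
    """Translate a SessionState into keyword arguments for curl_cffi's Session."""
    return {
        "headers": {"User-Agent": state.user_agent},
        # Flattening to name -> value drops domain/path scoping; acceptable
        # for a single-site session, too coarse for multi-domain state.
        "cookies": {c["name"]: c["value"] for c in state.cookies},
        # None leaves curl_cffi's default TLS behaviour in place.
        "impersonate": state.tls_fingerprint or None,
    }


class CurlClient:
    def __init__(self, state):
        # Imported lazily so the helper above stays usable without curl_cffi installed.
        from curl_cffi import requests

        self._session = requests.Session(**build_session_kwargs(state))

    def fetch(self, url, **kwargs):
        """Issue a GET through the shared session and return the response."""
        return self._session.get(url, **kwargs)
```

Keeping the mapping in a plain function makes the header and cookie handover checkable in unit tests without any network access.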

Testing Infrastructure

[NEW] tests/e2e/test_handover.py

  • TLS Verification: The automated test will use a local mock, which covers header and cookie verification only; it cannot confirm the TLS fingerprint itself.
  • Manual JA3 Verification: A separate script, tests/manual/verify_tls.py, will hit an external service (e.g., https://tls.peet.ws/api/all) and print and compare the JA3 hashes reported for Camoufox and CurlClient. This addresses the "High Risk" feedback by acknowledging that true TLS verification depends on an external service.
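
The comparison step of the manual script might look like the sketch below. It assumes each tier has already fetched the service's JSON response and that the response nests the hash under tls.ja3_hash; that field layout is an assumption and should be checked against the live service. The function names are hypothetical.

```python
def extract_ja3_hash(payload):
    """Return the JA3 hash from a parsed response, or None if it is absent."""
    # Assumed layout: {"tls": {"ja3_hash": "..."}}
    return payload.get("tls", {}).get("ja3_hash")


def compare_ja3(browser_payload, client_payload):
    """Summarise whether the two tiers presented the same JA3 hash."""
    browser_hash = extract_ja3_hash(browser_payload)
    client_hash = extract_ja3_hash(client_payload)
    return {
        "browser_ja3_hash": browser_hash,
        "client_ja3_hash": client_hash,
        # A missing hash on either side counts as a mismatch, not a match.
        "match": browser_hash is not None and browser_hash == client_hash,
    }
```

Printing the returned dict for both tiers gives the side-by-side view the manual check calls for.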

Verification Plan

Automated Tests

  1. Mock Server Test:
    • Start a local server that captures headers.
    • Run the E2E script.
    • Assert that the Browser and Client requests carry matching headers and cookies (identical, or differing only in fields explicitly allowed to differ).
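
To make the comparison concrete, the assertion could work from an explicit list of headers that must match. The helper below is a sketch; the name diff_headers and the default header list are assumptions rather than requirements from the plan.

```python
def diff_headers(browser_headers, client_headers,
                 must_match=("user-agent", "accept-language", "cookie")):
    """Return {header: (browser_value, client_value)} for each required header that differs."""
    # Header names are case-insensitive, so normalise both sides first.
    browser = {k.lower(): v for k, v in browser_headers.items()}
    client = {k.lower(): v for k, v in client_headers.items()}
    return {
        name: (browser.get(name), client.get(name))
        for name in must_match
        if browser.get(name) != client.get(name)
    }
```

A test can then assert that diff_headers(browser, client) is empty; on failure, the returned mapping shows exactly which header diverged and how.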

Manual Verification

  • Run docker-compose up and execute a manual script inside the orchestrator container to trigger the flow.