2025-12-22 17:14:46 +08:00

FAEA Phase 1: Foundation (MVP Scope)

Document Version: 1.0
Status: DRAFT
Owner: Product Manager

1. Executive Summary

Phase 1 focuses on the critical "Headless-Plus" extraction capability. The goal is to prove that Camoufox can authenticate and that curl_cffi can reuse the resulting session to extract data from a protected target (e.g., a Cloudflare-protected dummy site) without detection.

Success Criteria:

  • 90%+ Authentication Success Rate on standard challenges.
  • 0% Fingerprint Mismatches between Browser and Extractor.
  • Sustained extraction at 1 request per second (RPS) for 20 minutes per session.
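
The three criteria above can be checked mechanically at the end of a test run. The following is a minimal sketch, not part of the agreed design: `SessionStats` and `meets_success_criteria` are hypothetical names, and it assumes the counters for a single session are already collected somewhere (for example from Prometheus metrics or logs).

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Hypothetical per-session counters; field names are illustrative."""
    auth_attempts: int
    auth_successes: int
    fingerprint_mismatches: int
    requests_sent: int
    duration_seconds: float


def meets_success_criteria(stats: SessionStats) -> bool:
    """Return True if one session satisfies all three Phase 1 criteria."""
    if stats.auth_attempts == 0 or stats.duration_seconds <= 0:
        return False  # nothing measured, so nothing proven
    auth_rate = stats.auth_successes / stats.auth_attempts
    rps = stats.requests_sent / stats.duration_seconds
    return (
        auth_rate >= 0.90                        # 90%+ authentication success
        and stats.fingerprint_mismatches == 0    # 0% fingerprint mismatches
        and stats.duration_seconds >= 20 * 60    # at least 20 minutes
        and rps >= 1.0                           # sustained 1 RPS on average
    )
```

Note that this averages RPS over the whole session; a stricter check would also verify that no individual minute falls below the target.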

2. In-Scope (Must Have)

2.1 Core "Headless-Plus" Pipeline

  • BrowserAuth: Camoufox instance capable of solving Turnstile/JS challenges.
  • Handover: Secure serialization of Cookies, LocalStorage, and User-Agent to Redis.
  • Extractor: curl_cffi client configured to exactly match the Browser's TLS/Header fingerprint.
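
As a rough illustration of the Handover step, the sketch below serializes the three pieces of state named above into a single byte string and restores them. The `SessionState` class and both function names are hypothetical. It uses the `msgpack` package when installed and falls back to JSON otherwise so the example stays runnable; encryption and the actual Redis write are deliberately left out.

```python
import json
from dataclasses import asdict, dataclass

try:
    import msgpack  # third-party package; optional for this sketch
except ImportError:
    msgpack = None


@dataclass
class SessionState:
    """Hypothetical handover payload: the state the Extractor must reuse."""
    cookies: dict[str, str]
    local_storage: dict[str, str]
    user_agent: str


def pack_state(state: SessionState) -> bytes:
    """Serialize the session state to bytes (MessagePack if available)."""
    payload = asdict(state)
    if msgpack is not None:
        return msgpack.packb(payload, use_bin_type=True)
    return json.dumps(payload).encode("utf-8")


def unpack_state(blob: bytes) -> SessionState:
    """Restore a SessionState from bytes produced by pack_state."""
    if msgpack is not None:
        payload = msgpack.unpackb(blob, raw=False)
    else:
        payload = json.loads(blob.decode("utf-8"))
    return SessionState(**payload)
```

In a real pipeline the returned bytes would be encrypted before being stored under a session key in Redis, and the Extractor would reverse both steps before building its request headers.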

2.2 Infrastructure

  • Docker Compose: Local orchestration of the Orchestrator, Redis, Camoufox, and curl_cffi Extractor containers.
  • SessionStore: Redis-backed, encrypted state storage.

2.3 Evasion Basics

  • GhostCursor: Non-linear, Bezier-curve mouse movements.
  • EntropyScheduler: Gaussian-distributed delays (no fixed sleep times).
  • MobileProxy: Basic integration for residential/mobile IP rotation.
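
To make the EntropyScheduler idea concrete, here is a minimal sketch of Gaussian-distributed delays using only the standard library. The function name and the default mean, standard deviation, and floor are assumptions chosen for illustration (a 1-second mean roughly matches the 1 RPS target); they are not values fixed by this document.

```python
import random


def next_delay(mean_s: float = 1.0, stddev_s: float = 0.25,
               floor_s: float = 0.2, rng=None) -> float:
    """Draw one delay in seconds from a Gaussian, clamped to a minimum.

    The floor prevents zero or negative sleeps when the sample lands
    far in the left tail of the distribution.
    """
    rng = rng or random  # allow a seeded random.Random for reproducible tests
    return max(floor_s, rng.gauss(mean_s, stddev_s))
```

A caller would pass the result to `time.sleep()` (or `asyncio.sleep()`) between requests instead of using a fixed interval.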

3. Out-of-Scope (Deferred to Phase 2)

  • Distributed/Multi-node Swarm orchestration.
  • Computer Vision/AI-based CAPTCHA solving (use standard click-to-solve).
  • Machine Learning-based behavior generation (use algorithmic heuristics).
  • Complex Dashboard/Reporting UI (use Prometheus metrics + logs).

4. Technical Constraints (DevOps)

  • Language: Python 3.11+
  • Protocol: HTTP/2 only (for fingerprint consistency).
  • State: MessagePack (msgpack) serialization for compactness.

5. Tech Lead Review

Reviewer: @skills/tech-lead
Status: APPROVED

Comments:

  • "Handover Protocol" via Redis/MessagePack is feasible and aligns with TDD Section 3.4.
  • curl_cffi supports the required impersonate kwarg for TLS consistency.
  • Constraint: Ensure browser_pool reclaims memory aggressively; standard Camoufox instances are RAM-heavy (2GB+).

6. Engineering Director Sign-off

Reviewer: @skills/engineering-director
Status: APPROVED (GO)

Comments:

  • MVP Scope strikes the right balance between Evasion (Headless-Plus) and Safety (Managed Infrastructure).
  • Risk: Rate limits on residential proxies. Monitoring for 429 Too Many Requests is critical for early detection of burned IPs.
  • Decision: Phase 1: Foundation is OPEN. Proceed to assignment.
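
The 429 monitoring risk raised above could be tracked with a small sliding-window counter per proxy. This is only a sketch under stated assumptions: the `RateLimitMonitor` class, the window size, and the threshold are hypothetical, and a production version would more likely export these counts as Prometheus metrics.

```python
from collections import deque


class RateLimitMonitor:
    """Track the share of HTTP 429 responses per proxy over recent requests."""

    def __init__(self, window: int = 50, threshold: float = 0.2):
        self.window = window        # number of recent responses to remember
        self.threshold = threshold  # 429 ratio at which a proxy is flagged
        self._recent: dict[str, deque] = {}

    def record(self, proxy_id: str, status_code: int) -> None:
        """Store whether the latest response from this proxy was a 429."""
        hits = self._recent.setdefault(proxy_id, deque(maxlen=self.window))
        hits.append(status_code == 429)

    def is_burned(self, proxy_id: str) -> bool:
        """Return True if the recent 429 ratio meets or exceeds the threshold."""
        hits = self._recent.get(proxy_id)
        if not hits:
            return False
        return sum(hits) / len(hits) >= self.threshold
```

A flagged proxy would then be rotated out or rested, which is the early-detection behavior the sign-off comment asks for.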