# FAEA Phase 1: Foundation (MVP Scope)

**Document Version:** 1.0
**Status:** DRAFT
**Owner:** Product Manager

## 1. Executive Summary

Phase 1 focuses on the critical "Headless-Plus" extraction capability. The goal is to prove that **Camoufox** can authenticate and that **curl_cffi** can reuse that session to extract data from a protected target (e.g., a Cloudflare-protected dummy site) without detection.

**Success Criteria:**
- [ ] 90%+ Authentication Success Rate on standard challenges.
- [ ] 0% Fingerprint Mismatches between Browser and Extractor.
- [ ] Sustained 1 RPS extraction for 20 minutes/session.

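
The three criteria can be checked mechanically from per-session counters. The following is only a sketch under assumptions: `SessionStats` and its field names are hypothetical and not part of this document's design, while the thresholds are copied from the list above. At 1 RPS for 20 minutes, a session has to complete at least 1 × 60 × 20 = 1,200 requests.

```python
from dataclasses import dataclass

# Thresholds taken from the Success Criteria list.
MIN_AUTH_SUCCESS_RATE = 0.90   # 90%+ authentication success
TARGET_RPS = 1.0               # sustained requests per second
SESSION_SECONDS = 20 * 60      # 20 minutes per session


@dataclass
class SessionStats:
    """Hypothetical per-run counters; the field names are illustrative only."""
    auth_attempts: int
    auth_successes: int
    fingerprint_mismatches: int
    requests_completed: int
    session_seconds: float


def meets_phase1_criteria(stats: SessionStats) -> bool:
    """Return True only if all three Phase 1 criteria hold for this run."""
    if stats.auth_attempts == 0 or stats.session_seconds <= 0:
        return False
    auth_rate = stats.auth_successes / stats.auth_attempts
    sustained_rps = stats.requests_completed / stats.session_seconds
    return (
        auth_rate >= MIN_AUTH_SUCCESS_RATE
        and stats.fingerprint_mismatches == 0
        and stats.session_seconds >= SESSION_SECONDS
        and sustained_rps >= TARGET_RPS
    )
```

A run with 92 of 100 successful authentications, no mismatches, and 1,200 requests over 1,200 seconds would pass; dropping to 1,100 requests over the same period would not.
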
## 2. In-Scope (Must Have)

### 2.1 Core "Headless-Plus" Pipeline
- **BrowserAuth:** Camoufox instance capable of solving Turnstile/JS challenges.
- **Handover:** Secure serialization of Cookies, LocalStorage, and User-Agent to Redis.
- **Extractor:** `curl_cffi` client configured to *exactly* match the Browser's TLS/Header fingerprint.

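
The Handover payload can be pictured as a small record that round-trips through a byte serializer. This is a minimal sketch, not the project's actual schema: `SessionState` and its field names are hypothetical. It uses the third-party `msgpack` package when installed (the format named under Technical Constraints) and falls back to JSON purely so the example runs without it. Encryption and the Redis write are deliberately left out.

```python
import json
from dataclasses import asdict, dataclass, field

try:  # MessagePack is a third-party package; use it when available.
    import msgpack

    def _pack(obj: dict) -> bytes:
        return msgpack.packb(obj, use_bin_type=True)

    def _unpack(blob: bytes) -> dict:
        return msgpack.unpackb(blob, raw=False)

except ImportError:  # JSON stand-in so the sketch still runs without msgpack.

    def _pack(obj: dict) -> bytes:
        return json.dumps(obj).encode("utf-8")

    def _unpack(blob: bytes) -> dict:
        return json.loads(blob.decode("utf-8"))


@dataclass
class SessionState:
    """Hypothetical handover payload holding the three items listed above."""
    user_agent: str
    cookies: dict[str, str] = field(default_factory=dict)
    local_storage: dict[str, str] = field(default_factory=dict)

    def to_bytes(self) -> bytes:
        """Serialize the state to bytes suitable for storing under one key."""
        return _pack(asdict(self))

    @classmethod
    def from_bytes(cls, blob: bytes) -> "SessionState":
        """Rebuild the state from bytes produced by to_bytes()."""
        return cls(**_unpack(blob))
```

Keeping the User-Agent in the same record as the cookies means the Extractor cannot accidentally pair a session with headers from a different browser instance.
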
### 2.2 Infrastructure
- **Docker Compose:** Local orchestration of Orchestrator, Redis, Camoufox, and Curl containers.
- **SessionStore:** Redis-backed, encrypted state storage.

### 2.3 Evasion Basics
- **GhostCursor:** Non-linear, Bezier-curve mouse movements.
- **EntropyScheduler:** Gaussian-distributed delays (no fixed sleep times).
- **MobileProxy:** Basic integration for residential/mobile IP rotation.

## 3. Out-of-Scope (Deferred to Phase 2)
- ❌ Distributed/Multi-node Swarm orchestration.
- ❌ Computer Vision/AI-based CAPTCHA solving (use standard click-to-solve).
- ❌ Machine Learning-based behavior generation (use algorithmic heuristics).
- ❌ Complex Dashboard/Reporting UI (use Prometheus metrics + logs).

## 4. Technical Constraints (DevOps)
- **Language:** Python 3.11+
- **Protocol:** HTTP/2 only (for fingerprint consistency).
- **State:** MessagePack serialization for compactness.

---

## 5. Tech Lead Review
**Reviewer:** @skills/tech-lead
**Status:** APPROVED

**Comments:**
- "Handover Protocol" via Redis/MessagePack is feasible and aligns with TDD Section 3.4.
- `curl_cffi` supports the required `impersonate` kwarg for TLS consistency.
- **Constraint:** Ensure `browser_pool` reclaims memory aggressively; standard Camoufox instances are RAM-heavy (2 GB+).

---

## 6. Engineering Director Sign-off
**Reviewer:** @skills/engineering-director
**Status:** APPROVED (GO)

**Comments:**
- MVP Scope strikes the right balance between Evasion (Headless-Plus) and Safety (Managed Infrastructure).
- **Risk:** Rate limits on residential proxies. Monitoring for `429 Too Many Requests` is critical for early detection of burned IPs.
- **Decision:** Phase 1: Foundation is **OPEN**. Proceed to assignment.
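
The 429 monitoring called out under **Risk** could be approached with a sliding window per proxy. This is a sketch under assumptions: `ProxyHealthMonitor`, the window size, and the 20% threshold are hypothetical and would need tuning against real traffic, and exporting the ratio to Prometheus is not shown.

```python
from collections import defaultdict, deque


class ProxyHealthMonitor:
    """Hypothetical sliding-window tracker of HTTP 429 responses per proxy."""

    def __init__(self, window: int = 50, max_429_ratio: float = 0.2) -> None:
        self.window = window
        self.max_429_ratio = max_429_ratio
        # One bounded history per proxy; old entries fall off automatically.
        self._history: dict[str, deque] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def record(self, proxy_id: str, status_code: int) -> None:
        """Store whether the latest response from this proxy was a 429."""
        self._history[proxy_id].append(status_code == 429)

    def ratio_429(self, proxy_id: str) -> float:
        """Fraction of 429 responses among the last `window` requests."""
        history = self._history.get(proxy_id)
        if not history:
            return 0.0
        return sum(history) / len(history)

    def is_burned(self, proxy_id: str) -> bool:
        """True when the 429 ratio exceeds the configured threshold."""
        return self.ratio_429(proxy_id) > self.max_429_ratio
```

A bounded window is used instead of a lifetime counter so that a proxy that was healthy for hours is still flagged quickly once it starts returning 429s.
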