343 lines
15 KiB
Text
343 lines
15 KiB
Text
Metadata-Version: 2.4
|
||
Name: curl_cffi
|
||
Version: 0.14.0
|
||
Summary: libcurl ffi bindings for Python, with impersonation support.
|
||
Author-email: lexiforest <infinitesheldon@gmail.com>
|
||
License: MIT License
|
||
Project-URL: repository, https://github.com/lexiforest/curl_cffi
|
||
Classifier: Development Status :: 4 - Beta
|
||
Classifier: Intended Audience :: Developers
|
||
Classifier: Programming Language :: Python :: 3
|
||
Classifier: Programming Language :: Python :: 3.10
|
||
Classifier: Programming Language :: Python :: 3.11
|
||
Classifier: Programming Language :: Python :: 3.12
|
||
Classifier: Programming Language :: Python :: 3.13
|
||
Classifier: Programming Language :: Python :: 3.14
|
||
Requires-Python: >=3.10
|
||
Description-Content-Type: text/markdown
|
||
License-File: LICENSE
|
||
Requires-Dist: cffi>=1.12.0
|
||
Requires-Dist: certifi>=2024.2.2
|
||
Provides-Extra: extra
|
||
Requires-Dist: readability-lxml>=0.8.1; extra == "extra"
|
||
Requires-Dist: markdownify>=1.1.0; extra == "extra"
|
||
Requires-Dist: lxml_html_clean; extra == "extra"
|
||
Provides-Extra: dev
|
||
Requires-Dist: charset_normalizer<4.0,>=3.3.2; extra == "dev"
|
||
Requires-Dist: coverage<7.0,>=6.4.1; extra == "dev"
|
||
Requires-Dist: cryptography<43.0,>=42.0.5; extra == "dev"
|
||
Requires-Dist: httpx==0.23.1; extra == "dev"
|
||
Requires-Dist: mypy<2.0,>=1.9.0; extra == "dev"
|
||
Requires-Dist: pytest<9.0,>=8.1.1; extra == "dev"
|
||
Requires-Dist: pytest-asyncio<1.0,>=0.23.6; extra == "dev"
|
||
Requires-Dist: pytest-trio<1.0,>=0.8.0; extra == "dev"
|
||
Requires-Dist: ruff<1.0,>=0.3.5; extra == "dev"
|
||
Requires-Dist: trio<1.0,>=0.25.0; extra == "dev"
|
||
Requires-Dist: trustme<2.0,>=1.1.0; extra == "dev"
|
||
Requires-Dist: uvicorn<1.0,>=0.29.0; extra == "dev"
|
||
Requires-Dist: websockets>=14.0; extra == "dev"
|
||
Requires-Dist: typing_extensions; extra == "dev"
|
||
Provides-Extra: build
|
||
Requires-Dist: cibuildwheel; extra == "build"
|
||
Requires-Dist: wheel; extra == "build"
|
||
Provides-Extra: test
|
||
Requires-Dist: charset_normalizer<4.0,>=3.3.2; extra == "test"
|
||
Requires-Dist: cryptography<43.0,>=42.0.5; extra == "test"
|
||
Requires-Dist: fastapi<1.0,>=0.110.0; extra == "test"
|
||
Requires-Dist: httpx==0.23.1; extra == "test"
|
||
Requires-Dist: proxy.py<3.0,>=2.4.3; extra == "test"
|
||
Requires-Dist: pytest<9.0,>=8.1.1; extra == "test"
|
||
Requires-Dist: pytest-asyncio<1.0,>=0.23.6; extra == "test"
|
||
Requires-Dist: pytest-trio<1.0,>=0.8.0; extra == "test"
|
||
Requires-Dist: python-multipart<1.0,>=0.0.9; extra == "test"
|
||
Requires-Dist: trio<1.0,>=0.25.0; extra == "test"
|
||
Requires-Dist: trustme<2.0,>=1.1.0; extra == "test"
|
||
Requires-Dist: uvicorn<1.0,>=0.29.0; extra == "test"
|
||
Requires-Dist: websockets>=14.0; extra == "test"
|
||
Requires-Dist: typing_extensions; extra == "test"
|
||
Dynamic: license-file
|
||
|
||
# curl_cffi
|
||
|
||
[](https://pepy.tech/projects/curl-cffi)
|
||

|
||
[](https://badge.fury.io/py/curl-cffi)
|
||
[](https://t.me/real_curl_cffi)
|
||
[](https://discord.gg/kJqMHHgdn2)
|
||
|
||
[Documentation](https://curl-cffi.readthedocs.io)
|
||
|
||
Python binding for [curl-impersonate fork](https://github.com/lexiforest/curl-impersonate)
|
||
via [cffi](https://cffi.readthedocs.io/en/latest/). For commercial support, visit [impersonate.pro](https://impersonate.pro).
|
||
|
||
`curl_cffi` is the most popular Python binding for `curl`. Unlike other pure
|
||
python http clients like `httpx` or `requests`, `curl_cffi` can impersonate
|
||
browsers' TLS/JA3 and HTTP/2 fingerprints. If you are blocked by some
|
||
website for no obvious reason, you can give `curl_cffi` a try.
|
||
|
||
Python 3.9 is the minimum supported version since v0.10.
|
||
|
||
## Recall.ai - API for meeting recordings
|
||
|
||
<a href="https://www.recall.ai/?utm_source=github&utm_medium=sponsorship&utm_campaign=lexiforest-curl_cffi" target="_blank"><img src="https://cdn.prod.website-files.com/620d732b1f1f7b244ac89f0e/66b294e51ee15f18dd2b171e_recall-logo.svg" alt="Recall.ai" height="47" width="149"></a>
|
||
|
||
If you’re looking for a meeting recording API, consider checking out [Recall.ai](https://www.recall.ai/?utm_source=github&utm_medium=sponsorship&utm_campaign=lexiforest-curl_cffi), an API that records Zoom, Google Meet, Microsoft Teams, in-person meetings, and more.
|
||
|
||
## Residential Proxies
|
||
|
||
<a href="https://www.thordata.com/?ls=github&lk=curl_cffi" target="_blank"><img src="https://raw.githubusercontent.com/lexiforest/curl_cffi/main/assets/thordata.png" alt="Thordata" height="126" width="240"></a>
|
||
|
||
Thordata: A reliable and cost-effective proxy service provider. One-click collection of public network data, providing enterprises and developers with stable, efficient, and compliant global proxy IP services. Register for a free trial of [residential proxies](https://www.thordata.com/products/residential-proxies/?ls=github&lk=curl_cffi) and receive 2000 free SERP API calls.
|
||
|
||
## Sponsors
|
||
|
||
Maintenance of this project is made possible by all the <a href="https://github.com/lexiforest/curl_cffi/graphs/contributors">contributors</a> and <a href="https://github.com/sponsors/lexiforest">sponsors</a>. If you'd like to sponsor this project and have your avatar or company logo appear below <a href="https://github.com/sponsors/lexiforest">click here</a>. 💖
|
||
|
||
------
|
||
|
||
### Bypass Cloudflare with API
|
||
|
||
<a href="https://yescaptcha.com/i/stfnIO" target="_blank"><img src="https://raw.githubusercontent.com/lexiforest/curl_cffi/main/assets/yescaptcha.png" alt="Yes Captcha!" height="47" width="149"></a>
|
||
|
||
Yescaptcha is a proxy service that bypasses Cloudflare and uses the API interface to
|
||
obtain verified cookies (e.g. `cf_clearance`). Click [here](https://yescaptcha.com/i/stfnIO)
|
||
to register: https://yescaptcha.com/i/stfnIO
|
||
|
||
------
|
||
|
||
<a href="https://hypersolutions.co/?utm_source=github&utm_medium=readme&utm_campaign=curl_cffi" target="_blank"><img src="https://raw.githubusercontent.com/lexiforest/curl_cffi/main/assets/hypersolutions.png" height="47" width="149"></a>
|
||
|
||
TLS fingerprinting alone isn't enough for modern bot protection. [Hyper Solutions](https://hypersolutions.co?utm_source=github&utm_medium=readme&utm_campaign=curl_cffi) provides the missing piece - API endpoints that generate valid antibot tokens for:
|
||
|
||
Akamai • DataDome • Kasada • Incapsula
|
||
|
||
No browser automation. Just simple API calls that return the exact cookies and headers these systems require.
|
||
|
||
🚀 [Get Your API Key](https://hypersolutions.co?utm_source=github&utm_medium=readme&utm_campaign=curl_cffi) | 📖 [Docs](https://docs.justhyped.dev) | 💬 [Discord](https://discord.gg/akamai)
|
||
|
||
------
|
||
|
||
## Features
|
||
|
||
- Supports JA3/TLS and http2 fingerprints impersonation, including recent browsers and custom fingerprints.
|
||
- Much faster than requests/httpx, on par with aiohttp/pycurl, see [benchmarks](https://github.com/lexiforest/curl_cffi/tree/main/benchmark).
|
||
- Mimics the requests API, no need to learn another one.
|
||
- Pre-compiled, so you don't have to compile on your machine.
|
||
- Supports `asyncio` with proxy rotation on each request.
|
||
- Supports http 2.0 & 3.0, which requests does not.
|
||
- Supports websocket.
|
||
- MIT licensed.
|
||
|
||
||requests|aiohttp|httpx|pycurl|curl_cffi|
|
||
|---|---|---|---|---|---|
|
||
|http/2|❌|❌|✅|✅|✅|
|
||
|http/3|❌|❌|❌|☑️<sup>1</sup>|✅<sup>2</sup>|
|
||
|sync|✅|❌|✅|✅|✅|
|
||
|async|❌|✅|✅|❌|✅|
|
||
|websocket|❌|✅|❌|❌|✅|
|
||
|fingerprints|❌|❌|❌|❌|✅|
|
||
|speed|🐇|🐇🐇|🐇|🐇🐇|🐇🐇|
|
||
|
||
Notes:
|
||
1. For pycurl, you need an http/3 enabled libcurl to make it work, while curl_cffi packages libcurl-impersonate inside Python wheels.
|
||
2. Since v0.11.4.
|
||
|
||
## Install
|
||
|
||
pip install curl_cffi --upgrade
|
||
|
||
This should work on Linux, macOS and Windows out of the box.
|
||
If it does not work on you platform, you may need to compile and install `curl-impersonate`
|
||
first and set some environment variables like `LD_LIBRARY_PATH`.
|
||
|
||
To install beta releases:
|
||
|
||
pip install curl_cffi --upgrade --pre
|
||
|
||
To install unstable version from GitHub:
|
||
|
||
git clone https://github.com/lexiforest/curl_cffi/
|
||
cd curl_cffi
|
||
make preprocess
|
||
pip install .
|
||
|
||
On macOS, you may need to install the following dependencies:
|
||
|
||
brew install zstd nghttp2
|
||
|
||
## Usage
|
||
|
||
`curl_cffi` comes with a low-level `curl` API and a high-level `requests`-like API.
|
||
|
||
### requests-like
|
||
|
||
|
||
```python
|
||
import curl_cffi
|
||
|
||
# Notice the impersonate parameter
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", impersonate="chrome")
|
||
|
||
print(r.json())
|
||
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
|
||
# the js3n fingerprint should be the same as target browser
|
||
|
||
# To keep using the latest browser version as `curl_cffi` updates,
|
||
# simply set impersonate="chrome" without specifying a version.
|
||
# Other similar values are: "safari" and "safari_ios"
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", impersonate="chrome")
|
||
|
||
# Randomly choose a browser version based on current market share in real world
|
||
# from: https://caniuse.com/usage-table
|
||
# NOTE: this is a pro feature.
|
||
r = curl_cffi.get("https://example.com", impersonate="realworld")
|
||
|
||
# To pin a specific version, use version numbers together.
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", impersonate="chrome124")
|
||
|
||
# To impersonate other than browsers, bring your own ja3/akamai strings
|
||
# See examples directory for details.
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", ja3=..., akamai=...)
|
||
|
||
# http/socks proxies are supported
|
||
proxies = {"https": "http://localhost:3128"}
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", impersonate="chrome", proxies=proxies)
|
||
|
||
proxies = {"https": "socks://localhost:3128"}
|
||
r = curl_cffi.get("https://tls.browserleaks.com/json", impersonate="chrome", proxies=proxies)
|
||
```
|
||
|
||
### Sessions
|
||
|
||
```python
|
||
s = curl_cffi.Session()
|
||
|
||
# httpbin is a http test website, this endpoint makes the server set cookies
|
||
s.get("https://httpbin.org/cookies/set/foo/bar")
|
||
print(s.cookies)
|
||
# <Cookies[<Cookie foo=bar for httpbin.org />]>
|
||
|
||
# retrieve cookies again to verify
|
||
r = s.get("https://httpbin.org/cookies")
|
||
print(r.json())
|
||
# {'cookies': {'foo': 'bar'}}
|
||
```
|
||
|
||
### Supported impersonate browsers
|
||
|
||
`curl_cffi` supports the same browser versions as supported by my [fork](https://github.com/lexiforest/curl-impersonate) of [curl-impersonate](https://github.com/lwthiker/curl-impersonate):
|
||
|
||
Open source version of curl_cffi includes versions whose fingerprints differ from previous versions.
|
||
If you see a version, e.g. `chrome135`, were skipped, you can simply impersonate it with your own headers and the previous version.
|
||
|
||
If you don't want to look up the headers etc, by yourself, consider buying commercial support from [impersonate.pro](https://impersonate.pro),
|
||
we have comprehensive browser fingerprints database for almost all the browser versions on various platforms.
|
||
|
||
If you are trying to impersonate a target other than a browser, use `ja3=...` and `akamai=...`
|
||
to specify your own customized fingerprints. See the [docs on impersonation](https://curl-cffi.readthedocs.io/en/latest/impersonate/_index.html) for details.
|
||
|
||
|Browser|Open Source| Pro version|
|
||
|---|---|---|
|
||
|Chrome|chrome99, chrome100, chrome101, chrome104, chrome107, chrome110, chrome116<sup>[1]</sup>, chrome119<sup>[1]</sup>, chrome120<sup>[1]</sup>, chrome123<sup>[3]</sup>, chrome124<sup>[3]</sup>, chrome131<sup>[4]</sup>, chrome133a<sup>[5][6]</sup>, chrome136<sup>[6]</sup>|chrome132, chrome134, chrome135|
|
||
|Chrome Android| chrome99_android, chrome131_android <sup>[4]</sup>|chrome132_android, chrome133_android, chrome134_android, chrome135_android|
|
||
|Chrome iOS|N/A|coming soon|
|
||
|Safari <sup>[7]</sup>|safari153 <sup>[2]</sup>, safari155 <sup>[2]</sup>, safari170 <sup>[1]</sup>, safari180 <sup>[4]</sup>, safari184 <sup>[6]</sup>, safari260 <sup>[8]</sup>|coming soon|
|
||
|Safari iOS <sup>[7]</sup>| safari172_ios<sup>[1]</sup>, safari180_ios<sup>[4]</sup>, safari184_ios <sup>[6]</sup>, safari260_ios <sup>[8]</sup>|coming soon|
|
||
|Firefox|firefox133<sup>[5]</sup>, firefox135<sup>[7]</sup>|coming soon|
|
||
|Firefox Android|N/A|firefox135_android|
|
||
|Tor|tor145 <sup>[7]</sup>|coming soon|
|
||
|Edge|edge99, edge101|edge133, edge135|
|
||
|Opera|N/A|coming soon|
|
||
|Brave|N/A|coming soon|
|
||
|
||
|
||
Notes:
|
||
1. Added in version `0.6.0`.
|
||
2. Fixed in version `0.6.0`, previous http2 fingerprints were [not correct](https://github.com/lwthiker/curl-impersonate/issues/215).
|
||
3. Added in version `0.7.0`.
|
||
4. Added in version `0.8.0`.
|
||
5. Added in version `0.9.0`.
|
||
6. The version postfix `-a`(e.g. `chrome133a`) means that this is an alternative version, i.e. the fingerprint has not been officially updated by browser, but has been observed because of A/B testing.
|
||
5. Added in version `0.10.0`.
|
||
6. Added in version `0.11.0`.
|
||
7. Since `0.11.0`, the format `safari184_ios` is preferred over `safari18_4_ios`, both are supported, but the latter is quite confusing and hard to parse.
|
||
8. Added in `0.12.0`.
|
||
|
||
### Asyncio
|
||
|
||
```python
|
||
from curl_cffi import AsyncSession
|
||
|
||
async with AsyncSession() as s:
|
||
r = await s.get("https://example.com")
|
||
```
|
||
|
||
More concurrency:
|
||
|
||
```python
|
||
import asyncio
|
||
from curl_cffi import AsyncSession
|
||
|
||
urls = [
|
||
"https://google.com/",
|
||
"https://facebook.com/",
|
||
"https://twitter.com/",
|
||
]
|
||
|
||
async with AsyncSession() as s:
|
||
tasks = []
|
||
for url in urls:
|
||
task = s.get(url)
|
||
tasks.append(task)
|
||
results = await asyncio.gather(*tasks)
|
||
```
|
||
|
||
For low-level APIs, Scrapy integration and other advanced topics, see the
|
||
[docs](https://curl-cffi.readthedocs.io) for more details.
|
||
|
||
|
||
### WebSockets
|
||
|
||
```python
|
||
from curl_cffi import WebSocket
|
||
|
||
def on_message(ws: WebSocket, message: str | bytes):
|
||
print(message)
|
||
|
||
ws = WebSocket(on_message=on_message)
|
||
ws.run_forever("wss://api.gemini.com/v1/marketdata/BTCUSD")
|
||
```
|
||
|
||
### Asyncio WebSockets
|
||
|
||
```python
|
||
import asyncio
|
||
from curl_cffi import AsyncSession
|
||
|
||
async with AsyncSession() as s:
|
||
ws = await s.ws_connect("wss://echo.websocket.org")
|
||
await asyncio.gather(*[ws.send_str("Hello, World!") for _ in range(10)])
|
||
await ws.flush()
|
||
async for message in ws:
|
||
print(message)
|
||
await ws.close()
|
||
```
|
||
|
||
## Ecosystem
|
||
|
||
- Integrating with Scrapy: [divtiply/scrapy-curl-cffi](https://github.com/divtiply/scrapy-curl-cffi), [jxlil/scrapy-impersonate](https://github.com/jxlil/scrapy-impersonate) and [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint).
|
||
- Integrating with [requests](https://github.com/el1s7/curl-adapter), [httpx](https://github.com/vgavro/httpx-curl-cffi) as adapter.
|
||
- Integrating with captcha resolvers: [YesCaptcha](https://yescaptcha.atlassian.net/wiki/spaces/YESCAPTCHA/overview). Please see the head area for promo code and link.
|
||
|
||
## Acknowledgement
|
||
|
||
- Originally forked from [multippt/python_curl_cffi](https://github.com/multippt/python_curl_cffi), which is under the MIT license.
|
||
- Headers/Cookies files are copied from [httpx](https://github.com/encode/httpx/blob/master/httpx/_models.py), which is under the BSD license.
|
||
- Asyncio support is inspired by Tornado's curl http client.
|
||
- The synchronous WebSocket API is inspired by [websocket_client](https://github.com/websocket-client/websocket-client).
|
||
- The asynchronous WebSocket API is inspired by [aiohttp](https://github.com/aio-libs/aiohttp).
|
||
|
||
## Contributing
|
||
|
||
When submitting an PR, please use a different branch other than `main` and check the
|
||
"Allow edits by maintainers" box, so I can update your PR with lint or style fixes. Thanks!
|