Metadata-Version: 2.1
Name: curl-cffi
Version: 0.5.10
Summary: libcurl ffi bindings for Python, with impersonation support
Author-email: Yifei Kong <kong@yifei.me>
License: MIT License
        
        Copyright (c) 2018 multippt
        Copyright (c) 2022 Yifei Kong
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
Project-URL: repository, https://github.com/yifeikong/curl_cffi
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: cffi >=1.12.0
Provides-Extra: build
Requires-Dist: cibuildwheel ; extra == 'build'
Requires-Dist: wheel ; extra == 'build'
Provides-Extra: dev
Requires-Dist: autoflake ==1.4 ; extra == 'dev'
Requires-Dist: black ==22.8.0 ; extra == 'dev'
Requires-Dist: coverage ==6.4.1 ; extra == 'dev'
Requires-Dist: cryptography ==38.0.3 ; extra == 'dev'
Requires-Dist: flake8 ==6.0.0 ; extra == 'dev'
Requires-Dist: flake8-bugbear ==22.7.1 ; extra == 'dev'
Requires-Dist: flake8-pie ==0.15.0 ; extra == 'dev'
Requires-Dist: httpx ==0.23.1 ; extra == 'dev'
Requires-Dist: isort ==5.10.1 ; extra == 'dev'
Requires-Dist: mypy ==0.971 ; extra == 'dev'
Requires-Dist: types-certifi ==2021.10.8.2 ; extra == 'dev'
Requires-Dist: pytest ==7.1.2 ; extra == 'dev'
Requires-Dist: pytest-asyncio ==0.19.0 ; extra == 'dev'
Requires-Dist: pytest-trio ==0.7.0 ; extra == 'dev'
Requires-Dist: trio ==0.21.0 ; extra == 'dev'
Requires-Dist: trio-typing ==0.7.0 ; extra == 'dev'
Requires-Dist: trustme ==0.9.0 ; extra == 'dev'
Requires-Dist: uvicorn ==0.18.3 ; extra == 'dev'
Provides-Extra: test
Requires-Dist: cryptography ==38.0.3 ; extra == 'test'
Requires-Dist: httpx ==0.23.1 ; extra == 'test'
Requires-Dist: types-certifi ==2021.10.8.2 ; extra == 'test'
Requires-Dist: pytest ==7.1.2 ; extra == 'test'
Requires-Dist: pytest-asyncio ==0.19.0 ; extra == 'test'
Requires-Dist: pytest-trio ==0.7.0 ; extra == 'test'
Requires-Dist: trio ==0.21.0 ; extra == 'test'
Requires-Dist: trio-typing ==0.7.0 ; extra == 'test'
Requires-Dist: trustme ==0.9.0 ; extra == 'test'
Requires-Dist: uvicorn ==0.18.3 ; extra == 'test'

# curl_cffi

Python binding for [curl-impersonate](https://github.com/lwthiker/curl-impersonate)
via [cffi](https://cffi.readthedocs.io/en/latest/).

[Documentation](https://curl-cffi.readthedocs.io) | [中文 README](https://github.com/yifeikong/curl_cffi/blob/master/README-zh.md)

Unlike pure-Python HTTP clients such as `httpx` or `requests`, `curl_cffi` can
impersonate browsers' TLS signatures or JA3 fingerprints. If you are blocked by some
website for no obvious reason, you can give this package a try.

## Features

- Supports JA3/TLS and HTTP/2 fingerprint impersonation.
- Much faster than requests/httpx, on par with aiohttp/pycurl, see [benchmarks](https://github.com/yifeikong/curl_cffi/tree/master/benchmark).
- Mimics the requests API, so there is no need to learn another one.
- Pre-compiled, so you don't have to compile on your machine.
- Supports `asyncio` with proxy rotation on each request.
- Supports HTTP/2, which requests does not.

|library|requests|aiohttp|httpx|pycurl|curl_cffi|
|---|---|---|---|---|---|
|http2|❌|❌|✅|✅|✅|
|sync|✅|❌|✅|✅|✅|
|async|❌|✅|✅|❌|✅|
|fingerprints|❌|❌|❌|❌|✅|
|speed|🐇|🐇🐇|🐇|🐇🐇|🐇🐇|

## Install

    pip install curl_cffi --upgrade

This should work on Linux (x86_64/aarch64), macOS (Intel/Apple Silicon) and Windows (amd64).
If it does not work on your platform, you may need to compile and install `curl-impersonate`
first and set some environment variables like `LD_LIBRARY_PATH`.

To install beta releases:

    pip install curl_cffi --pre

## Usage

### requests-like

```python
from curl_cffi import requests

# Notice the impersonate parameter
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110", proxies=proxies)
```
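
The `proxies` dictionary above is the same for every call. A minimal sketch of rotating through several proxies, one per request, could look like the following. The pool addresses are placeholders and `next_proxies` is a hypothetical helper written for illustration, not part of `curl_cffi`:

```python
from itertools import cycle

# Hypothetical proxy pool; replace these placeholders with real proxy URLs.
PROXY_POOL = [
    "http://localhost:3128",
    "http://localhost:3129",
]
_rotation = cycle(PROXY_POOL)


def next_proxies():
    """Return a requests-style proxies dict using the next proxy in the pool."""
    return {"https": next(_rotation)}


# Each call yields a dict that could be passed as `proxies=` to a request.
first = next_proxies()
second = next_proxies()
third = next_proxies()  # the pool has two entries, so this wraps around
print(first, second, third)
```

Round-robin is only one possible policy; a random choice or a health-checked pool would fit the same `proxies=` argument.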

### Sessions

```python
from curl_cffi import requests

# sessions are supported
s = requests.Session()
# httpbin is a http test website
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>
r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}
```

Supported impersonate versions, as provided by [curl-impersonate](https://github.com/lwthiker/curl-impersonate):

- chrome99
- chrome100
- chrome101
- chrome104
- chrome107
- chrome110
- chrome99_android
- edge99
- edge101
- safari15_3
- safari15_5
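
Since the `impersonate` parameter is passed as a plain string, it can help to validate it before sending a request. The sketch below copies the names from the list above into a set; `SUPPORTED_TARGETS` and `check_target` are hypothetical names used for illustration and are not part of `curl_cffi`:

```python
# Target names copied from the list of supported impersonate versions.
SUPPORTED_TARGETS = {
    "chrome99", "chrome100", "chrome101", "chrome104", "chrome107",
    "chrome110", "chrome99_android", "edge99", "edge101",
    "safari15_3", "safari15_5",
}


def check_target(name):
    """Return `name` if it is a listed target, otherwise raise ValueError."""
    if name not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported impersonate target: {name!r}")
    return name


print(check_target("chrome110"))
```

The returned value could then be passed as `impersonate=check_target("chrome110")`.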

### asyncio

```python
import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as s:
        r = await s.get("https://example.com")

asyncio.run(main())
```

More concurrency:

```python
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        tasks = []
        for url in urls:
            # Queue one request per URL; nothing runs until it is awaited
            task = s.get(url)
            tasks.append(task)
        # Run all requests concurrently and collect the responses in order
        results = await asyncio.gather(*tasks)

asyncio.run(main())
```
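
With a long URL list, `asyncio.gather` starts every request at once. A minimal sketch of capping the number of requests in flight with `asyncio.Semaphore` follows. To keep it runnable without network access, `fake_get` is a hypothetical stand-in for `s.get(url)`, and `fetch_all` and its `limit` parameter are illustrative names only:

```python
import asyncio


async def fake_get(url):
    """Stand-in for `s.get(url)`; sleeps briefly instead of using the network."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


async def fetch_all(urls, limit=2):
    """Run fake_get over `urls` with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        # Wait for a free slot before starting the request
        async with semaphore:
            return await fake_get(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(u) for u in urls))


results = asyncio.run(
    fetch_all(["https://example.com/a", "https://example.com/b", "https://example.com/c"])
)
print(results)
```

In real use, the body of `bounded` would presumably await `s.get(url)` on a shared `AsyncSession` instead of `fake_get`.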

### curl-like

Alternatively, you can use the low-level curl-like API:

```python
from curl_cffi import Curl, CurlOpt
from io import BytesIO

buffer = BytesIO()
c = Curl()
c.setopt(CurlOpt.URL, b'https://tls.browserleaks.com/json')
c.setopt(CurlOpt.WRITEDATA, buffer)

c.impersonate("chrome110")

c.perform()
c.close()
body = buffer.getvalue()
print(body.decode())
```
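
The body collected in `buffer` is raw bytes. As a rough sketch, it could be decoded as JSON and checked against an expected fingerprint. The payload below is an illustrative stand-in rather than a real response; only the `ja3n_hash` key and its value are taken from the requests-like example, and `EXPECTED_JA3N` is a hypothetical name:

```python
import json
from io import BytesIO

# Illustrative stand-in for the bytes written by c.perform();
# a real response contains many more fields.
buffer = BytesIO(b'{"ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc"}')

# Fingerprint value shown in the requests-like example output.
EXPECTED_JA3N = "aa56c057ad164ec4fdcb7a5a283be9fc"

data = json.loads(buffer.getvalue().decode())
matches = data.get("ja3n_hash") == EXPECTED_JA3N
print(matches)
```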

See the [docs](https://curl-cffi.readthedocs.io) for more details.

If you are using scrapy, check out this middleware: [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint)

## Acknowledgement

- Originally forked from [multippt/python_curl_cffi](https://github.com/multippt/python_curl_cffi), which is under the MIT license.
- Headers/Cookies files are copied from [httpx](https://github.com/encode/httpx/blob/master/httpx/_models.py), which is under the BSD license.
- Asyncio support is inspired by Tornado's curl http client.