Requests
requests
模块 主要用于发送请求获取响应,该模块有很多的替代模块,比如 urllib
模块 或 httpx
模块等。
文档地址:Requests Quickstart
安装
requests 模块是一个第三方模块,需要在 python (虚拟)环境中额外安装,使用下面的命令直接在终端中安装:
pip install requests
安装 requests 输出
➜ pip install requests
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/, http://pypi.douban.com/simple/, http://mirrors.aliyun.com/pypi/simple/
Collecting requests
Downloading http://mirrors.aliyun.com/pypi/packages/ca/91/6d9b8ccacd0412c08820f72cebaa4f0c0441b5cda699c90f618b6f8a1b42/requests-2.28.1-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.8/62.8 kB 1.0 MB/s eta 0:00:00
Collecting idna<4,>=2.5
Downloading http://mirrors.aliyun.com/pypi/packages/04/a2/d918dcd22354d8958fe113e1a3630137e0fc8b44859ade3063982eacd2a4/idna-3.3-py3-none-any.whl (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.2/61.2 kB 973.2 kB/s eta 0:00:00
Collecting certifi>=2017.4.17
Downloading http://mirrors.aliyun.com/pypi/packages/e9/06/d3d367b7af6305b16f0d28ae2aaeb86154fa91f144f036c2d5002a5a202b/certifi-2022.6.15-py3-none-any.whl (160 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.2/160.2 kB 1.8 MB/s eta 0:00:00
Collecting charset-normalizer<3,>=2
Downloading http://mirrors.aliyun.com/pypi/packages/94/69/64b11e8c2fb21f08634468caef885112e682b0ebe2908e74d3616eb1c113/charset_normalizer-2.1.0-py3-none-any.whl (39 kB)
Collecting urllib3<1.27,>=1.21.1
Downloading http://mirrors.aliyun.com/pypi/packages/d1/cb/4783c8f1a90f89e260dbf72ebbcf25931f3a28f8f80e2e90f8a589941b19/urllib3-1.26.11-py2.py3-none-any.whl (139 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.9/139.9 kB 1.3 MB/s eta 0:00:00
Installing collected packages: urllib3, idna, charset-normalizer, certifi, requests
Successfully installed certifi-2022.6.15 charset-normalizer-2.1.0 idna-3.3 requests-2.28.1 urllib3-1.26.11
请求方法
import requests
# 简单请求方法 get、post、put、delete、head、options
r = requests.get('https://httpbin.org/get')
r = requests.post('https://httpbin.org/post', data={"key": "value"}) # 传递字典给请求主体
r = requests.put('https://httpbin.org/put', data={"key": "value"}) # 传递字典给请求主体
r = requests.delete('https://httpbin.org/delete')
r = requests.head('https://httpbin.org/get')
r = requests.options('https://httpbin.org/get')
发送请求
发送 GET
请求
import requests
r = requests.get('https://httpbin.org/json')
print(r.request.headers) # 获取响应对象的请求头
print(r.url) # 获取响应 URL
print(r.status_code) # 获取响应状态码
print(r.headers['content-type']) # 获取响应头 Content-Type
print(r.content) # 获取二进制响应内容
print(r.encoding) # 获取响应的编码字符集,也可以设置编码字符集
print(r.text) # 获取响应文本内容
print(r.json()) # 获取JSON输出内容
查看响应对象
使用 dir
函数查看响应对象
r = requests.get('https://httpbin.org/get')
print(dir(r))
查看属性详情
使用 help
函数查看属性详情
r = requests.get('https://httpbin.org/get')
print(help(r))
print(help(r.request))
自定义请求头
如果想将自定义的 HTTP 请求头添加到请求中,只需将 dict 字典传递给 headers
参数。
import requests
r = requests.get('https://httpbin.org/get', headers={'user-agent': 'awesome-app/1.0.0'})
print(r.request.headers)
传递参数
requests
支持直接将字典参数传递给 params
参数。
import requests
payload = {"key": "value", "key2": "value2"}
r = requests.get('https://httpbin.org/get', params=payload)
print(r.url) # https://httpbin.org/get?key2=value2&key1=value1
# 将列表作为值传递
payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
r = requests.get('https://httpbin.org/get', params=payload)
print(r.url) # https://httpbin.org/get?key1=value1&key2=value2&key2=value3
任何值为
None
的字典键都不会添加到 URL 的查询字符串中。
请求主体
传递字典
发送 POST 请求时可以传递字典作为请求主体。
import requests
payload = {'username': 'curder', 'password': 'test'}
r = requests.post('https://httpbin.org/post', data=payload)
print(r.text)
r_dict = r.json()
print(r_dict.get('form')) # 返回响应字典
传递 json
import requests
url = 'https://httpbin.org/post'
payload = {'some': 'data'}
r = requests.post(url, json=payload)
print(r.text)
HTTP 身份验证
requests 可以发送 HTTP身份验证 的信息给服务端。
auth = ('curder', 'test')
r = requests.get('https://httpbin.org/basic-auth/curder/test', auth=auth)
print(r.ok)
请求超时时间
默认如果没有明确指定超时,则请求不会超时。
import requests
try:
r = requests.get('https://httpbin.org/delay/3', timeout=2) # 设置请求超时时间为 2 秒
except requests.ReadTimeout:
print('read timeout')
# send mail or notification.
timeout
不是整个响应下载的时间限制;相反,如果服务器在 timeout
秒内没有发出响应(更准确地说,如果在 timeout
秒内底层套接字上没有收到任何字节),则会引发异常。
响应状态码
- 正常响应
import requests
r = requests.get('https://httpbin.org/get')
print(r.status_code, r.status_code == requests.codes.ok) # 200 True
- 404响应
import requests
r = requests.get('https://httpbin.org/status/404')
print(r.status_code) # 404
r.raise_for_status # 获取错误详情,如果没有错误则返回 None
响应头
import requests
r = requests.get('https://httpbin.org/get')
print(r.headers) # 获取所有响应头
print(r.headers['Content-Type']) # 获取某个响应头
print(r.headers.get('content-type')) # 使用 get 方法获取具体响应头
响应头名不区分大小写
Cookies
获取响应 Cookies 在 httpbin.org 站点有一个
https://httpbin.org/cookies
的 GET 请求地址,响应中返回当前 cookies 值。我们可以借助它查看我们请求时设置的 cookies 是否成功。例如:pythonimport requests cookies = {"chocolate": "chip"} # 配置 cookies 字典 r = requests.get('https://httpbin.org/cookies', cookies=cookies) # 获取 cookies print(r.text)
使用
RequestsCookieJar
自定义Cookies 也可以通过requests.cookies.RequestsCookieJar()
初始化一个 Cookies,对 Cookies 进行精细化的配置。pythonjar = requests.cookies.RequestsCookieJar() jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies') jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere') url = 'https://httpbin.org/cookies' r = requests.get(url, cookies=jar) print(r.text)
自定义 Cookies 请求 也可以通过上下问的
cookies
属性设置 Cookie,并将自定义的cookie
发送到服务器。pythonimport requests with requests.Session() as session: jar = session.cookies jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies') jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere') # 不会被设置,因为访问的不是这个uri url = 'https://httpbin.org/cookies' r = requests.get(url, cookies=jar) print(r.text)
重定向和请求记录
通过 history
属性对响应对象的历史属性来跟踪重定向。
- 从http跳转到https
import requests
# 模拟 302 跳转
r = requests.get('https://httpbin.org/redirect-to', params={"url": "https://www.baidu.com"})
print(r.url) # https://www.baidu.com
print(r.status_code) # 200
print(r.history) # [<Response [302]>]
如果使用 GET
、OPTIONS
、POST
、PUT
、PATCH
或 DELETE
,可以使用 allow_redirects
参数禁用重定向处理:
import requests
r = requests.get('http://github.com/', allow_redirects=False)
print(r.status_code) # 301
print(r.history) # []
如果使用 HEAD
请求方法发起请求,也可以启用重定向:
import requests
r = requests.head('https://httpbin.org/redirect-to', params={"url": "https://www.baidu.com"}, allow_redirects=True)
print(r.url) # https://www.baidu.com
print(r.history) # [<Response [302]>]
请求上下文
使用请求上下文模拟登录后在发送请求的操作。
import requests
login_url = 'https://passport.17k.com/ck/user/login' # 登录URL
login_form_data = {
"loginName": "", # 输入登录用户名
"password": "mxb_DVZ9hyk.dhk3", # 输入密码
}
book_self_url = 'https://user.17k.com/ck/author/shelf?page=1&appKey=2406394919' # 登录后要操作的URL
with requests.Session() as session: # 记录请求当前会话
# 登录
login_response = session.post(url=login_url, data=login_form_data)
# print(login_response.cookies) # 查看服务器响应cookies
# 获取我的书架
book_self_response = session.get(book_self_url).json()
# 打印输出JSON响应结果
print(book_self_response)
跨请求共用配置
公用请求头
import requests
with requests.Session() as session:
headers = {'User-Agent': 'my-app/1.0.0'}
session.headers.update(headers)
r = session.get('https://httpbin.org/headers')
r2 = session.get('https://httpbin.org/get')
print(r.json()['headers']['User-Agent'], r2.json()['headers'].get('User-Agent'), sep="\n")