Python HTTP Client Libraries Compared: How to Choose, from Synchronous requests to Asynchronous aiohttp

Posted on 2026-1-22 05:44:13

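Still reaching for requests for everything? You might be told that you only know Python's "Swiss Army knife" and have never opened the rest of the toolbox.

1. Why are there so many HTTP request libraries?

In the Python world, fetching data from the web comes with a pleasant problem: there are too many options.

Open any project and you may find one person using requests, another using urllib, and a third using aiohttp. That raises the question: why does Python need so many HTTP request libraries? Isn't one enough?

Different tools serve different purposes. The HTTP client libraries in the Python ecosystem are like the tools in a toolbox: each has a scenario it fits best.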

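requests has become the default choice thanks to its clean, elegant API, and it fits the vast majority of everyday request scenarios. aiohttp is built for asynchronous, high-concurrency workloads and, given suitable hardware and upstream servers, can sustain thousands of requests per second. httpx combines the strengths of both: it offers synchronous and asynchronous clients and supports HTTP/2 through an optional extra (httpx[http2]). The standard library's urllib has a more verbose API, but it is the fallback that needs no third-party dependency.

This diversity lets developers pick the tool that matches their performance needs, protocol requirements, and programming model. From a simple scraping script to a high-performance backend service, there is an HTTP client that fits.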
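As a rough illustration of the standard-library fallback mentioned above, here is a minimal sketch that uses only urllib. The helper names build_request and fetch_json, the query parameters, and the User-Agent value are assumptions made for this example; the httpbin.org URL is the one used elsewhere in this post.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url, params, user_agent="MyApp/1.0"):
    """Build a GET request with an encoded query string and a User-Agent header."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(
        f"{base_url}?{query}",
        headers={"User-Agent": user_agent},
        method="GET",
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (performs real network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("https://httpbin.org/get", {"q": "python http", "page": 1})
print(request.full_url)
# fetch_json(request) would send it; it is left uncalled so the sketch runs offline.
```

Everything that requests does in one call (query encoding, headers, JSON decoding) has to be spelled out by hand here, which is the trade-off for having no dependency to install.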

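2. Synchronous request libraries

Asynchronous code is on the rise in Python's HTTP landscape, but synchronous request libraries remain solid workhorses. They suit scripts, crawlers, and API calls well, especially when you do not need to handle thousands of concurrent connections.

1. requests (the mainstream default)

Requests is the most classic and elegant HTTP client library in Python. Following its "HTTP for Humans" design philosophy, it makes otherwise fiddly network requests simple and intuitive.

Installation:

pip install requests

Basic usage:

import requests

# GET request
response = requests.get('https://api.github.com')
print(response.status_code)
print(response.json())

# POST request (form-encoded body)
data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', data=data)

Advanced usage: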

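import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

url = 'https://httpbin.org/get'  # placeholder URL for illustration

# Session: keeps default headers and cookies, and reuses TCP connections
session = requests.Session()
session.headers.update({'User-Agent': 'MyApp/1.0'})

# Timeouts: 3.05 s to connect, 27 s to read
requests.get(url, timeout=(3.05, 27))

# Stream a large response in chunks instead of loading it into memory
with requests.get(url, stream=True) as r:
    for chunk in r.iter_content(chunk_size=8192):
        process_chunk(chunk)  # process_chunk stands for your own handler

# Retries with exponential backoff
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504]
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount('https://', adapter)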
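A mounted adapter only handles URLs that start with the prefix it is mounted on, and only for requests sent through that session. The sketch below bundles the pieces into one factory function; make_session and its parameter names are hypothetical, chosen for this example.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(total_retries=3, backoff_factor=1.0):
    """Return a Session whose HTTP and HTTPS requests share one retry policy."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff_factor,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.headers.update({"User-Agent": "MyApp/1.0"})
    # Mount the same adapter for both schemes so plain-HTTP URLs retry too.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_session()
print(session.get_adapter("http://example.com").max_retries.total)
```

Calls such as session.get(url, timeout=(3.05, 27)) then pick up the retry policy automatically, whereas module-level requests.get(...) calls do not. Note that urllib3's status-based retries apply by default only to idempotent methods, so a POST is not retried unless the Retry object is configured to allow it.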

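2. httpx (the modern alternative)

httpx is a modern HTTP client library for Python that supports both synchronous and asynchronous requests, speaks HTTP/1.1 and (optionally) HTTP/2, and offers an API broadly compatible with requests. It can be seen as an enhanced requests and a natural choice in the asyncio era.

Installation:

pip install httpx

Basic usage:

import httpx

# Simple request
response = httpx.get('https://httpbin.org/get')
print(response.text)

# An explicit client used as a context manager (pools connections, cleans up on exit)
with httpx.Client() as client:
    response = client.post('https://httpbin.org/post', json={'key': 'value'})

Advanced usage: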

import httpx

url = 'https://httpbin.org/get'  # placeholder URL for illustration

# HTTP/2 support (requires: pip install 'httpx[http2]')
client = httpx.Client(http2=True)

# Same API for sync and async code
# Sync
response = httpx.get(url)

# Async (same API)
async def fetch():
    async with httpx.AsyncClient() as client:
        return await client.get(url)

# Finer-grained timeouts: 10 s by default, 5 s to connect
timeout = httpx.Timeout(10.0, connect=5.0)
client = httpx.Client(timeout=timeout)

# Event hooks
def log_request(request):
    print(f"Request: {request.method} {request.url}")

def log_response(response):
    print(f"Response: {response.status_code}")

client = httpx.Client(
    event_hooks={'request': [log_request], 'response': [log_response]}
)

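3. urllib3 (the low-level foundation)

urllib3 is a rock-solid HTTP client library in the Python ecosystem, known for connection pooling, thread safety, and production-grade features. It is the foundation beneath higher-level libraries such as Requests.

Installation:

pip install urllib3

Basic usage:

import urllib3

http = urllib3.PoolManager()
response = http.request('GET', 'https://httpbin.org/robots.txt')
print(response.data.decode('utf-8'))

Advanced usage: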

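import urllib3

# Connection pool configuration
http = urllib3.PoolManager(
    num_pools=10,          # number of per-host pools kept in the cache
    maxsize=100,           # max connections kept per pool
    block=True,            # wait for a free connection when a pool is exhausted
    timeout=urllib3.Timeout(connect=2.0, read=7.0)
)

# Custom retry policy (pass it as retries=... to PoolManager or request())
retries = urllib3.Retry(
    total=5,                # total number of retries
    backoff_factor=0.5,     # backoff factor
    status_forcelist=[500, 502, 503, 504]
)

# Proxy support
proxy = urllib3.ProxyManager('http://proxy.example.com:3128/')

# Custom certificate verification
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs='/path/to/your/certfile.pem'
)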
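A Retry object takes effect only once it is handed to a pool manager or to an individual request. A minimal sketch of the pool-wide variant follows; the helper get_text is hypothetical and is left uncalled so the block runs offline.

```python
import urllib3

retries = urllib3.Retry(
    total=5,
    backoff_factor=0.5,
    status_forcelist=[500, 502, 503, 504],
)
timeout = urllib3.Timeout(connect=2.0, read=7.0)

# Pool-wide defaults: every request made through `http` inherits them.
http = urllib3.PoolManager(retries=retries, timeout=timeout)


def get_text(url):
    """Fetch a URL as text using the pool-wide retry and timeout defaults."""
    response = http.request("GET", url)
    return response.data.decode("utf-8")


print(http.connection_pool_kw["retries"].total)
```

A single call can still override the default, for example http.request('GET', url, retries=False) to disable retries for that one request.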

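4. treq

treq is an HTTP client library built on the Twisted framework. It offers a concise API modelled on requests but is asynchronous by design: its calls return Twisted Deferreds, so although it is listed in this section it is not a blocking client. It is a good fit for non-blocking network applications that already run on Twisted.

Installation:

pip install treq

Features: built on Twisted, with a requests-like API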

import treq
from twisted.internet import reactor

def print_response(response):
    print(f"Status: {response.code}")
    return treq.text_content(response)

d = treq.get('https://httpbin.org/get')
d.addCallback(print_response)
d.addBoth(lambda _: reactor.stop())
reactor.run()

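5. httplib2

httplib2 is a feature-rich HTTP client library. Although it has gradually been superseded by more modern libraries, it still has distinctive value in cache control, retry handling, and HTTP protocol completeness.

Installation:

pip install httplib2

Features: caching support, lightweight

import httplib2

http = httplib2.Http('.cache')  # enable the on-disk cache in the .cache directory
response, content = http.request('https://www.example.com')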

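3. Asynchronous request libraries

In today's high-concurrency internet applications, the traditional synchronous request model often cannot meet performance requirements. Imagine a crawler that has to fetch hundreds of pages at once, or an API gateway that has to call dozens of upstream services in parallel: this is where asynchronous request libraries shine.

1. aiohttp

aiohttp is an asynchronous HTTP client/server framework built on asyncio. It handles highly concurrent network requests well and is a strong base for high-performance asynchronous applications.

Installation and basic setup

pip install aiohttp
# Optional: install the speedups extra
pip install 'aiohttp[speedups]'  # optional accelerators such as aiodns and Brotli; the exact set depends on the aiohttp version

Basic usage example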

import aiohttp
import asyncio

async def fetch_page(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'https://api.example.com/data1',
        'https://api.example.com/data2',
        'https://api.example.com/data3'
    ]

    # Share one session so connections are pooled across requests
    async with aiohttp.ClientSession() as session:
        # Run all requests concurrently
        tasks = [fetch_page(session, url) for url in urls]
        results = await asyncio.gather(*tasks)

    for url, content in zip(urls, results):
        print(f"{url}: {len(content)} characters")

# Python 3.7+
asyncio.run(main())
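asyncio.gather starts every request at once, which can overwhelm a server when the URL list grows to hundreds of entries. One common way to cap the number of requests in flight is an asyncio.Semaphore. The sketch below uses only the standard library: fetch_all and fake_fetch are hypothetical names, and fake_fetch stands in for a real aiohttp call so the example runs offline.

```python
import asyncio


async def fetch_all(urls, fetch, limit=5):
    """Run fetch(url) for every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        async with semaphore:  # wait here while `limit` calls are already running
            return await fetch(url)

    return await asyncio.gather(*(guarded(url) for url in urls))


stats = {"in_flight": 0, "peak": 0}


async def fake_fetch(url):
    """Stand-in for a network call; records how many calls overlap."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return f"body of {url}"


urls = [f"https://api.example.com/data{i}" for i in range(20)]
results = asyncio.run(fetch_all(urls, fake_fetch, limit=5))
print(len(results), "responses, peak concurrency", stats["peak"])
```

With aiohttp, the fetch argument would be a coroutine that uses a shared ClientSession. The connector's limit and limit_per_host settings provide a similar cap at the connection level.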

Advanced features

# 1. Connection pool and timeouts
# (create the session inside a coroutine and close it when you are done)
timeout = aiohttp.ClientTimeout(total=10, connect=3)
conn = aiohttp.TCPConnector(limit=100, limit_per_host=20)
session = aiohttp.ClientSession(
    connector=conn,
    timeout=timeout,
    headers={'User-Agent': 'MyApp/1.0'}
)

# 2. Automatic retries (requires: pip install aiohttp-retry)
from aiohttp_retry import RetryClient, ExponentialRetry

retry_options = ExponentialRetry(attempts=5)
retry_client = RetryClient(
    retry_options=retry_options,
    raise_for_status=False
)

# 3. WebSocket support (run inside a coroutine; session is the ClientSession from step 1)
async with session.ws_connect('ws://echo.websocket.org') as ws:
    await ws.send_str('Hello World!')
    async for msg in ws:
        if msg.type == aiohttp.WSMsgType.TEXT:
            print(msg.data)
            break  # stop after the first echoed message
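2. httpx

Yes, httpx again. It is not only a good synchronous library but also a capable asynchronous client for the asyncio ecosystem.

Installation

pip install httpx
# HTTP/2 support (optional but recommended)
pip install 'httpx[http2]'

Quick start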

import httpx
import asyncio

async def fetch_concurrently():
    async with httpx.AsyncClient(
        timeout=10.0,
        limits=httpx.Limits(max_connections=100)
    ) as client:
        # Single request
        response = await client.get('https://httpbin.org/get')
        print(response.json())

        # Batch of requests, sent concurrently over the shared client
        tasks = [
            client.get(f'https://api.example.com/users/{i}')
            for i in range(100)
        ]
        responses = await asyncio.gather(*tasks)

        # Streaming response
        async with client.stream('GET', 'https://example.com/large-file') as response:
            async for chunk in response.aiter_bytes():
                process_chunk(chunk)  # process_chunk stands for your own handler

asyncio.run(fetch_concurrently())

Advanced usage

import asyncio
import httpx

# 1. Sync and async modes side by side
# Async client
async_client = httpx.AsyncClient()
# Sync client (API broadly compatible with requests)
sync_client = httpx.Client()

# 2. Custom transports
# Call an ASGI application directly, without a network socket
# (my_asgi_app is a placeholder for your own ASGI app; run inside a coroutine)
async with httpx.AsyncClient(
    transport=httpx.ASGITransport(app=my_asgi_app),
    base_url="http://testserver"
) as client:
    response = await client.get("/api/data")

# 3. Event hooks for logging and response checks
# (hooks on an AsyncClient must be async functions)
async def log_request(request):
    print(f"Request: {request.method} {request.url}")

async def log_response(response):
    print(f"Response: {response.status_code}")

client = httpx.AsyncClient(event_hooks={
    'request': [log_request],
    'response': [log_response]
})

# 4. HTTP/2 multiplexing (run inside a coroutine)
async with httpx.AsyncClient(http2=True) as client:
    # Requests to the same origin can share one HTTP/2 connection
    # when the server negotiates HTTP/2
    await asyncio.gather(
        client.get('https://example.com/api1'),
        client.get('https://example.com/api2'),
        client.get('https://example.com/api3')
    )

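3. grequests

grequests builds on gevent to run requests calls concurrently, so you keep a near-synchronous requests style while sending many requests at once. Think of it as requests plus gevent's cooperative greenlets; it does not use asyncio.

Installation

pip install grequests

Usage (gevent-based)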

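import grequests

urls = [
    'http://www.heroku.com',
    'http://python-tablib.org',
    'http://httpbin.org'
]

# Build the set of unsent requests
rs = (grequests.get(u) for u in urls)

# Send them concurrently, at most 10 at a time
responses = grequests.map(rs, size=10)

for r in responses:
    if r is not None:  # map() yields None for requests that failed
        print(r.status_code, r.url)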

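4. Libraries for special scenarios

1. websockets

websockets is a Python library dedicated to the WebSocket protocol. It establishes a full-duplex, real-time channel between client and server and is designed for scenarios that need low-latency, two-way data exchange.

Use cases
Real-time chat, stock quote feeds, online games, IoT device control, and other scenarios that need full-duplex communication.

Installation

pip install websockets

Getting started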

import asyncio
import websockets

# WebSocket client
async def websocket_client():
    async with websockets.connect('ws://localhost:8765') as websocket:
        # Send a message
        await websocket.send("Hello, WebSocket!")
        # Receive a message
        response = await websocket.recv()
        print(f"Received: {response}")

asyncio.run(websocket_client())

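2. grpcio

grpcio is the Python package for gRPC, a high-performance RPC framework built on HTTP/2 and Protocol Buffers. It makes calls between microservices feel like local calls while keeping them efficient and type-safe.

Use cases
Communication between microservices, strongly typed interfaces, and internal service calls with high performance requirements.

Installation

pip install grpcio grpcio-tools

Getting started
First define a protobuf file (example.proto):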

syntax = "proto3";

package example;

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}

Compile it to generate the Python code:

python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. example.proto

Client usage:

import grpc
import example_pb2
import example_pb2_grpc

def run():
    # Open a channel (insecure: no TLS, suitable for local testing only)
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = example_pb2_grpc.GreeterStub(channel)

        # Build the request
        request = example_pb2.HelloRequest(name='World')

        # Send the request
        response = stub.SayHello(request)
        print(f"Received response: {response.message}")

if __name__ == '__main__':
    run()

3. Other protocols: more libraries worth knowing

a. Mail protocols: imaplib/smtplib

import imaplib
import email
from email.header import decode_header

# Read mail (the credentials below are placeholders)
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('your_email@gmail.com', 'your_password')
mail.select('INBOX')

# Search for unread messages
status, messages = mail.search(None, 'UNSEEN')
for num in messages[0].split():
    status, msg_data = mail.fetch(num, '(RFC822)')
    email_message = email.message_from_bytes(msg_data[0][1])

    # Decode the subject (first fragment only)
    subject, encoding = decode_header(email_message['Subject'])[0]
    if isinstance(subject, bytes):
        subject = subject.decode(encoding if encoding else 'utf-8')

    print(f'Subject: {subject}')

mail.logout()
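decode_header can return several fragments, each with its own charset, and the snippet above reads only the first one. A small sketch of a more general helper built on the standard library's make_header follows; the name decode_subject and the sample header values are assumptions made for this example.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded header value into a plain str."""
    if raw_subject is None:  # the message has no Subject header
        return ""
    # make_header joins all decoded fragments, honoring each fragment's charset.
    return str(make_header(decode_header(raw_subject)))


print(decode_subject("=?utf-8?b?5L2g5aW9?="))
print(decode_subject("Plain subject"))
```

Inside the loop, this would be called as decode_subject(email_message['Subject']), which also covers messages that have no subject at all.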

b. FTP/FTPS: ftplib

from ftplib import FTP_TLS

# Secure FTP connection
ftp = FTP_TLS('ftp.example.com')
ftp.login('user', 'pass')
ftp.prot_p()  # switch the data connection to TLS as well

# Upload a file
with open('local_file.txt', 'rb') as f:
    ftp.storbinary('STOR remote_file.txt', f)

# Download a file
with open('local_copy.txt', 'wb') as f:
    ftp.retrbinary('RETR remote_file.txt', f.write)

ftp.quit()

c. SSH/SFTP: paramiko

import paramiko

# SSH connection
ssh = paramiko.SSHClient()
# AutoAddPolicy trusts unknown host keys automatically:
# convenient for a demo, unsafe for production
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('hostname', username='user', password='pass')

# Run a command
stdin, stdout, stderr = ssh.exec_command('ls -la')
print(stdout.read().decode())

# Transfer files over SFTP
sftp = ssh.open_sftp()
sftp.put('local_file.txt', 'remote_file.txt')
sftp.get('remote_file.txt', 'local_copy.txt')
sftp.close()

ssh.close()

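Conclusion

Python's network programming ecosystem offers a wide choice of libraries. From classic synchronous requests to modern asynchronous processing, and from general-purpose HTTP to a range of specialized protocols, there are mature solutions.

As asynchronous programming in Python has matured, async libraries such as aiohttp and httpx have become the go-to choice for highly concurrent network requests. Synchronous libraries such as requests, however, remain hard to replace in everyday scenarios that do not need high concurrency, thanks to their simplicity and reliability.

As developers, we should choose tools based on actual project requirements, performance targets, and the team's technology stack, and understand their principles and best practices. That is how to write network applications that are both efficient and stable.

Hopefully this overview helps you make better technology choices for different network programming needs and get started quickly. If you want to discuss more Python or networking topics, you are welcome to share and exchange ideas in the 云栈社区 community.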

Python, HTTP clients, network programming, asynchronous programming, API development
