
[Python Tutorial] Python Network Requests and Web Crawling Basics

admin | 2 months ago (03-18) | Python | 80


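requests is one of the most widely used HTTP libraries for Python. This article covers the basics of making network requests with it and of writing a simple crawler.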

1. Basic Requests

import requests

# GET request
response = requests.get('https://api.example.com/data')
print(response.status_code)  # HTTP status code
print(response.text)         # response body decoded as text
print(response.json())       # body parsed as JSON (fails if the body is not valid JSON)

# GET request with query parameters
params = {'q': 'python', 'page': 1}
response = requests.get('https://api.example.com/search', params=params)

# POST request with form data
data = {'username': 'user', 'password': 'pass'}
response = requests.post('https://api.example.com/login', data=data)
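
To see what these calls actually put on the wire, a request can be built without being sent. The sketch below uses requests.Request(...).prepare(), which only constructs the request, to show how params end up in the URL and how data= (form-encoded body) differs from json= (JSON body). The URLs are the article's placeholders, and the json= variant of the login call is an assumption added here for comparison.

```python
import json
import requests

# Query parameters are URL-encoded and appended to the URL.
search = requests.Request(
    'GET', 'https://api.example.com/search',
    params={'q': 'python', 'page': 1},
).prepare()
print(search.url)

credentials = {'username': 'user', 'password': 'pass'}

# data= produces an HTML-form style body.
form_login = requests.Request(
    'POST', 'https://api.example.com/login', data=credentials
).prepare()
print(form_login.headers['Content-Type'])
print(form_login.body)

# json= serializes the dict to JSON and sets the Content-Type to match.
json_login = requests.Request(
    'POST', 'https://api.example.com/login', json=credentials
).prepare()
print(json_login.headers['Content-Type'])
print(json.loads(json_login.body))
```

Which of the two body formats to use depends on what the server expects; many JSON APIs reject form-encoded bodies.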

2. Request Headers

headers = {
    'User-Agent': 'Mozilla/5.0',
    'Accept': 'application/json',
    'Authorization': 'Bearer token123'
}

response = requests.get('https://api.example.com/data', headers=headers)
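
One reason to set headers explicitly is that requests otherwise identifies itself with its own default User-Agent, which some sites may treat differently from a browser. The sketch below prints the library defaults and then prepares (without sending) a request carrying the headers from this section. It also shows that header lookup is case-insensitive. The token value is the article's placeholder, not a real credential.

```python
import requests

# Headers that requests adds when none are specified.
defaults = requests.utils.default_headers()
print(defaults['User-Agent'])  # starts with "python-requests/"

headers = {
    'User-Agent': 'Mozilla/5.0',
    'Accept': 'application/json',
    'Authorization': 'Bearer token123',  # placeholder token
}

# Build the request without sending it, then inspect its headers.
prepared = requests.Request(
    'GET', 'https://api.example.com/data', headers=headers
).prepare()

# Header names are matched case-insensitively.
print(prepared.headers['user-agent'])
print(prepared.headers['AUTHORIZATION'])
```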

3. Session Management

session = requests.Session()

# A session keeps cookies between requests, so a login carries over
session.post('https://example.com/login', data={'user': 'admin'})
response = session.get('https://example.com/profile')

# Use the session as a context manager so it is closed automatically
with requests.Session() as session:
    session.get('https://example.com')
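
Besides cookies, a session can carry shared headers and a retry policy for every request made through it. The following is a minimal sketch, assuming a hypothetical helper named build_session and illustrative values for the User-Agent string, retry count, and status codes; none of these come from the original article. It relies on HTTPAdapter from requests and Retry from urllib3.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session():
    """Return a Session with shared headers and automatic retries (illustrative values)."""
    session = requests.Session()
    # Headers set here are merged into every request sent through the session.
    session.headers.update({'User-Agent': 'my-crawler/0.1', 'Accept': 'application/json'})

    # Retry up to 3 times on common transient server errors, with backoff between tries.
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session


session = build_session()

# prepare_request applies the session settings without sending anything.
prepared = session.prepare_request(requests.Request('GET', 'https://example.com/profile'))
print(prepared.headers['User-Agent'])
```

Reusing one session for many requests to the same host also lets the underlying connections be reused, which is usually faster than calling requests.get repeatedly.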

4. File Upload and Download

# Upload a file (the with block closes the file handle afterwards)
with open('report.xls', 'rb') as f:
    response = requests.post('https://example.com/upload', files={'file': f})

# Download a file
response = requests.get('https://example.com/image.jpg')
with open('image.jpg', 'wb') as f:
    f.write(response.content)
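
Reading response.content loads the whole body into memory, which is fine for a small image but not for large files. A common alternative is to stream the body in chunks with stream=True and iter_content. The sketch below assumes two hypothetical helpers, save_stream and download_file, and an arbitrary chunk size of 8192 bytes.

```python
import requests


def save_stream(response, dest_path, chunk_size=8192):
    """Write a streamed response to dest_path chunk by chunk; return the bytes written."""
    written = 0
    with open(dest_path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


def download_file(url, dest_path, timeout=10):
    """Download url to dest_path without holding the whole body in memory."""
    # stream=True defers reading the body until iter_content is called.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return save_stream(response, dest_path)
```

A call such as download_file('https://example.com/image.jpg', 'image.jpg') would then replace the two-step download shown above.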

5. Exception Handling

try:
    response = requests.get('https://api.example.com', timeout=5)
    response.raise_for_status()  # raises HTTPError for 4xx/5xx status codes
except requests.exceptions.Timeout:
    print('Request timed out')
except requests.exceptions.HTTPError as e:
    print(f'HTTP error: {e}')
except requests.exceptions.RequestException as e:
    print(f'Request failed: {e}')
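
In practice this pattern is often wrapped in a small helper so callers do not repeat the try/except block. The sketch below assumes a hypothetical function fetch_json that returns None on any failure. Because RequestException is the base class of the requests exceptions, a single except clause covers timeouts, HTTP errors, and malformed URLs. The final call uses a deliberately malformed URL, which fails before any network traffic happens.

```python
import requests


def fetch_json(url, timeout=5):
    """Return the decoded JSON body of url, or None if the request or decoding fails."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        # Covers Timeout, HTTPError, ConnectionError, invalid URLs, and so on.
        print(f'Request failed: {e}')
        return None

    try:
        return response.json()
    except ValueError as e:
        # JSON decoding errors are ValueError subclasses.
        print(f'Invalid JSON: {e}')
        return None


# A URL without a scheme is rejected while the request is being prepared.
result = fetch_json('not-a-valid-url')
print(result)
```

Returning None keeps the example short; raising a custom exception or returning an error object are equally reasonable designs.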

6. A Simple Crawler Example

import requests
from bs4 import BeautifulSoup

def crawl_page(url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # Extract every link on the page
    links = soup.find_all('a')
    for link in links:
        href = link.get('href')
        text = link.get_text(strip=True)
        print(f'{text}: {href}')

crawl_page('https://example.com')
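
The href values printed above are often relative paths, repeated, or not web links at all (mailto: and similar). A crawler that wants to follow links usually normalizes them first. The sketch below assumes a hypothetical helper extract_links and a made-up HTML snippet. It resolves each href against the page URL with urljoin, keeps only http and https links, and drops duplicates while preserving page order.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, de-duplicated, in page order."""
    soup = BeautifulSoup(html, 'html.parser')
    seen = set()
    links = []
    # href=True skips <a> tags that have no href attribute.
    for a in soup.find_all('a', href=True):
        absolute = urljoin(base_url, a['href'])
        if urlparse(absolute).scheme not in ('http', 'https'):
            continue  # skip mailto:, javascript:, and other schemes
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


# Made-up sample page for illustration.
sample_html = '''
<a href="/about">About</a>
<a href="page2.html">Next</a>
<a href="https://other.example.org/x">External</a>
<a href="mailto:admin@example.com">Mail</a>
<a href="/about">About again</a>
<a>No href</a>
'''

links = extract_links(sample_html, 'https://example.com/docs/index.html')
print(links)
```

In crawl_page, the same helper could be called as extract_links(response.text, response.url) so that relative links are resolved against the final URL after redirects.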

Tags: Python, crawler, requests, network programming


Copyright Duuu.net Duuu笔记. Some Rights Reserved.

Powered By Z-BlogPHP. Theme by Duuu笔记.
