PyTorch Pocket Reference Notes, Part 2

2021-08-17 深思海数_willschang

PyTorch Pocket Reference
Download link for the original book (shared via Aliyun Drive):
File: 「OReilly.PyTorch.Pocket.R...odels.149209000X.pdf」
Link: https://www.aliyundrive.com/s/NZvnGbTYr6C

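Chapter 1: An Introduction to PyTorch

In recent years, PyTorch has become increasingly popular among researchers and engineers.

In this chapter, the author briefly introduces what PyTorch is and why it has become so popular. The chapter also covers how to set up PyTorch in the cloud or on a local machine. By the end, we will know how to verify that PyTorch is installed correctly and how to run a simple PyTorch program.

PyTorch is a free, open source AI framework released by Facebook's AI Research lab (FAIR), with more than 1,700 contributors to date.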

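Easy computation on array-like data structures
Dynamic neural network construction
Automatic differentiation with GPU acceleration
A simple, flexible interface that makes experimentation easy (loading data, applying transforms, building models, and writing your own training, validation, and test loops)
An active ecosystem that includes universities and well-known large companies
Thanks to these features, PyTorch keeps gaining users: some use it to accelerate tensor computation, others for deep learning development.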
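
The first and third features in the list can be illustrated with a few lines. This is only a minimal sketch with made-up values, not an example from the book: it does some array-style arithmetic on tensors and then lets autograd compute the gradient of y = sum(x^2), which should be 2x.

```python
import torch

# Pick the GPU when one is present, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Array-style arithmetic on tensors.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
b = torch.ones(2, 2, device=device)
total = a + b      # element-wise addition
product = a @ b    # matrix multiplication

# Automatic differentiation: y = sum(x^2), so dy/dx = 2x.
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
y = (x ** 2).sum()
y.backward()

print(total.tolist())    # [[2.0, 3.0], [4.0, 5.0]]
print(product.tolist())  # [[3.0, 3.0], [7.0, 7.0]]
print(x.grad.tolist())   # [2.0, 4.0, 6.0]
```

The same code runs unchanged on CPU or GPU because every tensor is created on `device`.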

Many developers and researchers use PyTorch to accelerate deep learning research experimentation and prototyping. Its simple Python API, GPU support, and flexibility make it a popular choice among academic and commercial research organizations.

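As its user base and use cases grow, PyTorch iterates quickly, and it also addresses the needs of different devices (cloud servers and mobile platforms).

A good AI framework simplifies many stages of applied AI work: data loading, preprocessing, model design, training, and deployment.

A deep learning framework makes it easy to perform common tasks such as data loading, preprocessing, model design, training, and deployment.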

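Key advantages of PyTorch

PyTorch is very popular in both academia and industry
PyTorch is supported on most mainstream cloud platforms (AWS, GCP, Azure, Alibaba Cloud)
PyTorch is supported by Google Colab and Kaggle competitions
PyTorch is mature and stable (currently at version 1.8; the latest release should be 1.9)
PyTorch supports CPU, GPU, TPU, and parallel processing
PyTorch supports distributed training
PyTorch applications are easy to deploy to cloud servers (TorchScript, TorchServe)
PyTorch is beginning to support deployment on mobile devices (Android, iOS)
PyTorch has a healthy ecosystem with complementary packages (NLP, CV)
PyTorch supports a C++ frontend
PyTorch supports ONNX (Open Neural Network Exchange)
PyTorch has an active community and a large user base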

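Installing and using PyTorch

Plenty of resources on this topic are available online, so I will not go into detail in these notes.

This part mainly covers using PyTorch in Google Colaboratory, installing it on a cloud server, and installing it on a local machine.
For details, see the official installation guide.
https://pytorch.org/get-started/locally/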

import torch

# Check the version
print(torch.__version__)
# Check whether GPU acceleration is available
print(torch.cuda.is_available())

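A fun example

The book downloads an image from the web directly with the urllib package. Because of network restrictions in China, I downloaded the image and saved it locally for use in the examples that follow.

Sample image download link: https://www.aliyundrive.com/s/LqmvEpYgWMy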

%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image

fpath = 'coffee.jpg'

img = Image.open(fpath)
plt.imshow(img)

# SIZE
print(img.size)
# CHANNELS  ('R', 'G', 'B')
print(img.getbands())
# RGB
print(img.mode)

[Figure: coffee.png]

From the output above, the image size is 1107 × 827. Next we use PyTorch to preprocess the image and convert it into the data type that PyTorch neural network models expect (a tensor).

import torch
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        )
])
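# Convert the PIL Image to a tensor
img_tensor = transform(img)
print(type(img_tensor), img_tensor.shape)
# <class 'torch.Tensor'> torch.Size([3, 224, 224])

# Convert the tensor back to a PIL Image
# image = img_tensor.cpu().clone()
# image = image.squeeze(0)  # drop the batch dimension, if present
# ToPILImage multiplies float values by 255; img_tensor is still normalized,
# so the displayed colors look distorted
image = transforms.ToPILImage()(img_tensor)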

plt.imshow(image)

[Figure: transform]
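
To see how the 1107 × 827 image ends up as 224 × 224, the arithmetic behind Resize(256) and CenterCrop(224) can be sketched in plain Python. The helper names below are hypothetical. The rule is my understanding of torchvision's behavior, not something stated in the book: the shorter side is matched to the target, the aspect ratio is kept, and the longer side is truncated to an integer.

```python
def resize_shorter_side(width, height, size):
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width, height, crop):
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop


# PIL reports size as (width, height); the sample image is 1107 x 827.
resized = resize_shorter_side(1107, 827, 256)
box = center_crop_box(resized[0], resized[1], 224)

print(resized)  # (342, 256)
print(box)      # (59, 16, 283, 240)
```

So the crop keeps the full height range 16..240 and discards 59 pixels on each side of the resized width.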

Compare the image before and after the transform:

plt.figure(figsize=(8, 8)) 
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.subplot(1, 2, 2)
plt.imshow(image)
plt.tight_layout()
plt.show()

[Figure: after transform]

In the example above, the image is converted through Compose(), which chains a series of transforms: resize, crop, conversion to a tensor, and normalization.
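
Conceptually, Compose() is just "apply each callable in order". The class below is a hypothetical stand-in written for illustration (it is not torchvision's implementation), and the three toy steps are placeholders for real transforms.

```python
class MiniCompose:
    """Hypothetical stand-in for transforms.Compose: apply callables in order."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __call__(self, value):
        # Feed the output of each step into the next one.
        for step in self.steps:
            value = step(value)
        return value


# Toy steps standing in for Resize, CenterCrop, ToTensor, ...
pipeline = MiniCompose([
    lambda v: v + 1,   # 3 -> 4
    lambda v: v * 2,   # 4 -> 8
    str,               # 8 -> "8"
])

result = pipeline(3)
print(result)  # 8
```

Because order matters, swapping two steps changes the result, which is why ToTensor() has to come before Normalize() in the pipeline above.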

Normalizing the image improves the accuracy of the classifier.
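
What Normalize() computes per channel is (x - mean) / std. The sketch below reproduces that arithmetic with plain tensor broadcasting on random stand-in pixels (not the coffee image) and then inverts it. The inversion step is my own addition: it is one way to get back to the [0, 1] range before displaying a normalized tensor.

```python
import torch

# Per-channel statistics used in the transform pipeline, shaped for broadcasting.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Stand-in for the output of ToTensor(): shape [3, H, W], values in [0, 1).
torch.manual_seed(0)
pixels = torch.rand(3, 4, 4)

# Forward: the same arithmetic Normalize applies to each channel.
normalized = (pixels - mean) / std

# Inverse: undo the normalization, then clamp to the displayable range.
restored = (normalized * std + mean).clamp(0.0, 1.0)

print(tuple(normalized.shape))
print(torch.allclose(restored, pixels, atol=1e-6))
```

Normalized values are no longer confined to [0, 1], so passing them straight to ToPILImage() gives distorted colors; the restored tensor does not have that problem.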

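Classifying an image with a pretrained model (AlexNet)

Efficient machine learning pipelines generally process data in batches, so the single image we just loaded needs an extra batch dimension.
In PyTorch, calling unsqueeze() adds that dimension, turning the shape into [1, 3, 224, 224]: 1 is the batch size, 3 is the number of RGB channels, and 224 × 224 is the input image size.

Download link for the ImageNet labels file used in the example: https://www.aliyundrive.com/s/NoLcfzUeXLv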

batch = img_tensor.unsqueeze(0)
print(batch.shape)
# torch.Size([1, 3, 224, 224])
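
For comparison, here is a small sketch using zero-filled placeholder tensors instead of real images. It shows that unsqueeze(0) and squeeze(0) are inverses, and that several images of the same size can be combined into one batch with torch.stack().

```python
import torch

# Placeholder for one preprocessed image: [channels, height, width].
single = torch.zeros(3, 224, 224)

batch_of_one = single.unsqueeze(0)   # add a batch dimension at position 0
restored = batch_of_one.squeeze(0)   # remove it again

# Several same-sized images become one batch along a new leading dimension.
images = [torch.zeros(3, 224, 224) for _ in range(4)]
batch_of_four = torch.stack(images, dim=0)

print(batch_of_one.shape)   # torch.Size([1, 3, 224, 224])
print(restored.shape)       # torch.Size([3, 224, 224])
print(batch_of_four.shape)  # torch.Size([4, 3, 224, 224])
```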

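Next we call the AlexNet model to run inference on the image.
AlexNet is a classic classification model, so I will not describe it here; readers can look up the details on their own.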

%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
import torch
from torchvision import transforms, models

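# Select the device once so the rest of the program can reuse it
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Paths to the image and the ImageNet class-label file
image_path = 'coffee.jpg'
imagenet_labels_path = 'imagenet_class_labels.txt'

# Load the image
img = Image.open(image_path)
# Display the image
# plt.imshow(img)

# Define the transform pipeline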
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        )
])

# Convert to a tensor
img_tensor = transform(img)

# Convert the tensor back to a PIL Image
image_trans = transforms.ToPILImage()(img_tensor) # float values are multiplied by 255

# plt.imshow(image_trans)
# Compare the image before and after the transform
plt.figure(figsize=(8, 8)) 
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.subplot(1, 2, 2)
plt.imshow(image_trans)
plt.tight_layout()
plt.show()

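# Build a batch holding the single image: [batch_size, channels, height, width]
batch = img_tensor.unsqueeze(0)
print(batch.shape)
# torch.Size([1, 3, 224, 224])

# Load AlexNet; pretrained=True loads the trained weights
# (newer torchvision releases replace this flag with a weights= argument)
model = models.alexnet(pretrained=True)

# eval() switches layers such as Dropout to inference behavior;
# it does not by itself turn off gradient tracking
model.eval()
model.to(device)
y = model(batch.to(device))
# print(y.shape)
# torch.Size([1, 1000])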

y_max, index = torch.max(y, 1)
# print(index, y_max)
# tensor([967], device='cuda:0') tensor([21.9117], device='cuda:0', grad_fn=<MaxBackward0>)
# Map the predicted index to its class name
with open(imagenet_labels_path) as f:
    classes = [line.strip() for line in f.readlines()]
# print(classes[967])
# 967: 'espresso'

prob = torch.nn.functional.softmax(y, dim=1)[0] * 100
print(classes[index[0]], prob[index[0]].item())
# 967: 'espresso', 86.61658477783203

# Show the top few scores
_, indices = torch.sort(y, descending=True)
print('Top scores and classes:')
for idx in indices[0][:5]:
    print(classes[idx], prob[idx].item())

[Figure: prediction]

As the figure above shows, the prediction is fairly credible (espresso).
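
The inference steps can also be written a little more compactly. The sketch below is an alternative formulation, not the book's code. It assumes a tiny randomly initialized network and placeholder class names in place of AlexNet and the ImageNet labels, so the printed probabilities are meaningless. What it shows is the pattern: eval() for inference behavior, torch.no_grad() to skip gradient tracking, softmax for probabilities, and torch.topk() in place of a full sort.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pretrained classifier: 8 features -> 5 classes.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 5),
)
classes = [f"class_{i}" for i in range(5)]  # placeholder label names

model.eval()                 # Dropout becomes a no-op in eval mode
batch = torch.rand(1, 8)     # one fake sample with a batch dimension

with torch.no_grad():        # no autograd graph is built inside this block
    logits = model(batch)

probs = torch.softmax(logits, dim=1)[0]
top_probs, top_idx = torch.topk(probs, k=3)

for p, i in zip(top_probs.tolist(), top_idx.tolist()):
    print(f"{classes[i]}: {p * 100:.2f}%")
```

With torch.no_grad() the output carries no grad_fn, unlike the tensor printed in the listing above, which saves memory when only predictions are needed.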