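Task01: Basic Graph Theory, Environment Setup, and the PyG Library
I recently joined an online team-learning program run by the open-source organization Datawhale. I had already read some graph neural network (GNN) theory and want my future research to move in that direction, so I decided to work through GNNs again with Datawhale. I will write up a summary after each task (there will probably be plenty of rough spots, so if anyone reads this, please bear with me!).
Course link provided by Datawhale: https://github.com/datawhalechina/team-learning-nlp/blob/master/GNN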
I. Basic Graph Theory
1. Graph representation
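Node and edge information can be categorical: a categorical value can only be one of a fixed set of classes. Categorical information is usually called a label.
Node and edge information can also be numeric: a numeric value ranges over the real numbers. Numeric information is usually called an attribute.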
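As a small illustration of the ideas above, a graph can be stored as an edge list plus an adjacency matrix, with a categorical label and a numeric attribute vector kept for each node. The graph, the label names, and the attribute values below are made up for illustration.

```python
# A toy undirected graph with 4 nodes; all values are illustrative.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes = 4

# Adjacency matrix: A[i][j] = 1 if nodes i and j share an edge.
A = [[0] * num_nodes for _ in range(num_nodes)]
for u, v in edges:
    A[u][v] = 1
    A[v][u] = 1  # undirected, so the matrix is symmetric

# Categorical node information (labels) and numeric node information (attributes).
node_labels = ["class_a", "class_b", "class_a", "class_b"]
node_attrs = [[0.1, 1.0], [0.3, 0.0], [0.7, 2.5], [0.9, 1.5]]

for row in A:
    print(row)
```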
2. Graph properties
2.1 Node degree
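In an undirected graph, the degree of a node is the number of edges attached to it. A minimal sketch on a made-up edge list:

```python
# Toy undirected graph (illustrative).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes = 4

degree = [0] * num_nodes
for u, v in edges:
    degree[u] += 1  # each edge contributes to both endpoints
    degree[v] += 1

print(degree)  # [2, 2, 3, 1]
```

Because every edge is counted at both of its endpoints, the degrees always sum to twice the number of edges.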
2.2 Neighbors
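The neighbors of a node are the nodes that share an edge with it. One common way to look them up quickly is an adjacency list; the sketch below builds one from an illustrative edge list.

```python
from collections import defaultdict

# Toy undirected graph (illustrative).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Map each node to the set of nodes it shares an edge with.
neighbors = defaultdict(set)
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

print(sorted(neighbors[2]))  # [0, 1, 3]
```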
2.3 Walk
2.4 Path
A path is a walk in which no node is repeated.
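The difference between a walk and a path can be checked in a few lines. In this sketch a walk is taken to be a node sequence in which every consecutive pair is joined by an edge; the helper names `is_walk` and `is_path` are hypothetical.

```python
# Toy undirected graph (illustrative).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
edge_set = {frozenset(e) for e in edges}

def is_walk(nodes):
    """Every consecutive pair of nodes must be joined by an edge."""
    return all(frozenset((a, b)) in edge_set for a, b in zip(nodes, nodes[1:]))

def is_path(nodes):
    """A path is a walk that never visits the same node twice."""
    return is_walk(nodes) and len(set(nodes)) == len(nodes)

print(is_walk([0, 1, 2, 0, 2, 3]))  # True: nodes may repeat in a walk
print(is_path([0, 1, 2, 0, 2, 3]))  # False: nodes 0 and 2 repeat
print(is_path([0, 1, 2, 3]))        # True
```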
2.5 Subgraph
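A subgraph keeps a subset of the nodes and a subset of the edges between them. The sketch below builds one particular kind, the induced subgraph, which keeps every original edge whose two endpoints both survive; the function name is hypothetical.

```python
# Toy undirected graph (illustrative).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

def induced_subgraph(edges, keep):
    """Keep only the edges whose endpoints are both in `keep`."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]

print(induced_subgraph(edges, {0, 1, 2}))  # [(0, 1), (0, 2), (1, 2)]
```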
2.6 Connected component
2.7 Connected graph
A graph that consists of a single connected component, namely the graph itself, is a connected graph.
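Connected components can be found with a breadth-first search from each unvisited node, and the graph is connected exactly when one component is found. This is a minimal sketch; the function name and the example graph are illustrative.

```python
from collections import defaultdict, deque

def connected_components(num_nodes, edges):
    """Return the connected components of an undirected graph as sorted node lists."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = set()
    components = []
    for start in range(num_nodes):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(sorted(component))
    return components

# Six nodes; node 5 has no edges at all.
components = connected_components(6, [(0, 1), (1, 2), (3, 4)])
print(components)            # [[0, 1, 2], [3, 4], [5]]
print(len(components) == 1)  # False: this graph is not connected
```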
2.8 Shortest path
2.9 Diameter
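In an unweighted graph, shortest-path lengths from a source node can be computed with breadth-first search, and the diameter of a connected graph is the largest shortest-path length over all node pairs. The sketch below assumes a connected, unweighted graph given as an adjacency dictionary; the names are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

# Toy connected graph (illustrative).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

print(bfs_distances(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}

# Diameter: the longest of all shortest paths.
diameter = max(max(bfs_distances(adj, s).values()) for s in adj)
print(diameter)  # 2
```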
2.10 Laplacian matrix
For a graph with adjacency matrix $A$ and diagonal degree matrix $D$, the Laplacian matrix is

$$L = D - A$$

The symmetric normalized Laplacian matrix is:

$$L_{sym} = D^{-1/2}(D - A)D^{-1/2} = I - D^{-1/2} A D^{-1/2}$$
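To make the two matrices concrete, here is a small pure-Python sketch that computes the Laplacian D - A and its symmetric normalized form for a toy adjacency matrix. It assumes an undirected graph with no isolated nodes, since a zero degree would cause a division by zero.

```python
import math

# Adjacency matrix of a toy undirected graph (illustrative).
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
n = len(A)

# Degree of each node = row sum of the adjacency matrix.
deg = [sum(row) for row in A]

# Laplacian: degree on the diagonal, minus the adjacency entries.
L = [[(deg[i] if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]

# Symmetric normalization: scale entry (i, j) by 1 / sqrt(deg[i] * deg[j]).
L_sym = [[L[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

for row in L:
    print(row)
```

Two quick sanity checks: every row of the Laplacian sums to zero, and every diagonal entry of the normalized version equals 1.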
3. Basic types of graphs
3.1 Directed and undirected graphs
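One practical difference shows up in the adjacency matrix: for an undirected graph it is symmetric, while for a directed graph it generally is not, and each node has separate in- and out-degrees. A minimal sketch on a made-up directed graph:

```python
# Toy directed graph: each pair is (source, target); values are illustrative.
directed_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
num_nodes = 4

A = [[0] * num_nodes for _ in range(num_nodes)]
for src, dst in directed_edges:
    A[src][dst] = 1  # only one direction is recorded

out_degree = [sum(row) for row in A]
in_degree = [sum(A[i][j] for i in range(num_nodes)) for j in range(num_nodes)]
is_symmetric = all(A[i][j] == A[j][i] for i in range(num_nodes) for j in range(num_nodes))

print(out_degree)    # [1, 1, 2, 0]
print(in_degree)     # [1, 1, 1, 1]
print(is_symmetric)  # False
```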
3.2 Unweighted and weighted graphs
3.3 Connected and disconnected graphs
3.4 Bipartite graphs
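A graph is bipartite when its nodes can be split into two groups so that every edge joins a node from one group to a node from the other. One standard way to test this is to try to 2-color the graph with breadth-first search; the sketch below uses that approach on two illustrative graphs.

```python
from collections import deque

def is_bipartite(adj):
    """Try to 2-color the graph; fail if an edge joins two same-colored nodes."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in color:
                    color[nb] = 1 - color[node]  # give the neighbor the other color
                    queue.append(nb)
                elif color[nb] == color[node]:
                    return False
    return True

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # 4-cycle
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}           # 3-cycle

print(is_bipartite(square))    # True
print(is_bipartite(triangle))  # False
```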
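3.5 Homogeneous and heterogeneous graphs
Homogeneous graph: a graph with only one type of node and one type of edge.
Heterogeneous graph: a graph with more than one type of node or more than one type of edge.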
II. Environment Setup
1. Use the nvidia-smi command to check the GPUs on the server
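2. Install the correct versions of PyTorch and cudatoolkit (the plan here was PyTorch 1.8.1 with cudatoolkit 11.1)
In VS Code, I created a new virtual environment, gnn_env_lj, to hold the correct versions of PyTorch and cudatoolkit.
To create it, run conda create -n gnn_env_lj python=3.8.5, which creates the virtual environment gnn_env_lj with Python 3.8.5 installed. Then run conda activate gnn_env_lj to activate it, and install the matching versions of PyTorch and cudatoolkit inside it.
With the command from the tutorial, conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia, several packages always failed to install.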
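I then tried installing the four failing packages with pip install and found that they would only install when I left off the version number, which is odd!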
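In the end I found the matching pip install command on the PyTorch website: pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html. Note that this installs PyTorch 1.9.0 built for CUDA 11.1, not the 1.8.1 originally planned.
PyTorch website: https://pytorch.org/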
Running this command installed PyTorch successfully.
Then I checked whether the installation was correct, and the check confirmed that it was.
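One common way to run such a check from Python is sketched below. It is not necessarily the exact check used here, and the values printed depend on the machine.

```python
import torch

# Installed PyTorch version, e.g. a string such as "1.9.0+cu111".
print(torch.__version__)

# CUDA version PyTorch was built against (None for CPU-only builds).
print(torch.version.cuda)

# True only if PyTorch can actually see a usable GPU and driver.
print(torch.cuda.is_available())
```

If the last line prints False on a GPU server, the usual suspects are a CPU-only build or a driver that is too old for the chosen CUDA version.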
3. Install the correct version of PyG
III. The Data class: how graphs are represented and used in PyG
1. Representing and using graph data in PyG: the Data class
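2. Representing and using graph datasets in PyG: the Dataset class
PyG ships with many commonly used benchmark datasets. Below, PyG's built-in Planetoid dataset serves as an example of how graph datasets are represented and used in PyG.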
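Next comes the construction and training of a simple GCN model, without using Dataset or DataLoader. The model uses simple GCN layers and is tried out on the Cora dataset.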
4. Homework