Paper Reading: Graph Convolutional Reinfor

2021-01-05 syat_e6da

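This paper introduces the DGN algorithm, which adds a graph network on top of DQN to fuse the agents' states. It targets multi-agent environments. The relation kernel is self-attention.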
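To make the "self-attention as relation kernel" idea concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the paper's implementation: it uses a single head, skips the learned query/key/value projections, and the names `relation_kernel`, `neighbors`, and `tau` are hypothetical. Each agent scores its neighbors by dot product, normalizes the scores with a softmax, and takes the weighted sum of the neighbors' features as its fused state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def relation_kernel(features, neighbors, tau=1.0):
    """Single-head dot-product attention over each agent's neighborhood.

    features:  one feature vector (list of floats) per agent
    neighbors: neighbors[i] lists the agent indices agent i attends to
               (assumed here to include i itself)
    tau:       hypothetical scaling factor applied to the attention logits
    Returns (fused_features, attention_weights).
    """
    fused, weights = [], []
    for i, nbrs in enumerate(neighbors):
        # Attention logits between agent i and each of its neighbors.
        logits = [tau * dot(features[i], features[j]) for j in nbrs]
        alpha = softmax(logits)
        # Fused state: attention-weighted sum of the neighbors' features.
        dim = len(features[i])
        out = [sum(a * features[j][d] for a, j in zip(alpha, nbrs))
               for d in range(dim)]
        fused.append(out)
        weights.append(alpha)
    return fused, weights
```

In a real network the fused features would feed further convolutional layers and finally a Q-network; stacking layers widens each agent's receptive field beyond its direct neighbors.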

[Figure: algorithm framework of the paper]

A few points the paper makes:

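Because the relationships between agents change quickly, the graph changes quickly as well, which hurts convergence. The paper therefore keeps the graph unchanged across two consecutive timesteps.
Unlike other methods with parameter-sharing, e.g., DQN, that sample experiences from individual agents, DGN samples experiences based on the graph of agents, not individual agents, and thus takes into consideration the interactions between agents. (I did not fully follow this one: how do you sample according to the graph?)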
Temporal Relation Regularization.
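
The first and third points above seem to fit together: if the graph is held fixed over two consecutive timesteps, each agent's attention weights at the current and the next state are distributions over the same neighbors, so they can be compared directly. As I read it, temporal relation regularization adds a KL-divergence penalty between those two distributions to the usual TD loss. The sketch below is a minimal illustration under that reading; the function names, the averaging over agents, and the default `lam` value are assumptions, not values from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same neighbors."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def regularized_loss(td_errors, attn_t, attn_next, lam=0.1):
    """TD loss plus a temporal relation penalty.

    td_errors: per-agent TD errors (target minus predicted Q-value)
    attn_t:    per-agent attention weights at the current state
    attn_next: per-agent attention weights at the next state, computed
               over the same neighbor lists (the graph is kept fixed)
    lam:       hypothetical regularization coefficient (placeholder value)
    """
    n = len(td_errors)
    # Mean squared TD error across agents.
    td_loss = sum(e * e for e in td_errors) / n
    # Mean KL divergence between successive attention distributions.
    kl = sum(kl_divergence(p, q) for p, q in zip(attn_t, attn_next)) / n
    return td_loss + lam * kl
```

When the attention distributions at the two timesteps are identical the penalty vanishes and only the TD loss remains; the more the attention pattern shifts between steps, the larger the added term.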

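Both this paper and the paper "Deep Reinforcement Learning with Relational Inductive Biases" combine graph networks with reinforcement learning, and both mention the concept of relational reinforcement learning. It is worth reading up on when I get the chance.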