ERL:ES-NES-CMA-ES-GA

2022-07-31  臻甄

Evolution Strategy (ES)
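
A minimal illustrative sketch (not from any of the papers below): a (1+1)-ES optimizes a black-box objective by Gaussian mutation plus greedy selection, with a crude multiplicative step-size rule in the spirit of the 1/5 success rule. The `sphere` objective and all constants are toy placeholders.

```python
import numpy as np

def sphere(x):
    # Toy black-box objective to minimize; stands in for any fitness function.
    return np.sum(x ** 2)

def one_plus_one_es(dim=10, sigma=0.5, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    parent = rng.normal(size=dim)
    f_parent = sphere(parent)
    for _ in range(iters):
        child = parent + sigma * rng.normal(size=dim)   # Gaussian mutation
        f_child = sphere(child)
        if f_child < f_parent:                          # greedy (1+1) selection
            parent, f_parent = child, f_child
            sigma *= 1.5                                # grow the step size after a success
        else:
            sigma *= 0.9                                # shrink it after a failure
    return parent, f_parent

print(one_plus_one_es()[1])
```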

Genetic Algorithm (GA)

Natural Evolution Strategy (NES)

Cross-Entropy Method (CEM)
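
A hedged sketch of the CEM loop: sample candidates from a Gaussian, keep the elite fraction, and refit the Gaussian to those elites. The diagonal-Gaussian parameterization, population size, and toy objective below are illustrative choices.

```python
import numpy as np

def cem(objective, dim, pop_size=50, elite_frac=0.2, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.ones(dim)
    n_elite = int(pop_size * elite_frac)
    for _ in range(iters):
        # 1) Sample a population from the current (diagonal) Gaussian.
        samples = mean + std * rng.normal(size=(pop_size, dim))
        scores = np.array([objective(x) for x in samples])
        # 2) Select the elite samples (here: lowest objective = best).
        elites = samples[np.argsort(scores)[:n_elite]]
        # 3) Refit mean and std to the elites (the cross-entropy update).
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mean

best = cem(lambda x: np.sum(x ** 2), dim=10)
print(np.sum(best ** 2))
```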

Background: how the covariance matrix works in a multivariate Gaussian distribution
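
A small self-contained demo of that point: off-diagonal covariance entries correlate the coordinates, so samples spread along a tilted ellipse rather than an axis-aligned one. This is exactly the degree of freedom CMA-ES adapts in order to stretch its search distribution along promising directions.

```python
import numpy as np

rng = np.random.default_rng(0)
mean = np.zeros(2)

# A covariance matrix with positive correlation between the two coordinates.
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])

# Sampling via the Cholesky factor: x = mean + L @ z with z ~ N(0, I).
L = np.linalg.cholesky(cov)
z = rng.normal(size=(5000, 2))
samples = mean + z @ L.T

# The empirical covariance of the samples should be close to `cov`.
print(np.cov(samples, rowvar=False))
```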

Covariance Matrix Adaptation Evolution Strategy (CMA-ES)


A detailed introduction to CMA-ES
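
The full CMA-ES maintains evolution paths, a rank-one plus rank-mu covariance update, and cumulative step-size adaptation. The sketch below is deliberately stripped down to the two ideas that are easiest to see in code, weighted recombination of the elite samples for the mean and a rank-mu style covariance estimate, so it should be read as an illustration of the principle rather than the actual algorithm.

```python
import numpy as np

def simplified_cma_like_es(objective, dim, pop_size=20, n_elite=10,
                           sigma=0.5, iters=200, seed=0):
    # Heavily simplified: no evolution paths, no rank-one update,
    # no step-size adaptation; only mean recombination + rank-mu covariance.
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    cov = np.eye(dim)
    # Log-decreasing recombination weights over the elite samples.
    w = np.log(n_elite + 0.5) - np.log(np.arange(1, n_elite + 1))
    w /= w.sum()
    c_mu = 0.3                                           # covariance learning rate
    for _ in range(iters):
        L = np.linalg.cholesky(cov)                      # sample steps y ~ N(0, C)
        steps = rng.normal(size=(pop_size, dim)) @ L.T
        xs = mean + sigma * steps
        order = np.argsort([objective(x) for x in xs])[:n_elite]
        mean = mean + sigma * (w @ steps[order])         # weighted recombination
        rank_mu = sum(wi * np.outer(y, y) for wi, y in zip(w, steps[order]))
        cov = (1 - c_mu) * cov + c_mu * rank_mu          # rank-mu style update
    return mean

print(np.sum(simplified_cma_like_es(lambda x: np.sum(x ** 2), dim=5) ** 2))
```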

ES+RL

Directly optimizing the policy network with ES

Evolution Strategies as a Scalable Alternative to Reinforcement Learning (OpenAI, arXiv 2017)

Motivation:

Method:

Result:
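
The core update in the paper is the score-function / NES-style estimator θ ← θ + α · 1/(nσ) · Σᵢ Fᵢ εᵢ, where εᵢ ~ N(0, I) are parameter perturbations and Fᵢ the returns of the perturbed policies; the paper additionally uses antithetic (mirrored) sampling, rank-based fitness shaping, and a seed-sharing trick so that workers only exchange scalar returns. The single-process sketch below uses a toy return function in place of environment rollouts.

```python
import numpy as np

def es_gradient_ascent(fitness, dim, npop=50, sigma=0.1, alpha=0.02,
                       iters=300, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)                       # policy parameters
    for _ in range(iters):
        eps = rng.normal(size=(npop, dim))      # one perturbation per worker
        returns = np.array([fitness(theta + sigma * e) for e in eps])
        # Normalize returns (the paper uses rank-based fitness shaping).
        returns = (returns - returns.mean()) / (returns.std() + 1e-8)
        # Core ES estimator: theta += alpha / (npop * sigma) * sum_i F_i * eps_i
        theta += alpha / (npop * sigma) * eps.T @ returns
    return theta

# Toy stand-in for an episodic return: higher is better, maximum at theta = 3.
toy_return = lambda th: -np.sum((th - 3.0) ** 2)
print(toy_return(es_gradient_ascent(toy_return, dim=5)))
```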

Directly optimizing with GA

Deep Neuroevolution: Genetic Algorithms are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning (Uber AI, arXiv 2017)

Motivation:

Method:

Result:
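
A sketch of the GA described in the paper: the genome is the policy network's parameter vector, the only variation operator is additive Gaussian noise (no crossover), selection is by truncation, and the single best individual is carried over unchanged (elitism). The paper also encodes each individual compactly as a list of mutation seeds; that trick is omitted here, and the toy return stands in for episode rollouts.

```python
import numpy as np

def simple_ga(fitness, dim, pop_size=100, n_parents=20, sigma=0.1,
              generations=200, seed=0):
    rng = np.random.default_rng(seed)
    pop = [rng.normal(size=dim) for _ in range(pop_size)]
    elite = pop[0]
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]                   # maximize fitness
        parents = [pop[i] for i in order[:n_parents]]      # truncation selection
        elite = parents[0]                                 # elitism: best survives unchanged
        # Offspring are mutated copies of uniformly chosen parents (no crossover).
        pop = [elite] + [parents[rng.integers(n_parents)] + sigma * rng.normal(size=dim)
                         for _ in range(pop_size - 1)]
    return elite

toy_return = lambda th: -np.sum((th - 1.0) ** 2)           # stand-in for an episode return
print(toy_return(simple_ga(toy_return, dim=10)))
```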

NS-ES

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents (NeurIPS 2018)

Motivation:

Method: includes three variants (NS-ES, NSR-ES, NSRA-ES)

Result:
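
All three variants rely on a novelty score: the behavior characterization of a policy (for example its final position in a maze) is compared against an archive of past behaviors, and novelty is the mean distance to the k nearest neighbors in that archive. NS-ES plugs this novelty into the ES update in place of the return, NSR-ES averages return and novelty, and NSRA-ES adapts the weighting between them during training. A small sketch of the novelty computation (the behavior characterization itself is domain-specific):

```python
import numpy as np

def novelty(behavior, archive, k=10):
    """Mean distance from `behavior` to its k nearest neighbors in the archive.

    `behavior` is a behavior characterization vector (e.g. the agent's final
    (x, y) position); `archive` is an array of past behaviors.
    """
    archive = np.asarray(archive)
    dists = np.linalg.norm(archive - behavior, axis=1)
    k = min(k, len(dists))
    return np.sort(dists)[:k].mean()

archive = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(novelty(np.array([5.0, 5.0]), archive, k=2))   # far from the archive -> high novelty
print(novelty(np.array([0.1, 0.1]), archive, k=2))   # close to the archive -> low novelty
```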

ERL

Evolution-Guided Policy Gradient in Reinforcement Learning (NeurIPS 2018)

Motivation:

Method:

Result:
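
ERL runs a GA population of actors alongside an off-policy RL learner (DDPG in the paper): the population's rollouts fill a shared replay buffer, the RL agent trains on that buffer, and periodically the RL actor is copied into the population in place of the weakest individual. The sketch below only shows this control flow on a toy problem: `toy_return` replaces real rollouts, `toy_grad` replaces the critic-based DDPG update, and the real ERL uses tournament selection, crossover, and mutation rather than the mutation-only step here.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
toy_return = lambda th: -np.sum((th - 2.0) ** 2)   # stand-in for an episode return
toy_grad = lambda th: -2.0 * (th - 2.0)            # stand-in for the DDPG policy gradient

def erl_sketch(pop_size=10, generations=200, sigma=0.1, sync_period=10, lr=0.05):
    population = [rng.normal(size=DIM) for _ in range(pop_size)]
    rl_actor = rng.normal(size=DIM)
    for gen in range(generations):
        # 1) Evaluate the population; in ERL these rollouts also fill a shared replay buffer.
        fitness = np.array([toy_return(ind) for ind in population])
        order = np.argsort(fitness)[::-1]
        # 2) Evolve: keep the best half, refill with mutated copies of random parents.
        parents = [population[i] for i in order[:pop_size // 2]]
        children = [parents[rng.integers(len(parents))] + sigma * rng.normal(size=DIM)
                    for _ in range(pop_size - len(parents))]
        population = [p.copy() for p in parents] + children
        # 3) The RL agent trains off-policy on the shared buffer
        #    (replaced here by a direct gradient step on the toy return).
        rl_actor = rl_actor + lr * toy_grad(rl_actor)
        # 4) Periodically inject the RL actor into the population, replacing the weakest.
        if gen % sync_period == 0:
            worst = int(np.argmin([toy_return(ind) for ind in population]))
            population[worst] = rl_actor.copy()
    return max(population + [rl_actor], key=toy_return)

print(toy_return(erl_sketch()))
```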

GEP-PG

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms (ICML 2018)

Motivation:

Method:
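
GEP-PG runs in two phases: first a Goal Exploration Process (GEP) generates diverse behaviors and stores all of its transitions, then DDPG is trained off-policy starting from that pre-filled replay buffer. The sketch below only shows a goal-babbling loop for phase one with toy placeholders (`toy_rollout`, the 2-D outcome space, and the perturbation size are all made up); phase two, training DDPG on `replay_buffer`, is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 6

def toy_rollout(theta):
    """Stand-in for an environment rollout: returns (outcome, transitions).
    The outcome is a low-dimensional behavior descriptor of the episode."""
    outcome = np.tanh(theta[:2])          # pretend the first two params set the behavior
    return outcome, [("state", "action", "reward", "next_state")]

def gep_fill_buffer(n_episodes=200, sigma=0.2):
    archive, replay_buffer = [], []
    for _ in range(n_episodes):
        if not archive:
            theta = rng.normal(size=DIM)                       # bootstrap with random policies
        else:
            goal = rng.uniform(-1, 1, size=2)                  # sample a goal in outcome space
            dists = [np.linalg.norm(o - goal) for _, o in archive]
            theta = archive[int(np.argmin(dists))][0].copy()   # nearest-outcome policy
            theta += sigma * rng.normal(size=DIM)              # perturb it
        outcome, transitions = toy_rollout(theta)
        archive.append((theta, outcome))
        replay_buffer.extend(transitions)
    return replay_buffer

buffer = gep_fill_buffer()
print(len(buffer))   # phase 2 (not shown): train DDPG off-policy from this buffer
```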

CEM-RL

Combining evolutionary and gradient-based methods for policy search (ICLR 2019)

Motivation:

Method:

Result:
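
CEM-RL keeps a CEM search distribution over actor parameters and a TD3 critic trained on the transitions of all evaluated actors; at each iteration half of the sampled actors are first improved by gradient steps from that critic, then the whole population is evaluated and the distribution is refit to the elites. In the toy sketch below, `toy_grad` stands in for the critic's policy gradient, `toy_return` for rollouts, and only the mean of the distribution is updated (the paper also adapts the covariance).

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
toy_return = lambda th: -np.sum((th - 1.5) ** 2)   # stand-in for an episode return
toy_grad = lambda th: -2.0 * (th - 1.5)            # stand-in for the TD3 critic's policy gradient

def cem_rl_sketch(pop_size=10, elite_frac=0.5, sigma=0.3, iters=100, lr=0.05, grad_steps=5):
    mean = rng.normal(size=DIM)
    n_elite = int(pop_size * elite_frac)
    for _ in range(iters):
        # 1) CEM sampling of a population of actors around the current mean.
        actors = [mean + sigma * rng.normal(size=DIM) for _ in range(pop_size)]
        # 2) Half of the population is improved by gradient steps from the shared critic
        #    (here the critic is replaced by the toy analytic gradient).
        for i in range(pop_size // 2):
            for _ in range(grad_steps):
                actors[i] = actors[i] + lr * toy_grad(actors[i])
        # 3) Evaluate everyone; their rollouts would also fill the critic's replay buffer.
        scores = np.array([toy_return(a) for a in actors])
        elites = [actors[i] for i in np.argsort(scores)[::-1][:n_elite]]
        # 4) Standard CEM update of the search distribution from the elites.
        mean = np.mean(elites, axis=0)
    return mean

print(toy_return(cem_rl_sketch()))
```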

CERL

Collaborative Evolutionary Reinforcement Learning (ICML 2019)

Motivation:

Method:

Result:
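
CERL runs a portfolio of gradient-based learners with different hyperparameters (e.g. discount factors) alongside a neuroevolution population, all sharing one replay buffer; rollout workers are allocated among the learners with a UCB-style bandit over their recent returns, and learners are periodically copied into the population. The sketch below shows only a generic UCB allocator of that kind; the exact scoring formula and constants in the paper may differ.

```python
import numpy as np

def ucb_allocate(values, counts, c=1.0):
    """Pick which learner gets the next rollout worker via a UCB score.

    `values[i]` is learner i's running average return, `counts[i]` how many
    rollouts it has been allocated so far.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum() + 1e-8
    scores = np.asarray(values) + c * np.sqrt(np.log(total + 1.0) / (counts + 1e-8))
    return int(np.argmax(scores))

# Toy usage: three learners (e.g. TD3 with different discount factors).
values, counts = [0.0, 0.0, 0.0], [0, 0, 0]
true_means = [1.0, 2.0, 0.5]                      # hidden quality of each learner
rng = np.random.default_rng(0)
for _ in range(200):
    i = ucb_allocate(values, counts)
    ret = true_means[i] + rng.normal()            # stand-in for an episode return
    counts[i] += 1
    values[i] += (ret - values[i]) / counts[i]    # running average return
print(counts)   # most rollouts should go to the best learner
```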

PDERL

Proximal Distilled Evolutionary Reinforcement Learning (AAAI 2020)

Motivation:

Method:

Result:
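
PDERL replaces ERL's naive genetic operators with two safer ones: a distillation-based crossover (the child is trained to imitate the better of its parents on critic-filtered experiences) and a proximal mutation, which scales the Gaussian parameter noise by how sensitive the policy's actions are to each parameter, estimated from gradients over a batch of states from the individual's buffer. The sketch below illustrates only a sensitivity-scaled mutation of that flavor, on a linear policy where the relevant gradient is just the input feature; the deep-network version in the paper computes it by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 4, 2

def proximal_mutation_sketch(W, states, sigma=0.1, eps=1e-3):
    """Sensitivity-scaled mutation of a linear policy a = W @ s.

    For weight W[i, j], the gradient of action a[i] w.r.t. that weight is the
    input feature s[j], so the RMS of feature j over a batch of states serves
    as the weight's sensitivity, and its mutation is shrunk accordingly.
    """
    sensitivity = np.sqrt(np.mean(states ** 2, axis=0))        # shape: (STATE_DIM,)
    noise = sigma * rng.normal(size=W.shape)
    return W + noise / (sensitivity[None, :] + eps)            # high sensitivity -> small change

W = rng.normal(size=(ACTION_DIM, STATE_DIM))
states = rng.normal(size=(64, STATE_DIM)) * np.array([1.0, 10.0, 0.1, 1.0])
W_child = proximal_mutation_sketch(W, states)
# Weights that see large-magnitude features (column 1) are perturbed the least,
# so the child's behavior stays close ("proximal") to the parent's.
print(np.abs(W_child - W).mean(axis=0))
```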


SUPE-RL

Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning (ICLR 2021)

Motivation:

Method:

Result:
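
A rough reading of the genetic soft update: at regular intervals the current RL policy spawns a small population of mutated copies, everyone is evaluated, and if some mutant outperforms the agent, the agent's parameters are softly (Polyak-style) moved toward that individual, while normal gradient-based RL training continues in between. The schedule, mutation operator, and update details below are simplified placeholders rather than the exact procedure from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_return = lambda th: -np.sum((th - 1.0) ** 2)   # stand-in for an evaluation rollout

def genetic_soft_update(theta, pop_size=10, sigma=0.2, tau=0.5):
    """One periodic evolutionary step around the current RL policy `theta`:
    evaluate mutated copies and softly move `theta` toward the best one."""
    candidates = [theta + sigma * rng.normal(size=theta.shape) for _ in range(pop_size)]
    best = max(candidates, key=toy_return)
    if toy_return(best) > toy_return(theta):
        theta = (1.0 - tau) * theta + tau * best    # soft (Polyak-style) update
    return theta

theta = rng.normal(size=6)
for step in range(500):
    # ... normal gradient-based RL updates of `theta` would happen here ...
    if step % 10 == 0:                              # periodic genetic soft update
        theta = genetic_soft_update(theta)
print(toy_return(theta))
```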

Brief summary

Advantages of ES:
1) Strong robustness
2) Insensitive to sparse and delayed rewards
3) Highly parallelizable, with low communication overhead
4) Can incorporate existing non-differentiable modules

Disadvantages of ES:
1) Optimization is difficult in high-dimensional parameter spaces
2) Low sample efficiency

Disadvantages of RL:
1) Sensitive to hyperparameter settings
2) Sensitive to sparse and delayed rewards
3) Insufficient exploration

Advantages of RL:
1) Off-policy methods have high sample efficiency
