Reinforcement Learning Daily Update

2018-07-03  松山剑客
  1. Learning Multi-Step Robotic Tasks from Observation [1]
    Paper link
    Due to burdensome data requirements, learning from demonstration often falls short of its promise to allow users to quickly and naturally program robots. Demonstrations are inherently ambiguous and incomplete, making a correct generalization to unseen situations difficult without a large number of demonstrations in varying conditions. By contrast, humans are often able to learn complex tasks from a single demonstration (typically observations without action labels) by leveraging context learned over a lifetime. Inspired by this capability, we aim to enable robots to perform one-shot learning of multi-step tasks from observation by leveraging auxiliary video data as context. Our primary contribution is a novel action localization algorithm that identifies clips of activities in auxiliary videos that match the activities in a user-segmented demonstration, providing additional examples of each. While this auxiliary video data could be used in multiple ways for learning, we focus on an inverse reinforcement learning setting. We empirically show that across several tasks, robots can learn multi-step tasks more effectively from videos with localized actions, compared to unsegmented videos.
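To make the matching idea concrete, here is a minimal sketch of one plausible form of action localization: slide a fixed-length window over the unsegmented auxiliary video and score each candidate clip against a demonstration segment by cosine similarity of mean-pooled frame embeddings. The embeddings, window length, and function names are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def localize_actions(demo_segments, aux_video, window, stride=1):
    """For each user-segmented demonstration activity, find the
    best-matching clip in an unsegmented auxiliary video.

    demo_segments: list of (T_i, D) arrays of frame embeddings,
                   one array per demonstrated activity (assumed precomputed)
    aux_video:     (T, D) array of frame embeddings for the auxiliary video
    window:        candidate clip length in frames (a free parameter here)
    Returns a list of (start, end, score) tuples, one per segment.
    """
    def clip_embedding(frames):
        # Mean-pool frame features into a single unit-norm clip descriptor.
        v = frames.mean(axis=0)
        return v / (np.linalg.norm(v) + 1e-8)

    matches = []
    for seg in demo_segments:
        target = clip_embedding(seg)
        best = (0, window, -np.inf)
        for start in range(0, len(aux_video) - window + 1, stride):
            cand = clip_embedding(aux_video[start:start + window])
            score = float(target @ cand)  # cosine similarity (unit vectors)
            if score > best[2]:
                best = (start, start + window, score)
        matches.append(best)
    return matches
```

Mean pooling is chosen here only for simplicity; any clip-level video representation could be swapped in, and the returned (start, end) spans are what supply the "additional examples of each" activity.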
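Once clips are localized, they can serve as extra demonstrations for the inverse RL stage. As a rough illustration of that stage, the sketch below uses linear-reward feature-expectation matching in the spirit of apprenticeship learning; the abstract does not specify this exact formulation, and rollout_fn is a hypothetical hook standing in for the policy-optimization inner loop.

```python
import numpy as np

def irl_feature_matching(demo_feats, rollout_fn, n_iters=50, lr=0.1):
    """Minimal linear-reward IRL sketch (not the paper's algorithm).

    demo_feats: (N, D) state features pooled from the localized clips
    rollout_fn: hypothetical hook; given weights w it returns (M, D)
                features visited by a policy trained under r(s) = w @ phi(s)
    Returns the learned reward weights w.
    """
    mu_demo = demo_feats.mean(axis=0)       # demo feature expectations
    w = np.zeros_like(mu_demo)
    for _ in range(n_iters):
        mu_pi = rollout_fn(w).mean(axis=0)  # policy feature expectations
        w += lr * (mu_demo - mu_pi)         # push reward toward the demos
    return w
```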

  1. Author: 爱可可_爱生活. Link: https://www.jianshu.com/p/5e3f77712422. Source: 简书 (Jianshu).
