Reinforcement Learning for Bimanual Throwing
Written on October 7th, 2021 by Geoffrey Clark

I applied reinforcement learning to the task of bimanual manipulation of thrown objects. Using sparse latent space policy search, I learned complex bimanual policies capable of picking up and throwing soft balls into hoops ranging from 0.5 m to 2.5 m from the robot. Bimanual control policies are difficult to learn with reinforcement learning because if even one of the 12 degrees of freedom is out of sync with the others, there is little to no reward. We overcame this problem by projecting the robot's control actions (dynamic movement primitives) into a latent space that can be optimized efficiently even with sparse rewards. While competing solutions required thousands of trials to learn a useful policy, the method I applied achieved success in fewer than 300 trials, taking only an hour. All of the learning was done on the physical robot, while a Microsoft Kinect tracked the ball in real time to determine whether each throw was a success.
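To give a rough sense of the idea, here is a minimal sketch of searching a low-dimensional latent space that decodes into full DMP weights, using a simple cross-entropy search with a sparse 0/1 reward. Everything here is a placeholder: the dimensions, the random linear decoder, the `execute_throw` stub, and the cross-entropy optimizer are illustrative assumptions, not the actual sparse latent space policy search or the real robot pipeline used in this project.

```python
import numpy as np

# Dimensions are illustrative assumptions, not the project's exact values.
N_DOF = 12           # bimanual arm joints
N_BASIS = 20         # DMP basis functions per joint
FULL_DIM = N_DOF * N_BASIS
LATENT_DIM = 5       # low-dimensional latent space to search in

rng = np.random.default_rng(0)

# Fixed linear decoder from latent space to the full DMP weight space.
# In practice such a mapping would be learned; here it is a random
# projection purely for illustration.
decoder = rng.standard_normal((FULL_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)

def latent_to_dmp_weights(z):
    """Decode a latent vector into per-joint DMP weight vectors."""
    return (decoder @ z).reshape(N_DOF, N_BASIS)

def execute_throw(dmp_weights):
    """Stand-in for rolling out the DMPs on the robot and observing the
    sparse reward (1 if the ball lands in the hoop, 0 otherwise).
    The real system executed the motion and used Kinect tracking."""
    score = np.tanh(dmp_weights.sum())      # placeholder success criterion
    return float(score > 0.95)              # sparse 0/1 reward

def cross_entropy_search(n_iters=30, pop_size=10, n_elite=3):
    """Episodic search over the latent space: sample candidates,
    keep the highest-reward ones, and refit the sampling distribution."""
    mean = np.zeros(LATENT_DIM)
    std = np.ones(LATENT_DIM)
    for _ in range(n_iters):
        samples = rng.normal(mean, std, size=(pop_size, LATENT_DIM))
        rewards = np.array(
            [execute_throw(latent_to_dmp_weights(z)) for z in samples]
        )
        elite = samples[np.argsort(rewards)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean

best_z = cross_entropy_search()
best_weights = latent_to_dmp_weights(best_z)
```

The point of the sketch is the dimensionality argument: instead of optimizing all 240 DMP weights directly, the search happens over a handful of latent parameters, which is why a sparse success/failure signal over a few hundred rollouts can be enough.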