Exploiting symmetries in reinforcement learning of bimanual robotic tasks
Movement primitives (MPs) have been widely adopted for representing and learning robotic movements with reinforcement learning policy search. Probabilistic movement primitives (ProMPs) are a kind of MP based on a stochastic representation over sets of trajectories, which captures the variability allowed while executing a movement. This approach has proved effective in learning a wide range of robotic movements, but it requires dealing with a high-dimensional parameter space. This can be a critical problem when learning tasks with two robotic manipulators, so this work proposes an approach that exploits symmetry to reduce the dimension of the parameter space. A symmetrization method for ProMPs is presented and used to represent two movements with a single ProMP for the first arm and a symmetry surface that maps that ProMP to the second arm. This symmetric representation is then adopted in reinforcement learning of bimanual tasks from user-provided demonstrations, using the relative entropy policy search (REPS) algorithm. The symmetry-based approach has been tested in a cloth manipulation experiment, where it sped up learning of the task.

This work was partially developed in the context of the project CLOTHILDE ("CLOTH manIpulation Learning from DEmonstrations"), which has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Advanced Grant agreement No 741930). This work is supported by the Spanish State Research Agency through the María de Maeztu Seal of Excellence to IRI (MDM-2016-0656).
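
The symmetric representation summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the symmetry surface is a plane, that only the mean weights of the ProMP are mapped (the covariance is ignored), and that the basis functions are normalized Gaussians. The names `rbf_features`, `reflect_across_plane`, `plane_normal`, and `plane_offset` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_features(phases, n_basis=8, width=0.02):
    """Normalized Gaussian basis functions; each row sums to 1."""
    centers = np.linspace(0.0, 1.0, n_basis)
    phi = np.exp(-((phases[:, None] - centers[None, :]) ** 2) / (2.0 * width))
    return phi / phi.sum(axis=1, keepdims=True)

def reflect_across_plane(points, normal, offset):
    """Mirror 3D points across the plane {x : normal . x = offset}."""
    normal = np.asarray(normal, dtype=float)
    scale = np.linalg.norm(normal)
    unit = normal / scale
    # Signed distance of every point to the plane.
    signed_dist = points @ unit - offset / scale
    return points - 2.0 * signed_dist[:, None] * unit[None, :]

rng = np.random.default_rng(0)
phases = np.linspace(0.0, 1.0, 50)
phi = rbf_features(phases)

# Mean weights of the first arm's primitive: one 3D weight per basis function.
w_first = rng.normal(size=(8, 3))

# Assumed symmetry plane y = 0 (hypothetical choice for this sketch).
plane_normal = np.array([0.0, 1.0, 0.0])
plane_offset = 0.0

# The second arm's weights are not free parameters: they are the mirror image.
w_second = reflect_across_plane(w_first, plane_normal, plane_offset)

traj_first = phi @ w_first    # Cartesian mean trajectory, first arm
traj_second = phi @ w_second  # Cartesian mean trajectory, second arm
```

Because the basis rows sum to one and the reflection is an affine map, mirroring the weights gives the same result as mirroring the trajectory itself. In this sketch a policy search would adjust the 24 entries of `w_first` plus the plane parameters, rather than 48 independent weights for the two arms.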