Videos

Learning to walk efficiently with passive compliance

The compliant humanoid robot COMAN learns to minimize the energy consumption during walking. This is the video I presented together with my paper at IROS 2011 in San Francisco, in September 2011.


We present a learning-based approach for minimizing the electric energy consumption of a passively compliant bipedal robot during walking. The energy consumption is reduced by learning a varying-height center-of-mass trajectory that makes efficient use of the robot’s passive compliance. To do this, we propose a reinforcement learning method that evolves the policy parameterization dynamically during the learning process, and thus finds better policies faster than a fixed parameterization would. The method is first tested on a function approximation task, and then applied to the humanoid robot COMAN, where it achieves a significant energy reduction.
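To make the idea of an evolving parameterization concrete, here is a minimal sketch, not the implementation from the paper. It assumes the center-of-mass height trajectory is described by a few evenly spaced knots with piecewise-linear interpolation (a simplification chosen for this example), and shows how knots can be added without changing the trajectory the policy currently represents. The function names and the knot values are hypothetical.

```python
import numpy as np

def trajectory(knots, t):
    """Evaluate a piecewise-linear trajectory defined by evenly spaced knots on [0, 1]."""
    knot_times = np.linspace(0.0, 1.0, len(knots))
    return np.interp(t, knot_times, knots)

def refine(knots):
    """Grow the parameterization from K to 2K-1 knots by inserting midpoints.

    The new knots are sampled from the current trajectory, so the policy
    keeps producing the same trajectory right after the refinement.
    """
    new_times = np.linspace(0.0, 1.0, 2 * len(knots) - 1)
    return trajectory(knots, new_times)

# Hypothetical CoM heights in meters (illustrative values only).
coarse = np.array([0.50, 0.47, 0.52, 0.50])
fine = refine(coarse)

t = np.linspace(0.0, 1.0, 101)
same = np.allclose(trajectory(coarse, t), trajectory(fine, t))
```

After the refinement the learner has more parameters to explore, but starts from exactly the policy it had already found, so earlier progress is not lost.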

Link to publication:
Kormushev, P., Ugurlu, B., Calinon, S., Tsagarakis, N., and Caldwell, D.G., “Bipedal Walking Energy Minimization by Reinforcement Learning with Evolving Policy Parameterization”, IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS-2011), San Francisco, 2011. [pdf] [bibtex]


Humanoid robot learns to clean a whiteboard

A Japanese humanoid robot (Fujitsu HOAP-2) learns to clean a whiteboard by upper-body kinesthetic teaching during full-body balance control. The research is from an Italian-Japanese collaboration between the Italian Institute of Technology and Tokyo City University.

We present an integrated approach allowing a free-standing humanoid robot to acquire new motor skills by kinesthetic teaching. The proposed full-body control method simultaneously controls the upper and lower body of the robot with different control strategies. Imitation learning is used for training the upper body of the humanoid robot via kinesthetic teaching, while at the same time the Reaction Null Space method is used for keeping the robot balanced. During demonstration, a force/torque sensor records the exerted forces. During reproduction, a hybrid position/force controller applies the learned trajectories, in terms of both positions and forces, to the end effector. The proposed method is tested on a 25-DOF Fujitsu HOAP-2 humanoid robot with a surface cleaning task.
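For readers unfamiliar with hybrid position/force control, the sketch below shows the textbook selection-matrix form of the idea: some Cartesian axes track a position, the others track a force. It is an illustration only and is not claimed to be the controller used on the HOAP-2. The function name, the gains, and the numbers in the example are assumptions.

```python
import numpy as np

def hybrid_command(x, x_des, f, f_des, force_axes, kp=50.0, kf=0.02):
    """Combine position and force errors into one Cartesian command.

    Axes listed in force_axes are force-controlled; all others are
    position-controlled. Gains kp and kf are illustrative values.
    """
    select_pos = np.ones(3)
    select_pos[list(force_axes)] = 0.0          # 1 = position axis, 0 = force axis
    pos_term = kp * (np.asarray(x_des) - np.asarray(x))
    force_term = kf * (np.asarray(f_des) - np.asarray(f))
    return select_pos * pos_term + (1.0 - select_pos) * force_term

# Hypothetical wiping example: follow the stroke in x/y, press along z.
cmd = hybrid_command(
    x=[0.30, 0.10, 0.00], x_des=[0.32, 0.10, 0.00],
    f=[0.0, 0.0, -2.0], f_des=[0.0, 0.0, -5.0],
    force_axes=(2,),
)
```

In a cleaning task this split is natural: the learned position profile drives the motion along the board, while the learned force profile sets how hard the hand presses against it.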

This research will be presented at the International Conference on Robotics and Automation (ICRA) in May 2011 in Shanghai, China.

Authors:
Dr. Petar Kormushev (IIT)
Prof. Dragomir N. Nenchev (TCU)
Dr. Sylvain Calinon (IIT)
Prof. Darwin G. Caldwell (IIT)

Affiliations:
IIT – Italian Institute of Technology, Advanced Robotics Dept.
TCU – Tokyo City University, Mechanical Systems Engineering Dept.

Photos (high-res):
http://www.flickr.com/photos/petar_kormushev/sets/

Link to publication:
Kormushev, P., Nenchev, D.N., Calinon, S., and Caldwell, D.G., “Upper-body Kinesthetic Teaching of a Free-standing Humanoid Robot”, IEEE Intl. Conf. on Robotics and Automation (ICRA 2011), 2011. [pdf] [bibtex]


Robot Archer iCub

Humanoid robot iCub learns the skill of archery. After being instructed how to hold the bow and release the arrow, the robot learns by itself to aim and shoot arrows at the target. It learns to hit the center of the target in only 8 trials.

The learning algorithm, called ARCHER (Augmented Reward Chained Regression), was developed and optimized specifically for problems like archery training, which have a smooth solution space and prior knowledge about the goal to be achieved. In the case of archery, we know that hitting the center corresponds to the maximum reward we can get. Using this prior information about the task, we can treat the position of the arrow’s tip as an augmented reward. ARCHER uses a chained local regression process that iteratively estimates new policy parameters which, based on the experience so far, have a greater probability of achieving the goal of the task. An advantage of ARCHER over other learning algorithms is that it makes use of richer feedback information about the result of a rollout.
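The following is a simplified sketch in the spirit of the description above, not the published ARCHER implementation. It treats the 2D offset of the arrow from the target center as the augmented reward, takes the three best rollouts so far, and uses a local linear regression to estimate parameters whose predicted offset is zero. The simulator `simulate_shot`, the two-parameter policy, and the starting points are all hypothetical.

```python
import numpy as np

def simulate_shot(theta):
    """Hypothetical smooth map from policy parameters to the arrow's
    offset from the target center (stands in for a real rollout)."""
    A = np.array([[1.0, 0.3], [-0.2, 0.8]])
    return A @ (theta - np.array([0.5, -0.3])) + 0.05 * np.sin(theta)

def archer_like_update(thetas, hits, alpha=1.0):
    """Estimate new parameters from the three rollouts closest to the center."""
    order = np.argsort([np.linalg.norm(h) for h in hits])[:3]
    t1, t2, t3 = (thetas[i] for i in order)
    h1, h2, h3 = (hits[i] for i in order)
    d_hits = np.column_stack([h2 - h1, h3 - h1])
    d_thetas = np.column_stack([t2 - t1, t3 - t1])
    # Weights that would move the best hit onto the center (offset zero).
    w, *_ = np.linalg.lstsq(d_hits, -h1, rcond=None)
    return t1 + alpha * (d_thetas @ w)

def learn(n_iters=8):
    thetas = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    hits = [simulate_shot(t) for t in thetas]
    for _ in range(n_iters):
        theta_new = archer_like_update(thetas, hits)
        thetas.append(theta_new)
        hits.append(simulate_shot(theta_new))
    return thetas, hits

thetas, hits = learn()
distances = [float(np.linalg.norm(h)) for h in hits]
```

The point of the sketch is the richer feedback: each rollout contributes a full offset vector rather than a single scalar score, so a handful of shots is enough to fit a useful local model.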

For the archery training, the ARCHER algorithm is used to modulate and coordinate the motion of the two hands, while an inverse kinematics controller is used for the motion of the arms. After every rollout, the image processing module automatically detects where the arrow hit the target, and this position is sent as feedback to the ARCHER algorithm. The image recognition is based on Gaussian Mixture Models for color-based detection of the target and the arrow’s tip.

The experiments are performed on a 53-DOF humanoid robot iCub. The distance between the robot and the target is 3.5 m, and the height of the robot is 104 cm.
This research will be presented at the Humanoids 2010 conference in December 2010 in the USA.

Authors:
Dr. Petar Kormushev
Dr. Sylvain Calinon
Dr. Ryo Saegusa
Prof. Giorgio Metta

Photos of Robot Archer iCub:
http://www.flickr.com/photos/petar_kormushev/sets/

NEW! High-resolution photos here: http://bit.ly/boCmVi

Link to publication:
Kormushev, P., Calinon, S., Saegusa, R. and Metta, G., “Learning the skill of archery by a humanoid robot iCub”, Proc. IEEE Intl Conf. on Humanoid Robots (Humanoids-2010), pp. 417-423, 2010.


Robot learns to flip pancakes

Teaching a Barrett WAM robot to flip pancakes:

The video shows a 7-DOF Barrett WAM manipulator learning to flip pancakes by reinforcement learning.

The motion is encoded in a mixture of basis force fields through an extension of Dynamic Movement Primitives (DMP) that represents the synergies across the different variables through stiffness matrices. An Inverse Dynamics controller with variable stiffness is used for reproduction.
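As a rough illustration of a motion built from a mixture of basis force fields, the sketch below blends several spring-like attractors, each with its own stiffness matrix, using time-dependent Gaussian weights. The exact form (normalized Gaussian weights, a single scalar damping term, simple Euler integration) is an assumption made for this example and is not taken from the paper. All names and values are hypothetical.

```python
import numpy as np

def basis_weights(t, centers, width):
    """Normalized Gaussian activations of the basis force fields at time t."""
    act = np.exp(-0.5 * ((t - centers) / width) ** 2)
    return act / act.sum()

def rollout(x0, attractors, stiffness, damping=20.0, duration=1.0, dt=0.001):
    """Integrate a point mass pulled by a time-varying blend of spring fields.

    attractors: (N, D) attractor positions; stiffness: (N, D, D) matrices.
    Off-diagonal stiffness entries couple the motion variables.
    """
    attractors = np.asarray(attractors, dtype=float)
    stiffness = np.asarray(stiffness, dtype=float)
    centers = np.linspace(0.0, duration, len(attractors))
    width = duration / len(attractors)
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    path = [x.copy()]
    for step in range(int(round(duration / dt))):
        h = basis_weights(step * dt, centers, width)
        pull = sum(h_i * (K_i @ (mu_i - x))
                   for h_i, K_i, mu_i in zip(h, stiffness, attractors))
        acc = pull - damping * v
        v = v + dt * acc
        x = x + dt * v
        path.append(x.copy())
    return np.array(path)

# Hypothetical 2D example: a stiff first phase, then a compliant one.
attractors = [[0.0, 0.3], [0.0, 0.0]]
stiffness = [np.diag([400.0, 400.0]), np.diag([50.0, 50.0])]
path = rollout([0.0, 0.0], attractors, stiffness)
```

Because each field carries a full stiffness matrix rather than a single gain, the same representation can express both how strongly and in which coupled directions the hand is pulled at each phase of the motion.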

The skill is first demonstrated via kinesthetic teaching, and then refined by the Policy learning by Weighting Exploration with the Returns (PoWER) algorithm. After 50 trials, the robot learns that the first part of the task requires stiff behavior to throw the pancake in the air, while the second part requires the hand to be compliant in order to catch the pancake without it bouncing off the pan.
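The core of a PoWER-style update can be sketched in a few lines: the policy parameters move toward the parameters of the best rollouts seen so far, each weighted by its return. The version below is a minimal sketch under stated assumptions, namely non-negative returns, Gaussian exploration noise in parameter space, and a toy return function standing in for a real pancake-flipping trial. The function names, the noise level, and the toy target are hypothetical.

```python
import numpy as np

def power_update(theta, rollouts, n_best=5):
    """Return-weighted update using the n_best highest-return rollouts.

    rollouts: list of (theta_k, return_k) pairs with return_k >= 0.
    """
    best = sorted(rollouts, key=lambda r: r[1], reverse=True)[:n_best]
    total = sum(ret for _, ret in best)
    if total <= 0.0:
        return theta.copy()              # no informative rollouts yet
    step = sum(ret * (th - theta) for th, ret in best) / total
    return theta + step

def toy_return(theta, target):
    """Hypothetical return in (0, 1]; highest when theta equals target."""
    return float(np.exp(-np.sum((theta - target) ** 2)))

def run_power(n_trials=50, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    target = np.array([1.0, -0.5, 0.25])
    theta = np.zeros(3)
    rollouts = []
    for _ in range(n_trials):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        rollouts.append((candidate, toy_return(candidate, target)))
        theta = power_update(theta, rollouts)
    return theta, rollouts

theta_final, rollouts = run_power()
```

Because the update is a return-weighted average, it needs no learning rate. For instance, starting from the origin with two rollouts at (1, 0) and (0, 1) whose returns are 3 and 1, the new parameters are (0.75, 0.25).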

Authors:
Dr. Petar Kormushev
Dr. Sylvain Calinon
Prof. Darwin G. Caldwell
Advanced Robotics Dept., Italian Institute of Technology

Link to publication:
Kormushev, P., Calinon, S. and Caldwell, D.G. “Robot Motor Skill Coordination with EM-based Reinforcement Learning”, Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS-2010), 2010.

Link to another publication on the PoWER algorithm, by Jens Kober and Jan Peters