Robotic Cooking — Update

Cooking is a big part of our daily lives.  It requires skills that take years of training and practice to master.  Preparing a fine meal also takes time; many of us spend several hours a day preparing meals for ourselves and our families.  But the effort is worth it, since the quality of our meals directly affects our physical and mental health.

If robots could reliably cook high-quality meals in home kitchens, it would dramatically improve our quality of life.  People who own a cooking robot would enjoy meals at home that are as good as the dishes from a Michelin three-star restaurant, without hiring an expensive chef or spending years in training.

To achieve this, RPAL has identified several technical challenges in building a robotic cooking system and developed a roadmap to solve them.  The figure below illustrates the roadmap.

The roadmap has the following main components that we are currently working on:

  • Learning and representing cooking knowledge for robots.  This includes cooking knowledge extraction from recipes, pictures, and videos; knowledge graph representation; motion embedding; and graph-search retrieval.
  • Learning and developing skills for robots to use kitchen tools and utensils.  This includes task-oriented grasp planning and self-supervised manipulation skill learning and generalization.
  • Observing and understanding ingredient state changes in the cooking process.  This includes changes in shape, size, texture, color, firmness, viscosity, etc.

Learning and representing cooking knowledge for robots

Robots need to gain an understanding of cooking.  Our approach is to extract cooking knowledge from recipes and cooking instructional videos, and to represent the obtained knowledge in a functional object-oriented network (FOON).  We develop new tools for video understanding, knowledge representation, knowledge embedding, and knowledge retrieval in the areas of deep learning, computer vision, and robotics.  For more details, please read more here.
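As a rough illustration of the idea, a FOON can be thought of as a graph of object-state nodes and motion nodes grouped into functional units, from which a plan for a goal dish can be retrieved by backward search.  The sketch below is a simplified, hypothetical rendering under our own naming; it is not RPAL's actual implementation or API.

```python
# Minimal FOON-like sketch (hypothetical names, not RPAL's code).
# A functional unit links input object states, one motion, and output object states.
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectNode:
    name: str    # e.g. "flour"
    state: str   # e.g. "in bowl"

@dataclass
class FunctionalUnit:
    motion: str           # e.g. "pour"
    inputs: frozenset     # ObjectNode inputs
    outputs: frozenset    # ObjectNode outputs

class FOON:
    def __init__(self):
        self.units = []

    def add_unit(self, unit):
        self.units.append(unit)

    def retrieve(self, goal):
        """Backward search: collect the units needed to produce the goal node."""
        needed, frontier = [], [goal]
        while frontier:
            node = frontier.pop()
            for u in self.units:
                if node in u.outputs and u not in needed:
                    needed.append(u)
                    frontier.extend(u.inputs)
        return list(reversed(needed))  # rough execution order
```

For the dough example in the roadmap, pouring flour and water into a bowl and then mixing would form three functional units, and retrieving the "dough" node would return the two pours before the mix.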

The following figure shows the current cooking knowledge FOON learned from 65 cooking instructional videos.

A zoomed-in view of the FOON is in the following figure.

* Selected publications for more details:

  1. Babaeian, J.A., Paulius, D. and Sun, Y. (2019) Long Activity Video Understanding using Functional Object-Oriented Network, IEEE Transactions on Multimedia, 21(7): 1813-1824.
  2. Paulius, D., Huang, Y., Melancon, J., Sun, Y. (2019) Manipulation Motion Analysis & Taxonomy in Cooking, International Conference on Intelligent Robots and Systems (IROS), 1-6.
  3. Paulius, D., Jelodar, B., and Sun, Y. (2018) Functional Object-Oriented Network: Construction & Expansion, ICRA 2018, pp 5935-5941.
  4. Paulius, D., Huang, Y., Milton, R., Buchanan, W.D., Sam, J., and Sun, Y. (2016) Functional Object-Oriented Network for Manipulation Learning, IROS, 3655-3662.
  5. Sun, Y., Ren, S., Lin, Y., (2014) Object-Object Interaction Affordance Learning, Robotics and Autonomous Systems, 62(4): 487-496.

* Supported by NSF, “RI: Small: Functional Object-Oriented Network for Manipulation Learning,” $398,529, 8/15/2014-7/31/2019

Learning and developing skills for robots to use kitchen tools and utensils

For a cooking task, a robot needs to grasp utensils and manipulate them properly.  For example, to make a perfect dough, a robot needs to grasp and hold a cup of flour and pour it into a bowl, then grasp a cup of water and pour it into the bowl as well.  The robot then grasps a whisk to mix them well.  Grasping the cups and the whisk requires grasp planning so that they are held securely and do not fall during the task.  Pouring the water and flour should be well planned and controlled to achieve the correct portions and ratio.  The mixing motion should dynamically adapt to the state of the mixture and its lumps.  For grasping utensils and tools, our approach is to develop task-oriented grasp planning that takes the desired force and torque of the manipulation task into consideration.  For more details, please read more here.  For manipulation, our approach is to develop a self-supervised learning and generalization approach that learns from human demonstrations and their actual outcomes, then generalizes the skill to a variety of conditions through self-supervised practice.  For more details, please read more here.
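To make the pouring step concrete, a feedback-controlled pour can be sketched as a loop that tilts the source cup at a rate proportional to the amount still to be poured, slowing down near the target to avoid overshoot.  This is a deliberately simplified sketch with hypothetical function names (`read_scale`, `set_tilt_rate`); RPAL's actual pouring work uses learned models (e.g. recurrent networks with model predictive control), not this bare proportional rule.

```python
# Hedged sketch of feedback-controlled pouring (not RPAL's learned controller).
# read_scale() returns grams poured so far; set_tilt_rate() commands the wrist.
def pour(target_g, read_scale, set_tilt_rate, k=0.5, tol=1.0):
    poured = read_scale()
    while target_g - poured > tol:
        remaining = target_g - poured
        # Tilt faster when far from the target, slower when close; cap the rate.
        set_tilt_rate(min(k * remaining, 30.0))  # deg/s
        poured = read_scale()
    set_tilt_rate(0.0)  # stop tilting once within tolerance
    return poured
```

The proportional slowdown is what keeps the final amount within tolerance: as the remaining mass shrinks, so does the commanded tilt rate and hence the flow.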

The following figures show our grasp planning results, which learn and mimic human thumb placements that indicate the task requirements.

* Selected publications for more details:

  1. Lin, Y. and Sun, Y., (2015) Grasp Planning to Maximize Task Coverage, Intl. Journal of Robotics Research (IJRR), 34(9): 1195-1210.
  2. Lin, Y., and Sun, Y. (2015) Robot Grasp Planning Based on Demonstrated Grasp Strategies, Intl. Journal of Robotics Research (IJRR), 34(1): 26-42.
  3. Lin, Y. and Sun, Y. (2015) Task-Based Grasp Quality Measures for Grasp Synthesis, IROS, 485-490.
  4. Lin, Y., Sun, Y. (2014) Grasp Planning Based on Grasp Strategy Extraction from Demonstration, IROS, pp. 4458-4463.
  5. Lin, Y., Sun, Y. (2013) Task-Oriented Grasp Planning Based on Disturbance Distribution, ISRR, pp 1-16.
  6. Dai, W., Sun, Y., Qian, X., (2013) Functional Analysis of Grasping Motion, IROS, pp. 3507-3513.
  7. Lin Y., Sun Y. (2013) Grasp Mapping Using Locality Preserving Projections and KNN Regression, IEEE Intl. Conference on Robotics and Automation (ICRA), pp. 1068-1073.
  8. Lin Y., Ren S., Clevenger M., and Sun Y. (2012) Learning Grasping Force from Demonstration, IEEE Intl. Conference on Robotics and Automation (ICRA), pp. 1526-1531.

The following figure shows an example recording frame of a human cooking manipulation demonstration.

The left of the following figures shows a robot performing accurate water pouring with a bottle that is not in the training set.  The middle figure shows the cups used in training to learn the pouring skill.  The right figure shows the cups used in evaluating the learned pouring skill.  The evaluation results show that the robot's pouring skill is on par with human pouring in terms of accuracy and speed.

* Selected publications for more details:

  1. Huang, Y. and Sun, Y. (2019) A Dataset of Daily Interactive Manipulation, International Journal of Robotics Research (IJRR), 38(8): 879-886.
  2. Chen, T., Huang, Y., Sun, Y. (2019) Accurate Pouring using Model Predictive Control Enabled by Recurrent Neural Network, IROS, 1-7.
  3. Huang, Y. and Sun, Y. (2017). Learning to Pour, IROS, pp 7005-7010.
  4. Huang, Y. and Sun, Y. (2015) Generating Manipulation Trajectories Using Motion Harmonics, IROS, 4949-4954.

* Supported by

NSF, “RI: Small: Generalizing Learned Manipulation Skills to Unseen Situations by Balancing Uncertainties,” $334,823, 9/1/2019-8/31/2022.

NSF, “NRI: EAGER: Characterizing Physical Interaction in Instrument Manipulations,” $299,887, 3/1/2016-2/28/2019

Observing and understanding ingredient state change in the cooking process

The states of the ingredients are critical throughout the cooking process.  When ingredients are raw, their shapes, sizes, textures, colors, and firmness decide whether they need to be trimmed, washed, or even thrown away; cooking without knowing their initial states is prone to causing food poisoning.  While processing the ingredients, their shapes, sizes, textures, colors, and firmness should be continuously monitored to judge the readiness of the cooking process and the necessity of adjustments.  For example, if a wrong ratio is detected during mixing, more flour or water should be added.

The following figures show ingredients in their different states.

* Selected publications for more details:

  1. Jelodar, B., and Sun, Y. (2019) Joint Object and State Recognition Using Language Knowledge, IEEE ICIP 2019, pp. 3352-3356.

I will update our work and progress in building a robotic cooking system in the following sections: Robotic Cooking Setup, Cooking Knowledge for Robot, Grasping Utensils for Cooking, Cooking Manipulation Skills, and Perception for Cooking.
