Workshop on Nonverbal Cues for Human-Robot Cooperative Intelligence

Room No. 11, Abu Dhabi National Exhibition Centre (ADNEC), UAE

14th October

09:00 to 13:00

About the Workshop

This workshop is dedicated to discussing computational methods for sensing and recognition of nonverbal cues and internal states in the wild to realize cooperative intelligence between humans and intelligent systems. We gather researchers from different expertise, yet having the common goal, motivation, and resolve to explore and tackle this delicate issue considering the practicality of industrial applications. We are calling for papers to discuss novel methods to realize human-robot cooperative intelligence by sensing and understanding humans’ behavior, internal states, and to generate empathetic interactions.

Human internal state inference, e.g., cognitive, emotional, intention models.
Recognition of nonverbal cues, e.g., gaze and attention, body language, para-language.
Multi-modal sensing fusion for scene perception.
Nonverbal behavior generation for robots/agents, e.g., gaze salience, gesture.
Synchronization of nonverbal and verbal behavior
Learning algorithms for cooperative intelligence, e.g., imitation learning.
Generative and adversarial algorithms to enhance human-robot interaction, e.g., LLMs, diffusion models, VLMs.
Empathetic interaction between humans and intelligent systems.
Robust sensing of facial and body key points.
Social interaction dynamics modeling, e.g., harmony level, engagements.
Personalization of intelligent systems from nonverbal cues and trust evaluation.
Applications of cooperative intelligence in the wild.

Keywords: "Human: Face, gaze, body, pose, gesture, movement, attention, cognitivestate, emotion state, intention, empathy, Environment: Object"

Secondary subject: "Human-Robot cooperative intelligence", "Nonverbal cues recognition from audiovisual", "Human internal state inference from multi-modality", "Vision applications and systems", "Human-Object interaction and scene understanding"

Organizers

News updates

Jul 1st	Workshop web page was launched.
Jul 2nd	Submission can be made on EasyChair.
Aug 14th	Workshop paper submission deadline is extended to Aug 24th (final) due to multiple requests.
Sep 30th	Workshop program is finalized! See you there at IROS2024 on Oct 14th!
Dec 3rd	Call For Papers for Advanced Robotics (2023 IF: 1.4) Special Issue on Nonverbal Cues for Human-Robot Cooperative Intelligence.
May 24th	The submission deadline of Advanced Robotics Special Issue on Nonverbal Cues for Human-Robot Cooperative Intelligence has been extended to 30 June 2025! (FINAL extension)

Call for Papers

Submission Guidelines

We invite authors to submit unpublished papers (2-4 pages excluding references) to our workshop, to be presented at a workshop session upon acceptance. Submissions will undergo a peer-review process by the workshop's program committee and accepted papers will be invited to present their works at the workshop (see presentation format).

We are pleased to announce that award will be given to the best paper accepted by this workshop.

Important Dates

~~August 14, 2024 PST~~
~~Paper submission deadline~~
~~August 24, 2024 PST~~
~~Extended paper submission deadline (final)~~
~~September 14, 2024 PST~~
~~Notification of acceptance~~
~~September 28, 2024 PST~~
~~Camera-ready paper submission deadline~~
~~September 28, 2024 PST~~
~~Pre-recorded video submission deadline~~
October 14, 2024
~~Workshop day~~

Submission Instructions

Please use the IEEE conferences paper format to write your manuscript.

LaTex and MS Word template: https://www.ieee.org/conferences/publishing/templates.html

Please submit your paper electronically through the workshop's EasyChair submission system.

Submit papers on EasyChair

Presentation Format

Accepted papers should be presented in three-way presentation approach to foster active participation

Spotlight talks (6 mins talk, Q&A in the poster session)
In-person A0 posters for in-depth discussions
Short pre-recorded videos (about 2 minutes) to be uploaded on the workshop webpage

Publication Format

Authors are recommended to archive their papers and inform workshop organizers once this procedure is completed.

Link to arXiv: https://info.arxiv.org/help/submit/index.html

Accepted papers which have been archived will be hosted on the workshop webpage.

Program

We plan a half-day event for 4 hours, including talks by two invited speakers, and one interactive session. For participants who could not attend in person, we will disseminate the papers and pre-recorded videos on our workshop page, which also consists of a comment section for Q&A.

09:00 09:02 Welcome and opening remarks (2 mins)
09:02 09:42 Invited talk I by Satoshi Shigemi (40 mins, including 5 mins Q&A)
09:42 10:30 Spotlight talks for accepted workshop and invited IROS 2024 papers (6 + 2 papers, 6 mins each)
10:30 11:00 Coffee break and poster session (30 mins)
11:00 11:40 Invited talk II by Yukie Nagai (40 mins, including 5 mins Q&A)
11:40 12:20 Invited talk III by Angelica Lim (40 mins, including 5 mins Q&A)
12:20 12:55 Interactive session (35 mins)
12:55 13:00 Awards and closing remarks (5 mins)

Invited Speakers

We intend to have speakers from different ethnic backgrounds, countries, and career stages. Specifically, we confirmed the attendance of one speaker from the industry, and the other speaker from the academia is pending confirmation.

Invited Talk I

Co-Existence with Intelligent Machines

Satoshi Shigemi, Honda Research Institute Japan, Japan.

Link to website: http://www.jp.honda-ri.com/en/about/

Abstract

At the Honda Research Institute Japan, we are conducting research on collaborative intelligence, human understanding, and robot systems. In recent years, generative AI has emerged, and AI technology is becoming more familiar in people's lives. For people and machines to coexist 24 hours a day, 365 days a year, it is essential for intelligent machine systems to understand people's feelings and act accordingly. To achieve this, we focus on human social activities, and our research targets are interactions between individuals, groups, and communities. We aim to advance people's happiness by helping them become the way they ought to be and enhancing their fulfillment. Here, we will introduce the research technologies at the Honda Research Institute to make people happy.

Biography

Satoshi Shigemi is the President of Honda Research Institute Japan. Since 1987, he has been conducting research on robots and control systems at Honda R&D Co. In 2000, he was the Senior Chief Engineer and project lead for the research and development of ASIMO, the humanoid robot. He then developed a high-altitude survey robot for the Fukushima Daiichi Nuclear Power Plant. He has published many papers about human-robot interaction.

Invited Talk II

Emergence of Cooperative Intelligence through Embodied Predictive Processing

Yukie Nagai, The University of Tokyo, Japan.

Link to website: https://developmental-robotics.jp/en/members/yukie_nagai/

Abstract

Embodied predictive processing extends the brain-centered theory of predictive processing by emphasizing the critical role of the body and social interactions in cognitive functions. This approach highlights how prediction error minimization is dynamically mediated across brain, body, and environment. In this talk, we examine how cooperative intelligence can arise within this framework. Through our experiments, we demonstrate how robots, grounded in their sensorimotor experiences, can anticipate and adapt to the actions and emotions of others, illustrating the potential for collaborative learning and human-robot interactions.

Biography

Yukie Nagai is a Project Professor at the International Research Center for Neurointelligence at the University of Tokyo. She earned her Ph.D. in Engineering from Osaka University in 2004, after which she worked at the National Institute of Information and Communications Technology, Bielefeld University, and then Osaka University. Since 2019, she has been leading the Cognitive Developmental Robotics Lab at the University of Tokyo. Her research encompasses cognitive developmental robotics, computational neuroscience, and assistive technologies for developmental disorders. Dr. Nagai employs computational methods to investigate the underlying neural mechanisms involved in social cognitive development. In acknowledgment of her work, she received the titles of "World's 50 Most Renowned Women in Robotics" in 2020 and "35 Women in Robotics Engineering and Science" in 2022, among other recognitions.

Invited Talk III

Multimodal Social Signal Processing for Human-Robot Interaction

Angelica Lim, Simon Fraser University

Link to website: https://www.sfu.ca/computing/people/faculty/angelicalim.html

Abstract

Science fiction has long promised us interfaces and robots that interact with us as smoothly as humans do - Rosie the Robot from The Jetsons, C-3PO from Star Wars, and Samantha from Her. Today, interactive robots and voice user interfaces are moving us closer to effortless, human-like interactions in the real world. In this talk, I will discuss the opportunities and challenges in finely analyzing, detecting and generating non-verbal communication in context, including gestures, gaze, auditory signals, and facial expressions. Specifically, I will discuss how we might allow robots and virtual agents to understand human social signals (including emotions, mental states, and attitudes) across cultures as well as recognize and generate expressions with controllability, transparency, and diversity in mind.

Biography

Dr. Angelica Lim is the Director of the Rosie Lab, and an Assistant Professor in the School of Computing Science at Simon Fraser University (SFU). Previously, she led the Emotion and Expressivity teams for the Pepper humanoid robot at SoftBank Robotics. She received her B.Sc. in Computing Science with Artificial Intelligence Specialization from SFU and a Ph.D. and M.Sc. in Computer Science (Intelligence Science) from Kyoto University, Japan. She and her team have received Best Paper in Entertainment Robotics and Cognitive Robotics Awards at IROS 2011 and 2022, and Best Demo and LBR at HRI 2021 and 2023. She has been featured on the BBC, TEDx, hosted a TV documentary on robotics, and was recently featured in Forbes 20 Leading Women in AI. Her research interests include multimodal machine learning, affective computing, and human-robot interaction.

Spotlight talks & poster

BTGenBot: Behavior Tree Generation for Robotic Tasks with Lightweight LLMs Izzo, Riccardo Andrea; Bardaro, Gianluca*; Matteucci, Matteo

PDF video

GazeMotion: Gaze-guided Human Motion Forecasting Hu, Zhiming*; Schmitt, Syn; Haeufle, Daniel Florian Benedict; Bulling, Andreas

PDF video

Are Large Language Models Aligned with People’s Social Intuitions for Human–Robot Interactions? Wachowiak, Lennart; Coles, Andrew; Celiktutan, Oya; Canal, Gerard*

PDF video

To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions Tanneberg, Daniel*; Ocker, Felix; Hasler, Stephan; Deigmoeller, Joerg; Belardinelli, Anna; Wang, Chao; Wersing, Heiko; Sendhoff, Bernhard; Gienger, Michael

PDF video

A kinematic model generates non-circular human proxemics zones Fanta Camara*, Charles Fox

PDF video

Autonomous Storytelling for Social Robot with Human-Centered Reinforcement Learning Zhang, Lei; Zheng, Chuanxiong; Wang, Hui; Gomez, Randy; Nichols, Eric; Li, Guangliang*

PDF video

Contextual Emotion Recognition using Large Vision Language Models Best Workshop Paper Award Etesam, Yasaman*; Yalcin, Ozge; Zhang, Chuxuan; Lim, Angelica

PDF video

MoVEInt: Mixture of Variational Experts for Learning HRI from DemonstrationsBest Workshop Paper Award Vignesh Prasad*, Alap Kshirsagar, Dorothea Koert, Ruth Stock-Homburg, Jan Peters and Georgia Chalvatzaki

PDF video

Poster

React to This! How Humans Challenge Interactive Agents Using Nonverbal Behaviors Chuxuan Zhang*, Bermet Burkanova, Lawrence H. Kim, Lauren Yip, Ugo Cupcic, Stephane Lallee, Angelica Lim

PDF video

Deep Active Inference for Engagement Recognition in Robot-Assisted Autism Therapy Shyrailym Shaldambayeva*, Saparkhan Kassymbekov, Anara Sandygulova and Almas Shintemirov

PDF Poster

Human-Centric Robot Navigation: Leveraging Pedestrian Occlusion Patterns for Traversability Analysis in Crowded Indoor Environments Jonathan Tay Yu Liang* and Kanji Tanaka

PDF Poster

Interactive session

Forum on the Impact of Generative AI On Making Emotionally Intelligent Systems

Moderator: Jouh Yeong Chew, Honda Research Institute Japan, Scientist

Panelists:

Yukie Nagai, The University of Tokyo (Japan), Professor
Daisuke Kurabayashi, Institute of Science Tokyo (previously Tokyo Institute of Technology) (Japan), Professor, also the Editor-in-Chief for Advanced Robotics
Angelica Lim, Simon Fraser University (Canada), Assistant Professor
Gerard Canal, King’s College London (UK), Lecturer (Assistant Professor) in Autonomous Systems at the Department of Informatics
Eiichi Yoshida, Tokyo University of Science (Japan), Professor, also an IEEE Robotics and Automation Society AdCom member

Motivation and Background

Humans can perceive social cues and the interaction context of another human to infer the internal states including cognitive and emotional states, empathy, and intention. This unique ability to infer internal states leads to effective social interaction between humans desirable in many intelligent systems such as collaborative and social robots, and humanmachine interaction systems. However, it is challenging for machines to perceive human states under noisy real-world settings, which are usually measured by noninvasive sensors. Recent works investigating the potential solutions for the estimation of human states under controlled conditions using facial features with the off-the-shelf camera by leveraging deep learning methods. This workshop aims to bring interdisciplinary researchers across computer vision, artificial intelligence, robotics, and human-computer interaction together to share current research achievements and discuss future research directions for human behavior and state understanding, and their potential application, especially in the wild environment. Specifically, we are interested in cognition-aware computing by integrating environment contexts and multi-modal nonverbal social cues not limited to gaze interaction, body language and para language. More importantly, we extend multi-modal human behavior research to infer the internal states of humans. This is a challenging problem yet important to realize effective interaction between humans and intelligent systems.

It is desirable for intelligent systems like robots, virtual agents, human-machine interfaces to collaborate and interact seamlessly with humans in the era of Industry 5.0, where intelligent systems must work alongside humans to perform a variety of tasks anywhere at home, factories, offices, transit, etc. The underlying technologies to achieve efficient and intelligent collaboration between humans and ubiquitous intelligent systems can be realized by cooperative intelligence, spanning interdisciplinary studies between robotics, AI, human-robot and -computer interaction, computer vision, cognitive science, etc.

One of the main considerations to achieve cooperative intelligence between humans and intelligent systems is to enable everyone and everything to know each other well, like how humans can trust or infer the implicit internal states like intention, emotion, and cognitive states of each other. The importance of empathy to facilitate human-robot interaction has been highlighted in previous studies . However, it is difficult for intelligent systems to estimate the internal states of humans because they are dependent on the complex social dynamics and environment contexts. This requires intelligent systems to be capable of sensing the multi-modal inputs, reasoning the underlying abstract knowledge, and generating the corresponding responses to collaborate and interact with humans.

There are many studies on estimating internal states of humans through measurements of wearables and non-invasive sensors, but it would be difficult to implement these solutions in the wild because of the additional sensors to be worn by humans. One promising solution is to use audiovisual data like nonverbal behavior cues consisting of gaze interaction, facial expression, body language and paralanguage to infer the internal states of humans. Researchers in cognitive and social psychology have long advocated that these nonverbal behaviors are subconsciously generated by humans and reflect the internal states of humans under different contexts. Some salient examples are the studies on emotion recognition using facial and body language in controlled environment. It remains an open question for intelligent systems to sense and recognize nonverbal cues and reason the rich underlying internal states of humans in the wild and noisy environments.

Organizers

Jouh Yeong Chew

Honda Research Institute Japan

jouhyeong.chew@jp.honda-ri.com

Xucong Zhang

TU Delft

xucong.zhang@tudelft.nl

Iolanda Leite

KTH Royal Institute of Technology

iolanda@kth.se

Daisuke Kurabayashi

Tokyo Institute of Technology

kurabayashi.d.aa@m.titech.ac.jp

Eiichi Yoshida

Tokyo University of Science

eiichi.yoshida@rs.tus.ac.jp

Siyu Tang

ETH Zürich

siyu.tang@inf.ethz.ch

Andreas Bulling

University of Stuttgart

andreas.bulling@vis.uni-stuttgart.de

Workshop on Nonverbal Cues for Human-Robot Cooperative Intelligence

About the Workshop

Sponsors

Organizers

News updates

Call for Papers

Submission Guidelines

Important Dates

Submission Instructions

Presentation Format

Publication Format

Program

Invited Speakers

Invited Talk I

Co-Existence with Intelligent Machines

Invited Talk II

Emergence of Cooperative Intelligence through Embodied Predictive Processing

Invited Talk III

Multimodal Social Signal Processing for Human-Robot Interaction

Spotlight talks & poster

Poster

Interactive session

Forum on the Impact of Generative AI On Making Emotionally Intelligent Systems

Motivation and Background

Organizers

Jouh Yeong Chew

Xucong Zhang

Iolanda Leite

Daisuke Kurabayashi

Eiichi Yoshida

Siyu Tang

Andreas Bulling