The 4th Workshop on Nonverbal Cues for Human-Robot Cooperative Intelligence

About the Workshop

This workshop is dedicated to discussing computational methods for sensing and recognizing nonverbal cues and internal states in the wild, in order to realize cooperative intelligence between humans and intelligent systems. We gather researchers with different expertise who share the common goal, motivation, and resolve to explore and tackle this delicate issue while considering the practicality of industrial applications. We are calling for papers that discuss novel methods to realize human-robot cooperative intelligence by sensing and understanding humans’ behavior and internal states, and by generating empathetic interactions.

  • Human internal state inference, e.g., cognitive, emotional, and intention models.
  • Recognition of nonverbal cues, e.g., gaze and attention, body language, paralanguage.
  • Multi-modal sensing fusion for scene perception.
  • Nonverbal behavior generation for robots/agents, e.g., gaze salience, gesture.
  • Synchronization of nonverbal and verbal behavior.
  • Learning algorithms, e.g., cross-embodiment and cross-context learning, imitation learning.
  • Generative and adversarial algorithms to enhance human-robot interaction, e.g., LLMs, diffusion models, VLMs.
  • Empathetic interaction between humans and intelligent systems.
  • Robust sensing of facial and body key points.
  • Modeling of social dynamics, e.g., group harmony, engagement, and cohesion.
  • Personalization and trust modeling based on multimodal nonverbal cues.
  • Real-world applications of cooperative intelligence.

Keywords: "Human: Face, gaze, body, pose, gesture, movement, attention, cognitive state, emotion state, intention, empathy, Environment: Object"

Secondary subject: "Human-Robot cooperative intelligence", "Nonverbal cues recognition from audiovisual", "Human internal state inference from multi-modality", "Vision applications and systems", "Human-Object interaction and scene understanding"

Sponsors

Organizers

News updates

Mar 24th: Workshop webpage was launched.

Call for Papers

Submission Guidelines

We invite authors to submit unpublished papers (2-4 pages excluding references) to our workshop. Submissions will undergo peer review by the workshop's program committee, and authors of accepted papers will be invited to present their work at a workshop session (see presentation format).

We are pleased to announce that an award will be given to the best paper accepted by this workshop.

Important Dates

TBD

Submission Instructions

Please use the IEEE conference paper format to write your manuscript. Please submit your paper electronically through the workshop's EasyChair submission system.

Presentation Format

Accepted papers should be presented using a three-way presentation approach to foster active participation.

Publication Format

Authors are recommended to archive their papers and inform workshop organizers once this procedure is completed. Accepted papers which have been archived will be hosted on the workshop webpage.
As with the previous IROS2024 workshop, authors of papers presented at this ICRA2025 workshop will be invited to submit extended versions to a journal special issue, to be announced at a later date.

Program

We plan a half-day event of 4 hours, including talks by two invited speakers. For participants who cannot attend in person, we will disseminate the papers and pre-recorded videos on our workshop page, which also includes a comment section for Q&A.

Invited Speakers

We intend to have speakers from different ethnic backgrounds, countries, and career stages. Specifically, we have confirmed the attendance of two speakers.

Invited Talk I

Learning manipulative skills through non-verbal interactions

Jens Kober (Professor, University of Stuttgart, Germany)
Link to website:
Abstract

Jens Kober (Professor, University of Stuttgart, Germany) is a professor of cognitive robotics whose research focuses on robot learning through interaction, including motor skill learning, reinforcement learning, imitation learning, and human-in-the-loop learning. His work on enabling robots to learn from and collaborate with humans in dynamic, real-world environments aligns closely with the workshop’s focus on multiparty interaction, embodied AI, and the integration of multimodal (including nonverbal) cues. His expertise in interactive and adaptive learning systems contributes directly to advancing socially aware and collaborative human–AI/robot interaction.

Invited Talk II

Nonverbal Communication Strategies for Clarifying Ambiguities in Human-Robot Interaction

Fethiye Irmak Doğan (PD Research Associate, University of Cambridge, UK)
Link to website:
Abstract

Fethiye Irmak Doğan (PD Research Associate, University of Cambridge, UK) is a Postdoctoral Research Associate at the University of Cambridge, whose work focuses on human–robot interaction, multimodal communication, and socially aware AI. Her research on enabling robots to interpret and generate contextually appropriate behaviors in human-centered environments aligns closely with the workshop’s focus on nonverbal cues, multiparty interaction, and embodied AI systems. Her expertise contributes directly to advancing empathetic, collaborative interactions between humans and intelligent systems.

Flash talks

TBD

Motivation and Background

Humans can perceive the social cues and interaction context of another human to infer internal states, including cognitive and emotional states, empathy, and intention. This ability leads to effective social interaction between humans, a capability that is desirable in many intelligent systems such as collaborative and social robots and human-machine interaction systems. However, it is challenging for machines to perceive human states in noisy real-world settings, where they are usually measured by noninvasive sensors. Recent works have investigated potential solutions for estimating human states under controlled conditions, using facial features captured with off-the-shelf cameras and leveraging deep learning methods. This workshop aims to bring together interdisciplinary researchers across computer vision, artificial intelligence, robotics, and human-computer interaction to share current research achievements and discuss future research directions for human behavior and state understanding and their potential applications, especially in in-the-wild environments. Specifically, we are interested in cognition-aware computing that integrates environment contexts and multi-modal nonverbal social cues, including but not limited to gaze interaction, body language, and paralanguage. More importantly, we extend multi-modal human behavior research to infer the internal states of humans. This is a challenging yet important problem for realizing effective interaction between humans and intelligent systems.

It is desirable for intelligent systems such as robots, virtual agents, and human-machine interfaces to collaborate and interact seamlessly with humans in the era of Industry 5.0, where intelligent systems must work alongside humans to perform a variety of tasks anywhere: at home, in factories, in offices, in transit, etc. Efficient and intelligent collaboration between humans and ubiquitous intelligent systems can be realized through cooperative intelligence, which spans interdisciplinary studies across robotics, AI, human-robot and human-computer interaction, computer vision, cognitive science, etc.

One of the main considerations in achieving cooperative intelligence between humans and intelligent systems is to enable everyone and everything to know each other well, much as humans can trust each other or infer each other's implicit internal states, such as intention, emotion, and cognitive states. The importance of empathy in facilitating human-robot interaction has been highlighted in previous studies. However, it is difficult for intelligent systems to estimate the internal states of humans because these states depend on complex social dynamics and environment contexts. This requires intelligent systems to be capable of sensing multi-modal inputs, reasoning about the underlying abstract knowledge, and generating the corresponding responses to collaborate and interact with humans.

There are many studies on estimating the internal states of humans through measurements from wearables and non-invasive sensors, but these solutions would be difficult to implement in the wild because of the additional sensors that humans must wear. One promising solution is to use audiovisual data, such as nonverbal behavior cues consisting of gaze interaction, facial expression, body language, and paralanguage, to infer the internal states of humans. Researchers in cognitive and social psychology have long advocated that these nonverbal behaviors are subconsciously generated by humans and reflect their internal states under different contexts. Some salient examples are the studies on emotion recognition using facial expressions and body language in controlled environments. It remains an open question how intelligent systems can sense and recognize nonverbal cues and reason about the rich underlying internal states of humans in in-the-wild, noisy environments.

Organizers

Jouh Yeong Chew
  • Affiliation
    Honda Research Institute Japan
  • Address
    8-1 Honcho, Wako-shi, Saitama, 351-0188, Japan
  • Phone
    +81 80 8896 9607
  • Short Bio

    Jouh Yeong is a senior scientist and project leader at HRI-JP working on nonverbal cues analysis and modelling for cooperative intelligence. He has organized the NOC workshop series at IROS, ICRA, and HAI, and is a Senior/Guest Editor for Advanced Robotics.

Andreas Bulling
  • Affiliation
    University of Stuttgart, Germany
  • Phone
    +49 711 685 60048
  • Short Bio

    Andreas is a Professor at the University of Stuttgart. He is also a Member of the Scientific Directorate of Schloss Dagstuhl - Leibniz Center for Informatics, an ELLIS Fellow and Founding Director of the Stuttgart ELLIS unit, and Faculty and Member of the Executive Board of the IMPRS-IS.

Daisuke Kurabayashi
  • Affiliation
    Institute of Science Tokyo
  • Phone
    +81 3 5734 2548
  • Short Bio

    Daisuke is a professor of Systems and Control Engineering and an associate dean of the School of Engineering at Institute of Science Tokyo (previously known as TiTech), Japan. He has been serving as the editor-in-chief of Advanced Robotics since 2022.

Eiichi Yoshida
  • Affiliation
    Tokyo University of Science, Japan
  • Phone
    +81 3 5876 1717
  • Short Bio

    Eiichi is an IEEE Robotics and Automation Society (RAS) AdCom member and Sponsor and Exhibitions Chair of IEEE ICRA2024, and has served on the editorial boards of IEEE journals and conferences such as IEEE Transactions on Robotics, IEEE RA-L, and IEEE/RSJ IROS.

Iolanda Leite
  • Affiliation
    KTH Royal Institute of Technology, Sweden
  • Phone
    +46 8 790 67 13
  • Short Bio

    Iolanda serves on the editorial boards of ACM Transactions on HRI and IEEE RA-L. She was the Program Co-Chair of the International Conference on Intelligent Virtual Agents (IVA) 2017, Awards Chair of the International Conference on Intelligent Robots and Systems (IROS) 2023, and Program Co-Chair of the Intl. Conf. Human-Robot Interaction 2023.

Sarah Gillet
  • Affiliation
    MIT Media Lab
  • Phone
    617 253 5960
  • Short Bio

    Sarah is a SERC Wallenberg PD fellow at the MIT Media Lab, Cambridge, MA, United States. Her research aims to create robots that shape human-human interactions while acting autonomously. She received her Ph.D. from KTH in 2024 and a Best User Studies Paper Award at HRI 2021.

Siyu Tang
  • Affiliation
    ETH Zürich
  • Phone
    +41 44 632 60 86
  • Short Bio

    Siyu is an assistant professor at ETH Zürich. She has served as an area chair for computer vision events including CVPR 2020, 2021, 2022, ECCV 2022 and ICCV 2021, a tutorial chair for CVPR 2023 and ACCV 2020, and a workshop organizer for the ECCV 2022 EgoBody Benchmark. Her research focuses on computer vision and machine learning, specializing in perceiving and modelling humans.

Xucong Zhang
  • Affiliation
    Delft University of Technology
  • Phone
    +31 (0)15 27 89803
  • Short Bio

    Xucong is an assistant professor and was an organizer of the GAZE workshop series from 2019 to 2022. He is well known for his pioneering work on gaze estimation in real-world settings and is the author of numerous significant follow-up improvements to its methods.