Alalyani, Nada, author
Krishnaswamy, Nikhil, author
ACM, publisher
2024-11-11
2024-11-11
2023-10-09
Nada Alalyani and Nikhil Krishnaswamy. 2023. A Methodology for Evaluating Multimodal Referring Expression Generation for Embodied Virtual Agents. In INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (ICMI '23 Companion), October 09–13, 2023, Paris, France. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3610661.3616548
https://hdl.handle.net/10217/239536
Robust use of definite descriptions in a situated space often involves recourse to both verbal and non-verbal modalities. For IVAs, virtual agents designed to interact with humans, the ability to both recognize and generate non-verbal and verbal behavior is a critical capability. To assess how well an IVA is able to deploy multimodal behaviors, including language, gesture, and facial expressions, we propose a methodology to evaluate the agent's capacity to generate object references in a situational context, using the domain of multimodal referring expressions as a use case. Our contributions include: 1) developing an embodied platform to collect human referring expressions while communicating with the IVA; 2) comparing human and machine-generated references in terms of evaluable properties using subjective and objective metrics; and 3) reporting preliminary results from trials that aimed to check whether the agent can retrieve and disambiguate the object the human referred to, whether the human can correct misunderstandings using language, deictic gesture, or both, and how easy humans find it to interact with the agent.
born digital
articles
eng
©Nada Alalyani, et al. ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ICMI '23, https://dx.doi.org/10.1145/3610661.3616548.
embodied agents
non-verbal behaviours
multimodality
referring expression generation
A methodology for evaluating multimodal referring expression generation for embodied virtual agents
Text
https://doi.org/10.1145/3610661.3616548