Leolani: a reference machine with a theory of mind for social communication (1806.01526v1)

Published 5 Jun 2018 in cs.AI, cs.CL, and cs.HC

Abstract: Our state of mind is based on experiences and what other people tell us. This may result in conflicting information, uncertainty, and alternative facts. We present a robot that models relativity of knowledge and perception within social interaction following principles of the theory of mind. We utilized vision and speech capabilities on a Pepper robot to build an interaction model that stores the interpretations of perceptions and conversations in combination with provenance on its sources. The robot learns directly from what people tell it, possibly in relation to its perception. We demonstrate how the robot's communication is driven by hunger to acquire more knowledge from and on people and objects, to resolve uncertainties and conflicts, and to share awareness of the perceived environment. Likewise, the robot can make reference to the world and its knowledge about the world and the encounters with people that yielded this knowledge.

Summary

The paper presents a novel robotics architecture that integrates a theory of mind to process and resolve conflicting social perceptions.
It details a multi-layered system combining GRaSP-based knowledge representation, a BDI conversation model, and speech and vision modules for human-robot dialogue.
Example dialogues illustrate Leolani’s ability to manage ambiguity and sustain natural, interactive communication in social contexts.

An Overview of "Leolani: a reference machine with a theory of mind for social communication"

The paper "Leolani: a reference machine with a theory of mind for social communication" presents an approach to human-robot interaction that builds a theory of mind into the robot's communication framework. The work, conducted at VU University Amsterdam, equips a Pepper robot with the ability to model and manage the relativity of knowledge and perception in social contexts.

Theoretical Framework

At the core of Leolani’s design is the theory of mind: the idea that understanding others' beliefs, intentions, and perceptions is crucial for effective social interaction. The implementation draws on Scassellati's pioneering work on social cognition in humanoid robots. Leolani's main advance is its ability to represent conflicting and uncertain information from both human interlocutors and sensor data, which mirrors how people handle varied and inconsistent inputs.

Technical Approach

To achieve this, the robot uses the Grounded Representation and Source Perspective model (GRaSP), which represents information as RDF triples and attributes each statement to a specific source or sensory perception. The robot's cognitive system tracks the provenance of each piece of information and assesses its certainty, so that the same claim can be held with different confidence depending on who or what reported it.
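The idea of attaching a source and a certainty to every statement can be illustrated with a small sketch. This is not the GRaSP ontology or the authors' code: the `Claim` and `ClaimStore` names, the 0 to 1 certainty score, the sample data, and the assumption that each predicate takes a single value (so two different values count as a conflict) are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One triple plus its provenance (hypothetical structure, not GRaSP itself)."""
    subject: str
    predicate: str
    obj: str
    source: str       # who or what produced the claim: a speaker or a sensor
    certainty: float  # illustrative 0..1 score, not the paper's scale


class ClaimStore:
    """Keeps every claim, including contradictory ones, instead of overwriting."""

    def __init__(self):
        self._claims = []

    def add(self, claim):
        self._claims.append(claim)

    def values_by_source(self, subject, predicate):
        """Map each asserted value to the set of sources that asserted it."""
        result = defaultdict(set)
        for c in self._claims:
            if c.subject == subject and c.predicate == predicate:
                result[c.obj].add(c.source)
        return dict(result)

    def conflicts(self):
        """Return (subject, predicate) pairs with more than one distinct value.

        Assumes predicates are single-valued, which is a simplification.
        """
        values = defaultdict(set)
        for c in self._claims:
            values[(c.subject, c.predicate)].add(c.obj)
        return sorted(key for key, vals in values.items() if len(vals) > 1)


# Two sources disagree about where "bob" lives; both claims are retained.
store = ClaimStore()
store.add(Claim("bob", "lives_in", "Paris", source="alice", certainty=0.6))
store.add(Claim("bob", "lives_in", "Amsterdam", source="bob", certainty=0.9))
store.add(Claim("bob", "likes", "tea", source="bob", certainty=0.8))
```

Because nothing is overwritten, the store can later report which source said what, which is the property that lets a robot refer back to the encounter that produced a piece of knowledge.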

Key Components of the System:

Signal Processing Layer: Handles inputs from audio and visual sensory modules, enabling speech recognition via the Google Speech API and object recognition utilizing TensorFlow’s Inception network.
Conversation Flow Layer: Based on Belief, Desire, Intention (BDI) architectures, this layer manages conversation objectives and engagements, prompting actions like question formulation to resolve knowledge gaps.
Natural Language Processing Layer: Converts natural language into structured RDF triples and vice versa, facilitating understanding and generation of human-like dialogues.
Knowledge Representation Layer: Employs the GRaSP ontology for storing and querying social knowledge, enriched by global knowledge databases like DBpedia and GeoNames.
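How the conversation flow layer might turn knowledge gaps and conflicts into questions can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's BDI implementation: the `next_question` function, the dictionary layout, the rule that conflicts are addressed before gaps, and the question wording are all hypothetical.

```python
def next_question(known, expected_predicates):
    """Pick the next question to ask, or None if nothing is open.

    known: dict mapping (subject, predicate) -> set of values heard so far.
    expected_predicates: predicates the robot would like to know per subject.
    Priority (an assumption of this sketch): conflicts first, then gaps.
    """
    # 1. Resolve conflicts: a pair with more than one reported value.
    for (subject, predicate), values in sorted(known.items()):
        if len(values) > 1:
            options = " or ".join(sorted(values))
            return (f"I heard different things about {subject}: "
                    f"{predicate} {options}. Which is right?")

    # 2. Fill gaps: an expected predicate with no value for a known subject.
    subjects = sorted({subject for subject, _ in known})
    for subject in subjects:
        for predicate in expected_predicates:
            if not known.get((subject, predicate)):
                return f"What can you tell me about {subject} and {predicate}?"

    # 3. Nothing left to ask.
    return None


# Hypothetical knowledge state with one conflict and one gap.
known = {
    ("bob", "lives_in"): {"Paris", "Amsterdam"},
    ("bob", "likes"): {"tea"},
}
question = next_question(known, ["lives_in", "likes", "works_at"])
```

In this sketch the conflicting `lives_in` values are raised before the missing `works_at` value. A fuller agent would also weigh certainty and the current speaker when choosing what to ask.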

Results and Implications

The research describes a functioning model of Leolani and demonstrates that it can engage in social interaction by registering conflicting perspectives and acting to resolve discrepancies or gaps in its knowledge. The dialogues presented in the appendices illustrate Leolani’s conversational abilities, in particular how it manages ambiguity and keeps a dialogue going by asking about what it does not yet know.

Implications and Future Developments

Practically, Leolani’s ability to navigate uncertain and contradictory information has significant implications for the development of socially intelligent robots that can more naturally integrate into human environments. Theoretically, the work is a step forward in AI research related to cognitive architectures emulating human social cognition.

As future work, the authors mention adding inferencing capabilities to enable more sophisticated reasoning and decision-making. Extending Leolani’s capacity for task-based dialogues could also make the robot more useful in practical applications.

By advancing human-robot interaction through the integration of a theory of mind, this research lays a foundation for more empathetic and socially aware AI systems that closely align with human communicative and cognitive processes.
