
Guidelines for creating man-machine multimodal interfaces (1901.10408v2)

Published 29 Jan 2019 in cs.CL and cs.HC

Abstract: Understanding the details of human multimodal interaction can elucidate the kind of information processing machines must perform to interact with humans. This article gives an overview of recent findings from Linguistics regarding the organization of conversation into turns, adjacency pairs, (dis)preferred responses, (self-)repairs, and so on. We also describe how multiple modalities of signs interfere with one another and modify meanings. We then propose an abstract algorithm describing how a machine can implement a double-feedback system that reproduces human-like face-to-face interaction by processing various signs: verbal, prosodic, facial expressions, gestures, and others. Multimodal face-to-face interactions enrich the exchange of information between agents, mainly because the agents are active all the time, emitting and interpreting signs simultaneously. This article does not propose an untested new computational model; rather, it translates findings from Linguistics into guidelines for the design of multimodal man-machine interfaces. The algorithm presented is drawn from Linguistics and describes how human face-to-face interactions work. The linguistic findings reported here are first steps towards the integration of multimodal communication. Some developers of interface designs continue to work on isolated models for interpreting text, grammar, gestures, and facial expressions, neglecting how these signs are interwoven. For linguists working on state-of-the-art multimodal integration, by contrast, interpreting modalities separately leads to an incomplete interpretation, if not a miscomprehension, of the information. The algorithm proposed herein is intended to guide man-machine interface designers who want to integrate multimodal components into face-to-face interactions that are as close as possible to those performed between humans.
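The abstract describes a double-feedback loop in which an agent emits and interprets multimodal signs simultaneously, fusing co-occurring modalities before interpreting them rather than handling each in isolation. The minimal Python sketch below illustrates only that loop shape; the paper presents its algorithm abstractly, and every class, method, and label here is invented for illustration.

```python
import queue


class MultimodalAgent:
    """Sketch of a double-feedback loop: the agent interprets and emits
    signs continuously instead of alternating strict speak/listen turns.
    All names and labels are hypothetical, not taken from the paper."""

    def __init__(self):
        self.incoming = queue.Queue()  # signs perceived from the interlocutor
        self.holding_turn = False      # whether the agent currently has the floor

    def perceive(self, modality, value):
        """Sensors (ASR, prosody tracker, face/gesture recognizers) push
        (modality, value) pairs here continuously."""
        self.incoming.put((modality, value))

    def fuse(self, signs):
        """Combine co-occurring signs into a single percept, since one
        modality can modify another's meaning (e.g. a smile softening an
        otherwise dispreferred verbal response)."""
        return dict(signs)

    def interpret(self, percept):
        """Placeholder: classify the fused percept as backchannel feedback,
        a turn-yielding cue, a repair initiation, a new turn, etc."""
        if not self.holding_turn and "verbal" in percept:
            return "interlocutor_turn"
        return "feedback"

    def emit(self, signs):
        """Placeholder: drive the interface's own output channels."""
        print("emitting:", signs)

    def step(self):
        """One pass of the loop: drain the current time window, fuse the
        modalities, interpret, and react. Feedback signs (nods, 'uh-huh')
        are emitted even while the other agent holds the turn."""
        window = []
        while not self.incoming.empty():
            window.append(self.incoming.get())
        if not window:
            return
        meaning = self.interpret(self.fuse(window))
        if meaning == "feedback":
            self.emit({"gestural": "nod"})
        else:
            self.emit({"verbal": "response", "facial": "attentive"})


# Example: a verbal sign and a facial sign arriving in the same window
# are fused before interpretation, rather than interpreted separately.
agent = MultimodalAgent()
agent.perceive("verbal", "ironic remark")
agent.perceive("facial", "smile")
agent.step()
```

The key structural point, reflecting the abstract's argument, is that `fuse` runs before `interpret`: interpreting each modality in isolation risks an incomplete or mistaken reading of the exchange.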

