Effects of Realism and Representation on Self-Embodied Avatars in Immersive Virtual Environments (2405.02672v1)

Published 4 May 2024 in cs.HC and cs.GR

Abstract: Virtual Reality (VR) has recently gained traction with many new and ever more affordable devices being released. The increase in popularity of this paradigm of interaction has given birth to new applications and has attracted casual consumers to experience VR. Providing a self-embodied representation (avatar) of users' full bodies inside shared virtual spaces can improve the VR experience and make it more engaging to both new and experienced users. This is especially important in fully immersive systems, where the equipment completely occludes the real world, making self-awareness problematic. Indeed, the feeling of presence of the user is highly influenced by their virtual representations, even though small flaws could lead to uncanny valley side-effects. Following previous research, we would like to assess whether using a third-person perspective could also benefit the VR experience, via an improved spatial awareness of the user's virtual surroundings. In this paper we investigate realism and perspective of self-embodied representation in VR setups in natural tasks, such as walking and avoiding obstacles. We compare both First and Third-Person perspectives with three different levels of realism in avatar representation. These range from a stylized abstract avatar, to a "realistic" mesh-based humanoid representation and a point-cloud rendering. The latter uses data captured via depth-sensors and mapped into a virtual self inside the Virtual Environment. We present a thorough evaluation and comparison of these different representations, describing a series of guidelines for self-embodied VR applications. The effects of the uncanny valley are also discussed in the context of navigation and reflex-based tasks.

Summary

The paper empirically studies how avatar realism and perspective affect user embodiment and task performance in immersive virtual reality.
First-person perspective improves precise task performance and reaction times but third-person perspective enhances spatial awareness.
While realistic avatars can increase embodiment, they risk uncanny valley effects and visual occlusion issues, whereas abstract avatars avoid these trade-offs.

Overview

The study rigorously investigates the impact of avatar realism and representational perspective in immersive VR settings, examining self-embodiment through naturalistic tasks such as locomotion and obstacle avoidance. The experimental design spans three avatar representations—abstract, realistic mesh, and point-cloud—with each deployed in both first-person perspective (1PP) and third-person perspective (3PP). Quantitative performance metrics (task completion time, collision incidence) and qualitative measures (sense of embodiment, spatial awareness, fatigue) are employed to elucidate the interaction between realism, visual occlusion, and the uncanny valley phenomenon.

Experimental Design and Methodology

The experimental paradigm consists of controlled navigation and reflex-based tasks within a fully occluded HMD environment. The study systematically contrasts:

Perspective Variation:
- First-Person Perspective (1PP): Direct immersion offering enhanced embodiment, critical for precise motor actions and immediate task responses.
- Third-Person Perspective (3PP): Elevated spatial awareness with the potential drawback of reduced proprioceptive fidelity and balance.
Avatar Representation:
- Abstract Avatars: Minimalist representations that avoid detailed human features, thereby circumventing uncanny valley effects.
- Realistic Mesh Avatars: High-fidelity mesh-based embodiments that may elicit uncanny valley responses, particularly in 3PP.
- Point-Cloud Representations: Direct, sensor-captured visualizations of the user’s form, offering high embodiment fidelity but prone to generating occlusion artifacts.

The study uses a crossed design in which each of the three avatar representations is paired with both perspectives, giving six conditions. Objective metrics for navigation efficiency (completion time) and safety (collision frequency) are recorded, and post-task questionnaires capture self-reported embodiment and perceived spatial orientation.
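As a rough illustration of the pairing described above, the six avatar/perspective conditions can be enumerated programmatically. The names `AVATARS`, `PERSPECTIVES`, `all_conditions`, and `rotated_order` are assumptions made for this sketch, and the cyclic rotation is only one possible way to vary presentation order; the summary does not state how the study ordered its conditions.

```python
from itertools import product

# Labels chosen for this sketch; the paper's own naming may differ.
AVATARS = ["abstract", "mesh", "point_cloud"]
PERSPECTIVES = ["1PP", "3PP"]


def all_conditions():
    """Return the 3 x 2 = 6 avatar/perspective pairings."""
    return list(product(AVATARS, PERSPECTIVES))


def rotated_order(participant_index):
    """Rotate the condition list so participants start at different conditions.

    Assumption: a simple cyclic rotation, used here only to illustrate
    order variation. It is not the procedure reported in the paper.
    """
    conditions = all_conditions()
    shift = participant_index % len(conditions)
    return conditions[shift:] + conditions[:shift]


if __name__ == "__main__":
    for participant in range(3):
        print(participant, rotated_order(participant))
```

Each participant still sees all six conditions; only the starting point changes.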

Key Findings and Quantitative Analysis

Perspective and Task Performance

1PP vs. 3PP:

1PP consistently yielded superior performance for tasks requiring precision and fast reflex responses. The enhanced embodiment in 1PP provided significantly lower collision counts and facilitated improved reaction times. Conversely, 3PP was associated with a reduction in proprioceptive integration, leading to notable balance issues and increased task-induced fatigue.

Numerical Trade-offs:

For instance, a 1PP configuration with abstract avatars produced minimal occlusion and fast task completion. The point-cloud representation in 1PP, despite its high fidelity, gained completion speed at the cost of increased collision frequency, indicating a trade-off between speed and accuracy.
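To make the two objective metrics concrete, the following sketch groups trial records by condition and averages completion time and collision count. The record layout, the function name `summarize`, and every number in `example_trials` are placeholders invented for illustration; they are not data from the paper.

```python
from collections import defaultdict
from statistics import mean


def summarize(trials):
    """Group trials by (avatar, perspective) and average both metrics."""
    grouped = defaultdict(list)
    for trial in trials:
        grouped[(trial["avatar"], trial["perspective"])].append(trial)
    return {
        condition: {
            "mean_time_s": mean(t["time_s"] for t in rows),
            "mean_collisions": mean(t["collisions"] for t in rows),
        }
        for condition, rows in grouped.items()
    }


# Placeholder values for illustration only; NOT results from the study.
example_trials = [
    {"avatar": "abstract", "perspective": "1PP", "time_s": 20.0, "collisions": 1},
    {"avatar": "abstract", "perspective": "1PP", "time_s": 22.0, "collisions": 0},
    {"avatar": "point_cloud", "perspective": "1PP", "time_s": 18.0, "collisions": 3},
]

if __name__ == "__main__":
    for condition, stats in summarize(example_trials).items():
        print(condition, stats)
```

Keeping time and collisions as separate columns, rather than merging them early, makes a speed-accuracy trade-off visible in the summary table.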

Impact of Avatar Realism

Abstract vs. Realistic Mesh:

Abstract avatars, by avoiding detailed human features, effectively mitigate uncanny valley effects, especially in 1PP. In contrast, realistic mesh avatars viewed in 3PP elicited a marked uncanny valley response from users, which was associated with reduced navigation efficiency.

Point-Cloud Dynamics:

The point-cloud representation generated higher embodiment scores in 3PP relative to mesh avatars, but also introduced significant occlusion challenges. The resulting occlusion created a visual bottleneck during obstacle avoidance, thereby compromising task precision despite a seemingly heightened sense of self-representation.

Uncanny Valley Phenomenon

The uncanny valley effect is observed predominantly when realistic mesh avatars are used in 3PP. It manifests as diminished task efficacy, with users reporting reduced confidence and showing observable delays in navigational tasks. In tasks where embodiment is critical, such as reflex-based tasks, the effect is more pronounced, with a statistically significant performance decrement relative to both abstract and point-cloud representations.

Practical Guidelines for VR Application Design

Drawing from the empirical evidence, several key design guidelines are proposed for implementing self-embodied representations in VR:

Perspective Selection:
- Employ a first-person perspective for applications where rapid reflex actions and precise proprioceptive feedback are paramount.
- Utilize a third-person perspective selectively in scenarios where enhanced horizontal spatial awareness is necessary, with caution regarding balance issues.
Avatar Choice and Realism:
- For tasks where occlusion could hinder performance, abstract avatars are preferred due to their minimalistic design.
- In use cases where a high sense of self-presence is critical and sensor data is reliably captured, point-cloud representations are advantageous, albeit with a mechanism to mitigate occlusion artifacts (e.g., dynamic filtering or selective rendering techniques).
- Avoid realistic mesh avatars in 3PP when the uncanny valley risks outweigh the benefits, unless enhancements that reduce perceptual dissonance can be integrated.
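The point-cloud guideline above mentions selective rendering as a way to reduce occlusion. One possible reading, sketched here under stated assumptions, is to skip drawing avatar points that lie close to the sight line between the viewer's eye and an obstacle of interest. The functions `distance_to_segment` and `cull_occluding_points`, and the `radius` parameter, are hypothetical; the summary does not describe how the authors would implement such filtering.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from 3D point p to the segment from a to b."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ap = tuple(pi - ai for ai, pi in zip(a, p))
    ab_len_sq = sum(c * c for c in ab)
    if ab_len_sq == 0:
        # Degenerate segment: eye and target coincide.
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = sum(x * y for x, y in zip(ap, ab)) / ab_len_sq
    t = max(0.0, min(1.0, t))
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return math.dist(p, closest)


def cull_occluding_points(points, eye, target, radius):
    """Keep only points farther than `radius` from the eye-to-target line."""
    return [p for p in points if distance_to_segment(p, eye, target) > radius]


if __name__ == "__main__":
    cloud = [(0, 0, 1), (1, 0, 1), (0, 0, 5)]
    print(cull_occluding_points(cloud, eye=(0, 0, 0), target=(0, 0, 2), radius=0.5))
```

A production renderer would more likely do this per frame on the GPU; the list-based version only shows the geometric test.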
Trade-Off Management:

A significant observation is the speed-versus-accuracy trade-off inherent in realistic representations. For example, while 1PP with point-cloud avatars may lead to faster task execution, the corresponding increase in collision incidents necessitates an adaptive system to balance these metrics dynamically.
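One simple way to reason about this trade-off is to fold both metrics into a single cost, charging a fixed time penalty per collision, and then prefer the configuration with the lowest cost. The functions `trial_cost` and `pick_configuration`, the `penalty_s` weight, and the numbers in the usage example are all assumptions for illustration; the summary does not specify how an adaptive system should weigh speed against collisions.

```python
def trial_cost(time_s, collisions, penalty_s=5.0):
    """Completion time plus a fixed time penalty per collision (assumed weight)."""
    return time_s + penalty_s * collisions


def pick_configuration(stats, penalty_s=5.0):
    """Return the configuration name with the lowest combined cost.

    `stats` maps a configuration name to (mean_time_s, mean_collisions).
    """
    return min(
        stats,
        key=lambda name: trial_cost(*stats[name], penalty_s=penalty_s),
    )


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT results from the study.
    stats = {
        "point_cloud_1PP": (18.0, 3.0),
        "abstract_1PP": (21.0, 0.5),
    }
    print(pick_configuration(stats, penalty_s=5.0))  # collisions matter
    print(pick_configuration(stats, penalty_s=0.0))  # only speed matters
```

Raising `penalty_s` shifts the choice toward safer configurations, while a value of zero ranks purely by speed.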

Fatigue and Balance Considerations:

The reported fatigue associated with point-cloud embodiments in 3PP suggests that VR systems should incorporate real-time monitoring of user strain. Strategies such as periodic perspective switching or adaptive rendering adjustments could be implemented to mitigate adverse effects on user experience.
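The perspective-switching idea can be sketched as a small controller that tracks a rolling average of a strain signal and changes perspective when that average crosses a threshold. The class `PerspectiveSwitcher`, the normalized strain value in [0, 1], and the `high` and `low` thresholds are hypothetical; the summary does not define how strain would be measured or when a switch should occur.

```python
from collections import deque


class PerspectiveSwitcher:
    """Switch between 3PP and 1PP based on a rolling average of strain."""

    def __init__(self, window=5, high=0.7, low=0.4):
        self.samples = deque(maxlen=window)  # most recent strain samples
        self.high = high  # leave 3PP when average strain rises above this
        self.low = low    # return to 3PP when average strain falls below this
        self.perspective = "3PP"

    def update(self, strain):
        """Record one strain sample in [0, 1] and return the perspective to use."""
        self.samples.append(strain)
        average = sum(self.samples) / len(self.samples)
        if self.perspective == "3PP" and average > self.high:
            self.perspective = "1PP"
        elif self.perspective == "1PP" and average < self.low:
            self.perspective = "3PP"
        return self.perspective


if __name__ == "__main__":
    switcher = PerspectiveSwitcher(window=2)
    for sample in [0.5, 1.0, 0.5, 0.1]:
        print(sample, switcher.update(sample))
```

The gap between the two thresholds acts as hysteresis, so the view does not flip back and forth when strain hovers near a single cut-off.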

Conclusion

This comprehensive study provides empirical insights into the nuanced interplay among avatar realism, representational perspective, and user embodiment in immersive virtual environments. The results underscore that while high-fidelity representations (point-cloud) can enhance subjective embodiment, they introduce occlusion issues and performance trade-offs. Conversely, abstract representations in a first-person perspective minimize these issues and optimize task performance, making them well-suited for environments where precise, reflex-based responses are critical. The articulated guidelines offer a pragmatic framework for VR application designers to systematically balance realism and performance, thereby enhancing the efficacy of self-embodied VR experiences in both recreational and professional domains.