Emergent Mind

Slot Structured World Models

(2402.03326)
Published Jan 8, 2024 in cs.CV and cs.LG

Abstract

The ability to perceive and reason about individual objects and their interactions is a goal to be achieved for building intelligent artificial systems. State-of-the-art approaches use a feedforward encoder to extract object embeddings and a latent graph neural network to model the interaction between these object embeddings. However, the feedforward encoder can not extract {\it object-centric} representations, nor can it disentangle multiple objects with similar appearance. To solve these issues, we introduce {\it Slot Structured World Models} (SSWM), a class of world models that combines an {\it object-centric} encoder (based on Slot Attention) with a latent graph-based dynamics model. We evaluate our method in the Spriteworld benchmark with simple rules of physical interaction, where Slot Structured World Models consistently outperform baselines on a range of (multi-step) prediction tasks with action-conditional object interactions. All code to reproduce paper experiments is available from \url{https://github.com/JonathanCollu/Slot-Structured-World-Models}.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.