Slot Structured World Models (2402.03326v1)

Published 8 Jan 2024 in cs.CV and cs.LG

Abstract: The ability to perceive and reason about individual objects and their interactions is a goal to be achieved for building intelligent artificial systems. State-of-the-art approaches use a feedforward encoder to extract object embeddings and a latent graph neural network to model the interaction between these object embeddings. However, the feedforward encoder can not extract {\it object-centric} representations, nor can it disentangle multiple objects with similar appearance. To solve these issues, we introduce {\it Slot Structured World Models} (SSWM), a class of world models that combines an {\it object-centric} encoder (based on Slot Attention) with a latent graph-based dynamics model. We evaluate our method in the Spriteworld benchmark with simple rules of physical interaction, where Slot Structured World Models consistently outperform baselines on a range of (multi-step) prediction tasks with action-conditional object interactions. All code to reproduce paper experiments is available from \url{https://github.com/JonathanCollu/Slot-Structured-World-Models}.

References (28)

Authors (4)

Jonathan Collu (2 papers)
Riccardo Majellaro (2 papers)
Aske Plaat (76 papers)
Thomas M. Moerland (24 papers)

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Tweets

https://twitter.com/gastronomy/status/1755098827957190956

Slot Structured World Models (2402.03326v1)

Summary

Related Papers

Tweets