Provable Compositional Generalization for Object-Centric Learning (2310.05327v2)
Abstract: Learning representations that generalize to novel compositions of known concepts is crucial for bridging the gap between human and machine perception. One prominent effort is learning object-centric representations, which are widely conjectured to enable compositional generalization. Yet, it remains unclear when this conjecture will be true, as a principled theoretical or empirical understanding of compositional generalization is lacking. In this work, we investigate when compositional generalization is guaranteed for object-centric representations through the lens of identifiability theory. We show that autoencoders that satisfy structural assumptions on the decoder and enforce encoder-decoder consistency will learn object-centric representations that provably generalize compositionally. We validate our theoretical result and highlight the practical relevance of our assumptions through experiments on synthetic image data.
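The two ingredients named in the abstract, a structurally constrained decoder and encoder-decoder consistency, can be made concrete with a short sketch. The snippet below is a minimal illustration rather than the paper's implementation: it assumes the structural assumption is an additive (slot-wise) decoder and that consistency is enforced by re-encoding images decoded from recombined slots; all module names, sizes, and the recombination scheme are hypothetical.

```python
# Minimal sketch (PyTorch) of an object-centric autoencoder with (a) an additive
# slot-wise decoder as the assumed structural constraint and (b) an encoder-decoder
# consistency penalty. Illustrative only; sizes and names are not from the paper.

import torch
import torch.nn as nn

K, D, IMG = 4, 16, 3 * 64 * 64  # number of slots, slot dimension, flattened image size

class SlotAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder maps an image to K slot vectors (the object-centric representation).
        self.encoder = nn.Sequential(nn.Linear(IMG, 256), nn.ReLU(), nn.Linear(256, K * D))
        # A single decoder is shared across slots; the image is the SUM of per-slot
        # renderings, i.e. an additive decoder (the structural assumption used here).
        self.slot_decoder = nn.Sequential(nn.Linear(D, 256), nn.ReLU(), nn.Linear(256, IMG))

    def encode(self, x):
        return self.encoder(x).view(-1, K, D)   # (B, K, D)

    def decode(self, slots):
        per_slot = self.slot_decoder(slots)     # (B, K, IMG)
        return per_slot.sum(dim=1)              # additive composition over slots

    def forward(self, x):
        slots = self.encode(x)
        return self.decode(slots), slots

def training_loss(model, x, consistency_weight=1.0):
    recon, slots = model(x)
    recon_loss = ((recon - x) ** 2).mean()
    # Encoder-decoder consistency: re-encoding the decoder's output should recover
    # the same slots. Applying it to slots recombined across the batch (an
    # illustrative choice) extends the constraint to novel slot compositions.
    shuffled = slots[torch.randperm(x.size(0))]
    mixed = torch.cat([slots[:, : K // 2], shuffled[:, K // 2 :]], dim=1)
    reencoded = model.encode(model.decode(mixed))
    consistency_loss = ((reencoded - mixed) ** 2).mean()
    return recon_loss + consistency_weight * consistency_loss

# Usage: x = torch.rand(8, IMG); loss = training_loss(SlotAutoencoder(), x); loss.backward()
```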