
Neurosymbolic AI: The 3rd Wave (2012.05876v2)

Published 10 Dec 2020 in cs.AI and cs.LG

Abstract: Current advances in AI and Machine Learning (ML) have achieved unprecedented impact across research communities and industry. Nevertheless, concerns about trust, safety, interpretability and accountability of AI were raised by influential thinkers. Many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and for sound explainability. Neural-symbolic computing has been an active area of research for many years seeking to bring together robust learning in neural networks with reasoning and explainability via symbolic representations for network models. In this paper, we relate recent and early research results in neurosymbolic AI with the objective of identifying the key ingredients of the next wave of AI systems. We focus on research that integrates in a principled way neural network-based learning with symbolic knowledge representation and logical reasoning. The insights provided by 20 years of neural-symbolic computing are shown to shed new light onto the increasingly prominent role of trust, safety, interpretability and accountability of AI. We also identify promising directions and challenges for the next decade of AI research from the perspective of neural-symbolic systems.

Authors (2)
  1. Artur d'Avila Garcez (29 papers)
  2. Luis C. Lamb (22 papers)
Citations (254)

Summary

  • The paper introduces neurosymbolic AI as the next evolution by integrating neural networks’ distributed representations with symbolic localist frameworks for enhanced robustness and explainability.
  • It details various integration methods, including incorporating symbolic rules into neural loss functions and deploying hybrid modular architectures to combine learning with reasoning.
  • The work outlines applications in language understanding, planning, and robotics while addressing challenges like scalability and differentiability in end-to-end AI systems.

This paper, "Neurosymbolic AI: The 3rd Wave" (Garcez et al., 2020), positions neurosymbolic AI as the essential next step in building robust, trustworthy, and explainable AI systems by integrating the strengths of deep learning (neural networks) and symbolic AI (knowledge representation and reasoning). The authors argue that while deep learning has achieved impressive practical results in perception and pattern recognition (often associated with Daniel Kahneman's "System 1" thinking), it suffers from limitations such as brittleness, lack of explainability, and poor generalization outside the training distribution. Symbolic AI, on the other hand, excels at reasoning, knowledge representation, and handling abstract concepts ("System 2" thinking), but traditionally struggles with learning from noisy data and with scalability. Neurosymbolic AI aims to bridge this gap.

Core Concepts and Practical Implications:

The paper highlights the fundamental difference in representation between the two paradigms:

  • Distributed Representations: In neural networks, concepts are represented by dense, continuous vectors. Gradient-based methods are highly effective for learning in this space, particularly from large amounts of data.
  • Localist Representations: In symbolic AI, concepts have discrete identifiers (symbols). This allows for precise manipulation, structured knowledge representation, and logical reasoning.
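To make the contrast concrete, here is a toy sketch (not from the paper); the vector and the fact set below are synthetic stand-ins for a learned embedding and a knowledge base:

```python
import numpy as np

# Distributed: "container" as a dense vector (a stand-in for a learned
# embedding); relatedness is graded, which suits gradient-based learning.
container_vec = np.random.randn(64)
cup_vec = container_vec + 0.1 * np.random.randn(64)   # a nearby concept
similarity = container_vec @ cup_vec / (
    np.linalg.norm(container_vec) * np.linalg.norm(cup_vec))
print(f"graded similarity: {similarity:.2f}")

# Localist: "container" as a discrete symbol; identity is exact, which
# supports precise manipulation and logical reasoning over facts.
facts = {("is_container", "cup"), ("is_container", "box")}
print(("is_container", "cup") in facts)   # True, crisply
```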

Neurosymbolic AI seeks to integrate these, enabling systems that can learn from data, reason with structured knowledge, and offer interpretability.

Forms of Neurosymbolic Integration:

The paper discusses various approaches to combining neural and symbolic methods, which can be placed on a spectrum from loosely coupled hybrid systems to tightly coupled integrative systems:

  1. Symbolic Knowledge Compilation: Symbolic knowledge (like rules or constraints) can be used to initialize or constrain the architecture and weights of a neural network (TYPE 4 systems). This injects prior knowledge into the learning process.
  2. Symbolic Knowledge in Loss Function: Logical statements can be translated into differentiable constraints or penalties added to the neural network's loss function. This approach, exemplified by Logic Tensor Networks (LTN) [79], allows first-order logic to influence gradient-based learning without requiring exact symbolic computation within the network (TYPE 5 systems).
    • Implementation: This often involves grounding logical concepts onto tensors and using many-valued logic interpretations (e.g., truth values in [0,1]) to make logical satisfaction differentiable. The system then minimizes a loss that includes terms pushing towards logical consistency (see the sketch after this list).
  3. Hybrid Systems: Neural networks and symbolic systems operate as distinct modules that communicate with each other (TYPE 2 and TYPE 3).
    • Implementation: A neural network might perform a perception task (e.g., object recognition), and its output (e.g., identified objects and their properties) is then fed as symbols or facts into a symbolic reasoning system (e.g., a logic program or planner) to perform higher-level reasoning (e.g., answering a question about the scene). DeepProbLog [50] is an example where neural network outputs can replace nodes in a probabilistic logic program's inference tree.
  4. Integrated Systems: Symbolic computation or reasoning mechanisms are built directly into the neural network architecture, often in a differentiable manner (TYPE 6). Examples include attempts at differentiable unification or theorem proving within networks [54, 70, 71].
    • Implementation: This is technically challenging, often involving complex architectures or specialized differentiable operations to mimic symbolic steps like variable binding or rule application.
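As a minimal sketch of the loss-function approach from item 2: two small networks output truth degrees for unary predicates P and Q, and a Reichenbach implication P(X) -> Q(X), averaged over a batch as a soft "forall", becomes a differentiable penalty. This is an illustration in the spirit of LTN, not its actual API; the networks and data are placeholders:

```python
import torch

# Placeholder networks outputting truth degrees in [0, 1] for two unary
# predicates P and Q over the same 4-dimensional inputs.
p_net = torch.nn.Sequential(torch.nn.Linear(4, 1), torch.nn.Sigmoid())
q_net = torch.nn.Sequential(torch.nn.Linear(4, 1), torch.nn.Sigmoid())

x = torch.randn(64, 4)            # a batch of groundings for variable X
p, q = p_net(x), q_net(x)

# Many-valued semantics: Reichenbach implication P(X) -> Q(X),
# aggregated with a mean as one common approximation of "forall X".
implication = 1.0 - p + p * q
forall = implication.mean()

# Logical loss: push the rule's truth degree towards 1. In practice this
# term is added to the ordinary supervised loss before backpropagation.
logic_loss = 1.0 - forall
logic_loss.backward()             # gradients flow into both networks
```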

The Neurosymbolic Cycle:

A key practical concept is the neurosymbolic cycle:

  1. Neural networks learn from data (often extracting features or relationships).
  2. Symbolic descriptions or rules are extracted from the trained network (knowledge extraction).
  3. Symbolic reasoning is performed on these descriptions, allowing for extrapolation and complex queries.
  4. The extracted symbolic knowledge can be used as prior knowledge or constraints to guide further neural learning.
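A purely illustrative skeleton of this cycle is below; every helper name (train, extract_rules, symbolic_infer, to_constraints) is hypothetical, since the paper describes the cycle at this level of abstraction rather than as an API:

```python
# Hypothetical skeleton of the neurosymbolic cycle; the four helpers
# below stand in for whole subsystems and are not defined here.
def neurosymbolic_cycle(model, data, rounds=3):
    constraints = []                         # no prior knowledge at first
    rules = []
    for _ in range(rounds):
        train(model, data, constraints)      # 1. gradient-based learning
        rules = extract_rules(model)         # 2. knowledge extraction
        facts = symbolic_infer(rules)        # 3. reasoning / extrapolation
        constraints = to_constraints(facts)  # 4. inject as priors
    return model, rules
```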

Variable Grounding and Symbol Manipulation:

The paper emphasizes the need for symbols to emerge from continuous, distributed representations. Once learned or identified, these symbols can be manipulated precisely and efficiently using symbolic methods. For example, a neural network might learn to identify "container" concepts from images. Once this concept is symbolized, it can be used in logical rules (is_container(X)) that are easier to reason with abstractly and to extrapolate from than raw image features.
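A toy sketch of this grounding step follows (assumed for illustration; the weights and embeddings are synthetic stand-ins for a trained perception model):

```python
import numpy as np

def is_container(embedding, w, threshold=0.5):
    """Hypothetical symbolization: a sigmoid score over a learned
    direction w is thresholded into a crisp symbolic predicate."""
    score = 1.0 / (1.0 + np.exp(-embedding @ w))
    return score > threshold

w = np.random.randn(64)                       # stand-in for learned weights
objects = {name: np.random.randn(64) for name in ("cup", "rock", "box")}

# Once symbolized, the concept slots into ordinary logical rules, e.g.
# "anything that is a container can hold water".
can_hold_water = {name for name, emb in objects.items()
                  if is_container(emb, w)}
print(can_hold_water)
```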

Practical Applications:

Neurosymbolic AI is particularly well-suited for applications requiring a combination of pattern recognition, reasoning, and interpretation:

  • Language Understanding and Question Answering: Combining neural networks for understanding text/speech with symbolic systems for logical inference and commonsense reasoning to answer complex questions [51].
  • Planning: Using neural networks for perception or learning system dynamics, and symbolic planners for generating sequences of actions based on logical rules and goals [29].
  • Knowledge Engineering: Completing knowledge bases or learning ontologies by combining neural models (like Graph Neural Networks [47] on knowledge graphs) with symbolic reasoning to ensure logical consistency and infer new facts.
  • Robotics: Integrating perception (neural) with task planning and reasoning about the environment (symbolic).

Explainable AI (XAI):

Knowledge extraction is presented as a crucial component of neurosymbolic XAI. By extracting symbolic rules or descriptions from a trained neural network [15, 90], the system's decision-making process becomes more transparent.

  • Implementation: Algorithms for knowledge extraction often interrogate the trained network to find patterns or boundaries in its decision surface, translating them into logical rules (e.g., if feature A > threshold and feature B < threshold, then classify as C). A minimal sketch follows this list.
  • Considerations: The paper stresses the importance of fidelity – how accurately the extracted rules represent the neural network's behavior – as opposed to just producing seemingly plausible explanations that might not reflect the actual model [99]. High fidelity ensures the explanation is truly about the AI system.
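One simple, well-known instance of this idea (a pedagogical sketch, not the specific extraction algorithms cited in the paper) is to label fresh probe inputs with the network's own predictions and fit a shallow decision tree to them; the tree's branches act as extracted rules, and fidelity is measured as agreement with the network on held-out probes:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# A small network trained on a synthetic concept.
X = rng.random((1000, 2))
y = ((X[:, 0] > 0.5) & (X[:, 1] < 0.5)).astype(int)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(X, y)

# Interrogate the network: its predictions on probe points become the
# training labels for an interpretable surrogate.
X_probe = rng.random((5000, 2))
tree = DecisionTreeClassifier(max_depth=2).fit(X_probe, net.predict(X_probe))

# Fidelity: agreement with the network (not with ground truth) on fresh data.
X_eval = rng.random((2000, 2))
fidelity = (tree.predict(X_eval) == net.predict(X_eval)).mean()

print(export_text(tree, feature_names=["feature_A", "feature_B"]))
print(f"fidelity to the network: {fidelity:.3f}")
```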

Challenges and Implementation Considerations:

The paper identifies several key challenges for implementing neurosymbolic AI:

  • Integration Complexity: Choosing the right level and method of integration (tight vs. loose coupling) is application-dependent and non-trivial.
  • Bridging Representations: Effectively mapping between continuous, distributed representations and discrete, localist ones remains an active research area.
  • Scalability: Ensuring that symbolic reasoning components can scale to large, complex knowledge bases or high-dimensional feature spaces.
  • Differentiability: Making symbolic operations differentiable, so that end-to-end training with gradient descent is possible, is technically difficult for complex logical constructs (a common workaround is sketched after this list).
  • Knowledge Extraction Efficiency and Quality: Extracting compact, accurate, and complete symbolic knowledge from very large neural networks is challenging (Challenge 1).
  • Efficient Reasoning: Performing efficient commonsense and combinatorial reasoning about learned knowledge is required (Challenge 2).
  • Human Interaction: Designing effective communication protocols for users to query and understand the system's symbolic knowledge (Challenge 3).
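As one common workaround for the differentiability challenge (a generic relaxation technique, not one specifically prescribed by the paper), a hard symbolic test such as x > 0 can be replaced during training by a temperature-controlled sigmoid that approaches a step function as the temperature shrinks:

```python
import torch

def soft_predicate(x, temperature=0.1):
    # Approaches the hard test "x > 0" as temperature -> 0, but remains
    # differentiable, so gradients can flow through the symbolic step.
    return torch.sigmoid(x / temperature)

x = torch.tensor([-1.0, 0.2, 3.0], requires_grad=True)
truth = soft_predicate(x)     # graded truth values in (0, 1)
truth.sum().backward()        # backprop works through the relaxed test
print(truth, x.grad, sep="\n")
```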

A Practical Recipe (as proposed in the paper):

  1. Train neural networks using gradient descent on data for perception and pattern recognition.
  2. Perform efficient propositional reasoning within the network where possible (e.g., using neural-symbolic methods capable of propositional logic).
  3. Extract rich first-order logic descriptions from the trained network.
  4. Perform complex reasoning (like extrapolation or handling universal quantification) using a separate symbolic reasoning system on the extracted descriptions.
  5. Use the extracted and reasoned symbolic knowledge as constraints or prior information to guide further neural learning.

This practical approach leverages existing strengths of both paradigms, using neural networks for learning and grounding symbols, and symbolic systems for precise reasoning and extrapolation, facilitated by the extraction and injection of knowledge via a symbolic language like first-order logic. Modularity in neural architectures is seen as beneficial for enabling this cycle.
