Emergent Mind

Understanding Biology in the Age of Artificial Intelligence

Published Mar 6, 2024 in cs.AI


Modern life sciences research is increasingly relying on artificial intelligence approaches to model biological systems, primarily centered on machine learning (ML) models. Although ML is undeniably useful for identifying patterns in large, complex data sets, its widespread application in biological sciences represents a significant deviation from traditional methods of scientific inquiry. As such, the interplay between these models and scientific understanding in biology is a topic with important implications for the future of scientific research, yet it is a subject that has received little attention. Here, we draw from an epistemological toolkit to contextualize recent applications of ML in biological sciences under modern philosophical theories of understanding, identifying general principles that can guide the design and application of ML systems to model biological phenomena and advance scientific knowledge. We propose that conceptions of scientific understanding as information compression, qualitative intelligibility, and dependency relation modelling provide a useful framework for interpreting ML-mediated understanding of biological systems. Through a detailed analysis of two key application areas of ML in modern biological research - protein structure prediction and single cell RNA-sequencing - we explore how these features have thus far enabled ML systems to advance scientific understanding of their target phenomena, how they may guide the development of future ML models, and the key obstacles that still prevent ML from achieving its potential as a tool for biological discovery. Consideration of the epistemological features of ML applications in biology will improve the prospects of these methods for solving important problems and advancing scientific understanding of living systems.


  • The paper discusses the significant role of AI, especially Machine Learning (ML) models, in understanding complex biological systems, emphasizing a shift from traditional methodologies to data-driven approaches.

  • It explores the epistemological aspects of applying ML to biology, highlighting the need for an inductive approach due to the multidimensional and emergent properties of biological systems.

  • The success of ML in areas such as protein structure prediction with AlphaFold2 and in single-cell RNA sequencing (scRNA-seq) analysis is examined, showcasing the potential of AI to reveal new insights into biological processes.

  • The paper considers the future of AI in biology, focusing on epistemological foundations and the promise of AI to enhance scientific understanding and challenge existing conceptualizations.

Understanding Biology in the Age of Artificial Intelligence: A Formal Analysis


The application of AI, particularly Machine Learning (ML) models, has significantly enhanced our ability to model and understand complex biological systems. The increasing integration of AI into biological research marks a pivotal shift from traditional scientific methodologies towards more data-driven approaches. This trend presents both opportunities and challenges in enhancing scientific understanding and necessitates a careful examination of the epistemological underpinnings and implications of leveraging AI for biological discoveries.

Scientific Understanding and ML Models

Understanding complex biological phenomena through AI entails a departure from traditional deductive-nomological models of explanation. Unlike the law-based explanations prevalent in physics, biological systems exhibit multidimensionality, conditionality, and emergence, which inherently resist reductive explanation. These features necessitate an inductive approach, where patterns and dependencies within large datasets are identified. The epistemological aspects of ML applications in biology, framed within theories of information compression, qualitative intelligibility, and dependency relation modeling, provide a valuable perspective for assessing the contribution of AI to biological sciences.

Machine Learning in Protein Structure Prediction

In protein structure prediction, the success of the AlphaFold2 system highlights the potential of ML approaches. AlphaFold2 uses deep neural networks to predict a protein's three-dimensional structure from its amino acid sequence with high accuracy. The system demonstrates what could be considered a form of understanding of protein structure, through its capacity to model crucial dependency relations and to compress relevant information. Its implications include practical benefits for drug discovery and theoretical advances in our comprehension of protein folding, suggesting a promising avenue for future AI-driven biological discovery.

Machine Learning in Single-cell RNA Sequencing

The application of ML models in single-cell RNA sequencing (scRNA-seq) analysis exemplifies another critical area where AI is reshaping our understanding of biological systems. Through techniques encompassing dimensionality reduction, clustering, and trajectory inference, ML models facilitate the exploration of cellular heterogeneity and dynamic biological processes at an unprecedented scale and depth. These models embody the principles of information compression and dependency relation modeling, enabling novel insights into cellular functions, states, and transitions. Nevertheless, the interpretation of ML models in scRNA-seq poses challenges, underscoring the importance of aligning model assumptions with biological realities to ensure meaningful contributions to scientific knowledge.
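To make the compression idea concrete, the following is a minimal sketch of two of the steps named above, dimensionality reduction and clustering, run on simulated data. It is not the pipeline used in the paper or in any particular scRNA-seq toolkit: the function names, the Poisson count model, the marker-gene layout, and all parameter values are assumptions chosen for illustration, and PCA plus k-means stand in for the more specialized methods used in practice.

```python
import numpy as np


def simulate_counts(n_cells_per_type=50, n_genes=200, seed=0):
    """Simulate a toy count matrix with two hypothetical cell types."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(3.0, 8.0, size=n_genes)  # assumed baseline expression rates
    rates_a = base.copy()
    rates_b = base.copy()
    rates_a[:20] *= 8.0    # assumed marker genes for type A
    rates_b[20:40] *= 8.0  # assumed marker genes for type B
    counts = np.vstack([
        rng.poisson(rates_a, size=(n_cells_per_type, n_genes)),
        rng.poisson(rates_b, size=(n_cells_per_type, n_genes)),
    ])
    labels = np.repeat([0, 1], n_cells_per_type)
    return counts, labels


def log_normalize(counts, target_sum=1e4):
    """Scale each cell to the same total count, then apply log(1 + x)."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * target_sum)


def pca(x, n_components=2):
    """Project cells onto the top principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def kmeans(x, k=2, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [x[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        new_centers = np.array([
            x[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign


if __name__ == "__main__":
    counts, labels = simulate_counts()
    embedding = pca(log_normalize(counts), n_components=2)  # 200 genes -> 2 axes
    clusters = kmeans(embedding, k=2)
    # Cluster ids are arbitrary, so score agreement up to relabeling.
    agreement = max(np.mean(clusters == labels), np.mean(clusters != labels))
    print(f"agreement with simulated cell types: {agreement:.2f}")
```

In this sketch, 200 gene measurements per cell are compressed to two coordinates that still separate the simulated cell types. That separation depends on the simulation matching the assumptions built into PCA and k-means, which is the kind of alignment between model assumptions and biological reality that the paragraph above cautions about.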

Epistemological Considerations and Future Directions

As AI continues to establish itself as an integral tool for biological research, reflecting on the epistemological foundations of how these models contribute to scientific understanding is paramount. The frameworks of information compression, qualitative intelligibility, and dependency relation modeling offer valuable lenses through which to evaluate the efficacy and limitations of AI in advancing our knowledge of complex biological systems. Looking forward, the iterative co-evolution of AI technologies and biological sciences promises not only to enhance our understanding of life but also to challenge and refine our conceptualizations of understanding itself within the realm of scientific inquiry.

In summary, the integration of AI into biological research marks a substantial shift in how discoveries are made. By connecting data-driven models with established epistemological accounts of understanding, AI has the potential to clarify how living systems work and to extend what can be known about them.
