
Tx-LLM: A Large Language Model for Therapeutics (2406.06316v1)

Published 10 Jun 2024 in cs.CL, cs.AI, cs.CE, and cs.LG

Abstract: Developing therapeutics is a lengthy and expensive process that requires the satisfaction of many different criteria, and AI models capable of expediting the process would be invaluable. However, the majority of current AI approaches address only a narrowly defined set of tasks, often circumscribed within a particular domain. To bridge this gap, we introduce Tx-LLM, a generalist LLM fine-tuned from PaLM-2 which encodes knowledge about diverse therapeutic modalities. Tx-LLM is trained using a collection of 709 datasets that target 66 tasks spanning various stages of the drug discovery pipeline. Using a single set of weights, Tx-LLM simultaneously processes a wide variety of chemical or biological entities (small molecules, proteins, nucleic acids, cell lines, diseases) interleaved with free-text, allowing it to predict a broad range of associated properties, achieving performance competitive with state-of-the-art (SOTA) on 43 out of 66 tasks and exceeding SOTA on 22. Among these, Tx-LLM is particularly powerful and exceeds best-in-class performance on average for tasks combining molecular SMILES representations with text such as cell line names or disease names, likely due to context learned during pretraining. We observe evidence of positive transfer between tasks with diverse drug types (e.g., tasks involving small molecules and tasks involving proteins), and we study the impact of model size, domain finetuning, and prompting strategies on performance. We believe Tx-LLM represents an important step towards LLMs encoding biochemical knowledge and could have a future role as an end-to-end tool across the drug discovery and development pipeline.

Citations (6)

Summary

  • The paper presents Tx-LLM, which fine-tunes PaLM-2 on 709 datasets across 66 therapeutic tasks to advance drug discovery and development.
  • It employs a mix of zero-shot and few-shot learning techniques using SMILES strings, amino acid sequences, and text inputs for diverse data integration.
  • Tx-LLM achieves competitive to superior performance on 43 tasks, surpassing state-of-the-art benchmarks on 22 tasks and demonstrating positive cross-domain transfer.

Tx-LLM: A Large Language Model for Therapeutics

Introduction

The paper introduces Tx-LLM, an innovative approach towards enhancing the drug discovery and development pipeline by leveraging LLMs. Tx-LLM is a fine-tuned variant of PaLM-2, designed to encode knowledge across diverse therapeutic modalities. The model is trained on a collection of 709 datasets covering 66 tasks, which span various stages of drug discovery and development, offering a generalist model capable of addressing a wide array of tasks with a single set of weights.

Figure 1: Overview of Tx-LLM, showcasing the integration of diverse datasets and the training approach using TxT.

Methodology

Datasets

Tx-LLM leverages the Therapeutics Data Commons (TDC) to curate a comprehensive dataset collection, TxT, which includes information on small molecules, proteins, nucleic acids, cell lines, and diseases. These datasets are formatted with specific instructions, context, questions, and answers to facilitate the fine-tuning process. Tasks are classified into binary classification, regression, and generation categories, utilizing representations such as SMILES strings and amino acid sequences.
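
The paper does not reproduce its exact prompt templates, so the following is a minimal sketch of how a TDC-style record might be rendered into the instruction/context/question/answer format described above; the field names, wording, and example record (including its label) are illustrative assumptions, not data from the paper.

```python
# Hypothetical sketch: rendering a TDC-style record into the
# instruction/context/question/answer format described for TxT.
# Field names, wording, and the example record are illustrative placeholders.

def format_txt_prompt(record: dict) -> str:
    """Assemble one training example from a task description and a data record."""
    parts = [
        f"Instructions: {record['instruction']}",
        f"Context: {record['context']}",
        f"Question: {record['question']}",
        f"Answer: {record['answer']}",
    ]
    return "\n".join(parts)

example = {
    "instruction": "Answer the question about the following molecule.",
    "context": "Orally absorbed drugs must cross the Caco-2 intestinal cell barrier.",
    "question": "Is the molecule with SMILES CC(=O)OC1=CC=CC=C1C(=O)O permeable "
                "to Caco-2 cells? Answer yes or no.",
    "answer": "yes",  # placeholder label, not a real TDC annotation
}

print(format_txt_prompt(example))
```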

Modeling and Training

Training of Tx-LLM begins with fine-tuning the PaLM-2 model on TxT. The training process involves a mixture of zero-shot and few-shot prompting, with prompts containing varying numbers of shots selected either randomly or as nearest-neighbor datapoints. Performance is evaluated against state-of-the-art (SOTA) metrics for each dataset to gauge the model's efficacy.
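
As a rough illustration of the nearest-neighbor variant, the sketch below selects few-shot examples by molecular similarity before assembling a prompt. The use of RDKit Morgan fingerprints and Tanimoto similarity is an assumption made here for illustration; the paper states only that shots were chosen randomly or by nearest neighbor, not how similarity was computed.

```python
# Hypothetical sketch of few-shot prompt construction with nearest-neighbor
# example selection. RDKit fingerprints and Tanimoto similarity are assumptions,
# not details confirmed by the paper.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem


def fingerprint(smiles: str):
    """Morgan fingerprint for a SMILES string (None if the SMILES is invalid)."""
    mol = Chem.MolFromSmiles(smiles)
    return None if mol is None else AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=1024)


def nearest_neighbor_shots(query_smiles: str, train_examples: list[dict], k: int = 3) -> list[dict]:
    """Return the k training examples whose molecules are most similar to the query."""
    query_fp = fingerprint(query_smiles)
    if query_fp is None:
        return train_examples[:k]  # fall back if the query SMILES cannot be parsed
    scored = []
    for ex in train_examples:
        fp = fingerprint(ex["smiles"])
        if fp is not None:
            scored.append((DataStructs.TanimotoSimilarity(query_fp, fp), ex))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in scored[:k]]


def build_prompt(query_smiles: str, question: str, shots: list[dict]) -> str:
    """Prepend the selected shots as worked examples before the actual question."""
    lines = [f"Question: {ex['question']}\nAnswer: {ex['answer']}\n" for ex in shots]
    lines.append(f"Question: {question.format(smiles=query_smiles)}\nAnswer:")
    return "\n".join(lines)
```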

Results

Tx-LLM demonstrated competitive to superior performance on 43 out of 66 tasks, surpassing SOTA on 22 tasks. The model particularly excels on datasets where molecular SMILES representations are combined with textual information, leveraging the context learned during pretraining.

Figure 2: Comparison of Tx-LLM's performance against SOTA, illustrating its capabilities across multiple datasets.

Positive transfer between datasets of diverse drug types was evident, as training on broader datasets yielded improvements on molecular-specific tasks. The model also benefits from scaling, with larger variants outperforming smaller ones, and domain-specific fine-tuning contributing positively to the results.

Figure 3: Demonstrates evidence of positive transfer across datasets with diverse drug types.

Discussion

The implications of Tx-LLM are profound, offering a unified model that can address a multitude of tasks across the therapeutic development pipeline. The positive transfer observed suggests that LLMs can serve as generalist models proficient in handling diverse biochemical data forms. However, challenges remain, particularly in representing small-molecule data as strings and in guarding against data contamination.

Tx-LLM's design and training methodology provide valuable insights into the potential for LLMs in drug development, particularly when equipped with carefully curated and diverse datasets. The model's ability to leverage text and sequence data effectively positions it as a promising tool for end-to-end applications in therapeutic development.

Conclusion

Tx-LLM represents a significant advancement in the use of AI for therapeutic development, illustrating the potential for LLMs to streamline various stages of the drug discovery pipeline. With further development and validation, Tx-LLM could play a crucial role in reducing the time and cost associated with bringing new therapeutics to market. The model's success underscores the importance of integrating diverse data types and leveraging LLMs' capabilities in understanding and predicting complex biochemical interactions.


Explain it Like I'm 14

Plain-language summary of “Tx-LLM: A Large Language Model for Therapeutics”

What is this paper about?

The paper introduces Tx-LLM, a large AI model designed to help with drug discovery. Instead of building lots of small, separate tools for each step in finding and testing new medicines, Tx-LLM is one general model that can handle many different tasks using the same set of learned skills.

What questions did the researchers ask?

In simple terms, they wanted to know:

  • Can one big AI model learn enough about chemistry and biology to help with many steps in drug development?
  • Will training it on many kinds of data (like drug molecules, protein sequences, and disease names) make it better at each individual task?
  • How do things like model size, the way it’s trained, and how we ask it questions affect its performance?

How did they try to answer these questions?

They built a training collection called TxT using 709 datasets that cover 66 different tasks from across the drug discovery pipeline (for example, predicting whether a drug is toxic, whether it binds to a target, how well it might work, and even whether a clinical trial could succeed).

To make this understandable to an LLM, they turned all inputs into text:

  • Small molecules were written as SMILES strings, which are like special “codes” that describe a molecule.
  • Proteins and genes were written as their letter sequences (amino acids or nucleotides).
  • Extra details (like disease names or cell line descriptions) stayed as regular text.

They then fine-tuned an existing LLM (PaLM‑2) so it could read prompts that look a bit like short quiz questions; a made-up example appears after the list below. Each prompt had:

  • An instruction (what to do),
  • A short context (why this matters),
  • A question (the task itself, mixing text with molecule or protein strings),
  • And an answer (for example, yes/no, a number, or generated text).
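
Here is a made-up example of what such a four-part prompt might look like. The wording and the answer label are placeholders invented for illustration, not text or data taken from the paper or from TDC.

```python
# A made-up example of the four-part prompt structure described above.
# The wording and the "(B)" label are illustrative placeholders, not real TDC data.
example_prompt = """\
Instructions: Answer the following question about a small-molecule drug.
Context: Drugs that cross the blood-brain barrier can act directly on the central nervous system.
Question: Does the molecule with SMILES CC(=O)OC1=CC=CC=C1C(=O)O cross the blood-brain barrier? Answer (A) yes or (B) no.
Answer: (B)"""

print(example_prompt)
```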

Some tasks were:

  • Classification (yes/no, like “Is this drug toxic?”),
  • Regression (a number, like “How strong is the binding?”; they turned numbers into “bins” from 000 to 1000 so the model could predict a label; see the sketch after this list),
  • Generation (like predicting reactants from a product in a chemical reaction).
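
The binning mentioned in the regression bullet can be pictured with a small sketch. Min-max normalization over the training range is an assumption made here for illustration; the paper's exact scaling procedure may differ.

```python
# Minimal sketch of turning a continuous label into a bin token such as "567",
# as described for the regression tasks. Min-max scaling over the training split
# is an assumption; the paper's exact normalization may differ.

def to_bin_label(value: float, train_min: float, train_max: float, n_bins: int = 1000) -> str:
    """Map a raw value to an integer bin in [0, n_bins] and format it as a fixed-width token."""
    normalized = (value - train_min) / (train_max - train_min)  # scale to [0, 1]
    normalized = min(max(normalized, 0.0), 1.0)                 # clamp values outside the training range
    return f"{round(normalized * n_bins):03d}"                  # e.g. 0.567 -> "567"

# Example: a binding score of 7.4 on a scale observed to range from 4.0 to 10.0
print(to_bin_label(7.4, train_min=4.0, train_max=10.0))  # -> "567"
```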

They tested different ways of asking the model questions (with zero examples or a few examples included in the prompt), compared different model sizes, and checked how much domain-specific training helped.

What did they discover?

Here are the main results:

  • Strong overall performance: Out of 66 tasks, Tx-LLM matched or beat the best published results on 43. It set new best results on 22 of them, all while using one single model for everything.
  • Best when mixing molecules with text: Tasks that combined a molecule (SMILES) plus text (like disease or cell line names) worked especially well. This likely benefits from the model’s general language knowledge learned before fine-tuning.
  • Learning transfers across types: Training on very different drug types (like proteins and small molecules) helped the model do better overall. In other words, learning about protein sequences also improved predictions about small molecules—this is called positive transfer.
  • Bigger and more specialized training helps: Larger model versions did better, and fine-tuning on therapeutic data clearly improved results compared to the base (unadapted) model.
  • Context matters: Adding helpful background sentences in the prompt improved performance for most tasks.
  • Few-shot vs zero-shot: Adding a few examples to the prompt didn’t consistently beat zero examples; both worked similarly overall.
  • Limits: On tasks that use only SMILES strings (no extra text), some specialist models that represent molecules as graphs still did better. Also, LLMs can sometimes write invalid SMILES strings, which is like making up impossible molecules.
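
Regarding the invalid-SMILES issue in the last bullet, a common sanity check (standard cheminformatics practice, not a step described in the paper) is to try parsing each generated string with RDKit and discard the ones that fail:

```python
# One common way to filter out invalid SMILES from generated text, using RDKit.
# This check is ordinary cheminformatics practice, not a step described in the paper.
from rdkit import Chem


def is_valid_smiles(smiles: str) -> bool:
    """Return True if RDKit can parse the SMILES into a molecule."""
    return Chem.MolFromSmiles(smiles) is not None


print(is_valid_smiles("CC(=O)OC1=CC=CC=C1C(=O)O"))  # aspirin -> True
print(is_valid_smiles("C1CC"))                      # unclosed ring -> False
```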

Why is this important?

If one AI model can handle many different steps in drug discovery, it could help scientists move faster and spend less money by:

  • Screening out weak or risky drug candidates earlier,
  • Prioritizing which experiments to run in the lab,
  • Linking together knowledge about chemistry, biology, and diseases in one place.

That said, Tx-LLM is still research. It doesn’t replace real experiments, and it isn’t perfect on every task. It also wasn’t tuned to explain its answers in natural language, and care is needed to avoid mistakes like invalid molecule strings. Still, this work is a promising step toward an “all-in-one” AI assistant that supports many stages of medicine development—from picking targets to predicting clinical trial success.
