Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 71 tok/s
Gemini 2.5 Pro 52 tok/s Pro
GPT-5 Medium 18 tok/s Pro
GPT-5 High 15 tok/s Pro
GPT-4o 101 tok/s Pro
Kimi K2 196 tok/s Pro
GPT OSS 120B 467 tok/s Pro
Claude Sonnet 4 37 tok/s Pro
2000 character limit reached

Atom-by-atom protein generation and beyond with language models (2308.09482v1)

Published 16 Aug 2023 in q-bio.BM and cs.LG

Abstract: Protein LLMs learn powerful representations directly from sequences of amino acids. However, they are constrained to generate proteins with only the set of amino acids represented in their vocabulary. In contrast, chemical LLMs learn atom-level representations of smaller molecules that include every atom, bond, and ring. In this work, we show that chemical LLMs can learn atom-level representations of proteins enabling protein generation unconstrained to the standard genetic code and far beyond it. In doing so, we show that LLMs can generate entire proteins atom by atom -- effectively learning the multiple hierarchical layers of molecular information that define proteins from their primary sequence to their secondary, and tertiary structure. We demonstrate LLMs are able to explore beyond protein space -- generating proteins with modified sidechains that form unnatural amino acids. Even further, we find that LLMs can explore chemical space and protein space simultaneously and generate novel examples of protein-drug conjugates. The results demonstrate the potential for biomolecular design at the atom level using LLMs.

Citations (1)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.