HelixFold-Multimer: Elevating Protein Complex Structure Prediction to New Heights (2404.10260v2)
Abstract: While monomer protein structure prediction tools achieve impressive accuracy, predicting protein complex structures remains a daunting challenge. The challenge is particularly pronounced for complexes whose chains come from different species, such as antigen-antibody interactions, where accuracy often falls short. Because complex prediction is not yet accurate enough, downstream tasks that rely on precise protein-protein interaction analysis are also hindered. In this report, we highlight the ongoing advancements of our protein complex structure prediction model, HelixFold-Multimer, underscoring its enhanced performance. HelixFold-Multimer provides precise predictions for diverse protein complex structures, especially for therapeutic protein interactions. Notably, it achieves remarkable success in antigen-antibody and peptide-protein structure prediction, greatly surpassing AlphaFold 3. HelixFold-Multimer is now publicly available on the PaddleHelix platform in both a general version and an antigen-antibody version, which researchers can access for their own development needs.