
CrysFormer: Protein Structure Prediction via 3d Patterson Maps and Partial Structure Attention (2310.03899v1)

Published 5 Oct 2023 in cs.LG

Abstract: Determining the structure of a protein has been a decades-long open question. A protein's three-dimensional structure often poses nontrivial computation costs, when classical simulation algorithms are utilized. Advances in the transformer neural network architecture -- such as AlphaFold2 -- achieve significant improvements for this problem, by learning from a large dataset of sequence information and corresponding protein structures. Yet, such methods only focus on sequence information; other available prior knowledge, such as protein crystallography and partial structure of amino acids, could be potentially utilized. To the best of our knowledge, we propose the first transformer-based model that directly utilizes protein crystallography and partial structure information to predict the electron density maps of proteins. Via two new datasets of peptide fragments (2-residue and 15-residue), we demonstrate our method, dubbed CrysFormer, can achieve accurate predictions, based on a much smaller dataset size and with reduced computation costs.
