GPT-4 as an interface between researchers and computational software: improving usability and reproducibility (2310.11458v1)
Abstract: LLMs are playing an increasingly important role in science and engineering. For example, their ability to parse and understand human and computer languages makes them powerful interpreters, and their use in applications like code generation is well documented. We explore the ability of the GPT-4 LLM to ameliorate two major challenges in computational materials science: i) the high barriers to adoption of scientific software associated with the use of custom input languages, and ii) the poor reproducibility of published results due to insufficient detail in the description of simulation methods. We focus on a widely used software package for molecular dynamics simulations, the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and quantify both the usefulness of input files generated by GPT-4 from task descriptions in English and its ability to generate detailed descriptions of computational tasks from input files. We find that GPT-4 can generate correct and ready-to-use input files for relatively simple tasks and useful starting points for more complex, multi-step simulations. In addition, GPT-4's description of computational tasks from input files can be tuned from a detailed set of step-by-step instructions to a summary description appropriate for publications. Our results show that GPT-4 can reduce the number of routine tasks performed by researchers, accelerate the training of new users, and enhance reproducibility.
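To make the first workflow concrete, the following is a minimal sketch of how an English task description might be wrapped into a request for a LAMMPS input file, and how a returned script could be given a first sanity check before running it. It is not taken from the paper: `build_prompt`, `missing_commands`, the prompt wording, and the list of required commands are all hypothetical, and `example_script` is a hand-written illustration, not GPT-4 output. The check only confirms that a few commands a basic molecular dynamics run typically needs (`units`, `atom_style`, `pair_style`, `run`) appear somewhere in the script; it says nothing about physical correctness.

```python
# Hypothetical helpers for illustration only; not the authors' method.

# Assumption: commands a simple MD input script is expected to contain.
REQUIRED_COMMANDS = ("units", "atom_style", "pair_style", "run")


def build_prompt(task_description: str) -> str:
    """Wrap an English task description in a request for a LAMMPS input file."""
    return (
        "Write a complete LAMMPS input file for the following task. "
        "Return only the input file.\n\n"
        f"Task: {task_description.strip()}"
    )


def missing_commands(input_script: str, required=REQUIRED_COMMANDS) -> list:
    """Return the required commands that never start a non-comment line."""
    seen = set()
    for line in input_script.splitlines():
        code = line.split("#", 1)[0].strip()  # drop trailing comments
        if code:
            seen.add(code.split()[0])  # first word is the command name
    return [cmd for cmd in required if cmd not in seen]


# Hand-written illustrative script (not model output).
example_script = """
units metal
atom_style atomic
lattice fcc 3.615
region box block 0 5 0 5 0 5
create_box 1 box
create_atoms 1 box
pair_style eam
pair_coeff 1 1 Cu_u3.eam
velocity all create 300.0 12345
fix 1 all nvt temp 300.0 300.0 0.1
run 1000
"""

if __name__ == "__main__":
    print(build_prompt("Equilibrate fcc copper at 300 K in the NVT ensemble."))
    print("Missing commands:", missing_commands(example_script))
```

A check like this is only a coarse filter. A script that passes it may still fail in LAMMPS or describe the wrong simulation, so generated inputs should still be reviewed and run by the researcher.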
- Juan C. Verduzco
- Ethan Holbrook
- Alejandro Strachan