1.5 million materials narratives generated by chatbots (2308.13687v1)

Published 25 Aug 2023 in cond-mat.mtrl-sci and cs.CL

Abstract: The advent of AI has enabled a comprehensive exploration of materials for various applications. However, AI models often prioritize frequently encountered materials in the scientific literature, limiting the selection of suitable candidates based on inherent physical and chemical properties. To address this imbalance, we have generated a dataset of 1,494,017 natural language-material paragraphs based on combined OQMD, Materials Project, JARVIS, COD and AFLOW2 databases, which are dominated by ab initio calculations and tend to be much more evenly distributed on the periodic table. The generated text narratives were then polled and scored by both human experts and ChatGPT-4, based on three rubrics: technical accuracy, language and structure, and relevance and depth of content. The two sets of scores were similar, with human-scored depth of content lagging the most. The merger of multi-modality data sources and LLMs holds immense potential for AI frameworks to aid the exploration and discovery of solid-state materials for specific applications.
