
A New Approach Towards Autoformalization (2310.07957v3)

Published 12 Oct 2023 in cs.CL and cs.AI

Abstract: Verifying mathematical proofs is difficult, but the process can be automated with the assistance of a computer. Autoformalization is the task of automatically translating natural-language mathematics into a formal language that can be verified by a program. The task is challenging, especially for the higher-level mathematics found in research papers, which requires large amounts of background and context. In this paper, we propose an avenue towards tackling autoformalization for research-level mathematics by breaking the task into easier, more approachable subtasks: unlinked formalization (formalization with unlinked definitions and theorems), entity linking (linking to the proper theorems and definitions), and finally type adjustment, so that the statement passes the type checker. In addition, we present arXiv2Formal, a benchmark dataset for unlinked formalization consisting of 50 theorems sampled from papers on arXiv.org and formalized for the Lean theorem prover. We welcome contributions from the community to future versions of this dataset.
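The three-stage decomposition described in the abstract can be illustrated with a small hypothetical Lean sketch. The theorem, the placeholder name `IsPrime`, and the use of Mathlib's `Nat.Prime` and `Odd` are illustrative assumptions, not examples drawn from the paper's arXiv2Formal dataset:

```lean
-- Stage 1: unlinked formalization. The model emits well-formed Lean
-- syntax but leaves dependencies as opaque placeholders (`IsPrime`)
-- instead of linking them to library declarations.
axiom IsPrime : ℕ → Prop

theorem prime_gt_two_is_odd (p : ℕ) (hp : IsPrime p) (h2 : 2 < p) :
    Odd p := sorry

-- Stage 2: entity linking replaces the placeholder with the proper
-- library declaration (here, Mathlib's `Nat.Prime`).
-- Stage 3: type adjustment fixes any remaining coercion or elaboration
-- issues until the statement passes Lean's type checker:
--
-- theorem prime_gt_two_is_odd' (p : ℕ) (hp : Nat.Prime p)
--     (h2 : 2 < p) : Odd p := sorry
```

The point of the decomposition is that Stage 1 can be evaluated on its own: a generated statement can be syntactically correct and semantically faithful even while its references to definitions and theorems remain unresolved.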
