Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI. While this generative AI approach has produced impressive results, it heavily leans on human supervision. Even state-of-the-art AI models like ChatGPT depend on fine-tuning through human demonstrations, demanding extensive human input and domain expertise. This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation. To address this limitation, we propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data. Drawing inspiration from unsupervised reinforcement learning (RL) pretraining, EAI achieves exploration within the natural language space. We accomplish this by harnessing LLMs to assess the novelty of generated content. Our approach employs two key components: an actor that generates novel content following exploration principles and a critic that evaluates the generated content, offering critiques to guide the actor. Empirical evaluations demonstrate that EAI significantly boosts model performance on complex reasoning tasks, addressing the limitations of human-intensive supervision.
We're not able to analyze this paper right now due to high demand.
Please check back later (sorry!).
Generate a detailed summary of this paper with a premium account.
We ran into a problem analyzing this paper.
Anthropic. Introducing claude, 2023. https://www.anthropic.com/index/introducing-claude.
Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality. See https://vicuna.lmsys.org
Free dolly: Introducing the world’s first truly open instruction-tuned llm, 2023. https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm.
K. P. Murphy. Probabilistic Machine Learning: Advanced Topics. MIT Press, 2023. http://probml.github.io/book2.
Chatgpt: Optimizing language models for dialogue. OpenAI Blog, 2022. https://openai.com/blog/chatgpt.