Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 157 tok/s
Gemini 2.5 Pro 49 tok/s Pro
GPT-5 Medium 35 tok/s Pro
GPT-5 High 31 tok/s Pro
GPT-4o 97 tok/s Pro
Kimi K2 218 tok/s Pro
GPT OSS 120B 450 tok/s Pro
Claude Sonnet 4.5 35 tok/s Pro
2000 character limit reached

Prompt Generate Train (PGT): Few-shot Domain Adaption of Retrieval Augmented Generation Models for Open Book Question-Answering (2307.05915v2)

Published 12 Jul 2023 in cs.LG

Abstract: We propose a framework - Prompt, Generate, Train (PGT) - to efficiently develop a generative question-answering model for open-book question-answering over a proprietary collection of text documents. The framework adapts a retriever augmented generation (RAG) model to the target domain using supervised fine-tuning and reinforcement learning with synthetic feedback in a few-shot setting. This, we hypothesize, will yield an aligned, uncertainty calibrated model that is competitive with GPT-4 based in-context retrieval augmented generation in generating relevant answers at lower serving costs. The framework's synthetic generation pipeline will generate synthetic training data comprising <passage, question, answer> tuples using an open-source LLM and a novel consistency filtering scheme. The pipeline will be designed to generate both abstractive and extractive questions that span the entire corpus. The framework proposes to fine-tune a smaller RAG model comprising a dense retriever (ColBERTv2) and a smaller sized LLM on the synthetic dataset. In parallel, the framework will train a Reward model to score domain grounded answers higher than hallucinated answers using an a priori relevance ordering of synthetically assembled samples. In the next phase, the framework will align the RAG model with the target domain using reinforcement learning (Proximal Policy Optimization). This step may improve the RAG model's ability to generate grounded answers and ignore out of domain questions. In the final phase, the framework will calibrate the model's uncertainty for extractive question-answers.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.