
Abstract

In this paper, we present our finding that prepending a Task-Agnostic Prefix Prompt (TAPP) to the input improves the instruction-following ability of various LLMs during inference. TAPP differs from canonical prompts for LLMs in that it is a fixed prompt prepended to the beginning of every input, regardless of the target task, for zero-shot generalization. We observe that both base LLMs (i.e., not fine-tuned to follow instructions) and instruction-tuned models benefit from TAPP, yielding average improvements of 34.58% and 12.26%, respectively. This implies that the instruction-following ability of LLMs can be improved at inference time with a fixed prompt constructed with simple heuristics. We hypothesize that TAPP helps language models better estimate the output distribution by focusing more on the instruction of the target task during inference. In other words, this ability does not seem to be sufficiently activated not only in base LLMs but also in many instruction-fine-tuned LLMs. All experiments are reproducible from https://github.com/seonghyeonye/TAPP.
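At inference time, TAPP amounts to simple string concatenation: the same fixed prefix (a small set of cross-task demonstrations) is placed in front of every query before decoding, no matter which task the query belongs to. The minimal sketch below illustrates this with a generic HuggingFace causal LM; the model name, the prompt template, and the demonstration text are illustrative assumptions rather than the authors' actual prefix, which is available in the linked repository.

```python
# Minimal sketch of TAPP-style inference: one fixed, task-agnostic prefix is
# prepended to every query before generation. Demonstration contents below are
# hypothetical placeholders, not the authors' exact prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/gpt-j-6B"  # any base or instruction-tuned causal LM
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Task-agnostic prefix: kept identical across all target tasks.
TAPP = (
    "Definition: Classify the sentiment of the review as positive or negative.\n"
    "Input: The movie was a delight from start to finish.\n"
    "Output: positive\n\n"
    "Definition: Answer the question using the given passage.\n"
    "Input: Passage: The Nile flows north into the Mediterranean. "
    "Question: Which sea does the Nile flow into?\n"
    "Output: the Mediterranean\n\n"
)

def generate_with_tapp(instruction: str, task_input: str, max_new_tokens: int = 64) -> str:
    """Prepend the fixed TAPP to the target instruction/input and decode greedily."""
    prompt = f"{TAPP}Definition: {instruction}\nInput: {task_input}\nOutput:"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Return only the newly generated tokens (the model's answer).
    return tokenizer.decode(out[0][ids.shape[1]:], skip_special_tokens=True)
```

Because the prefix is fixed, it can be tokenized once and cached; the target task only changes the instruction and input appended after it.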

