UniICL: An Efficient Unified Framework Unifying Compression, Selection, and Generation (2405.17062v3)

Published 27 May 2024 in cs.CL

Abstract: In-context learning (ICL) enhances the reasoning abilities of LLMs by prepending a few demonstrations to the input, which motivates researchers to introduce more examples so the model has additional contextual information for generation. However, existing methods face a significant limitation: the context length grows excessively, imposing a heavy hardware burden. In addition, shallowly relevant examples selected by off-the-shelf tools hinder LLMs from capturing useful contextual information for generation. In this paper, we propose UniICL, a novel Unified ICL framework that unifies demonstration compression, demonstration selection, and final response generation. Furthermore, to boost inference efficiency, we design a tailored compression strategy that allows UniICL to cache compression results in a Demonstration Bank (DB), avoiding repeated compression of the same demonstration. Extensive out-of-domain evaluations demonstrate the advantages of UniICL in both effectiveness and efficiency.
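
The abstract's central efficiency idea, caching compressed demonstrations in a Demonstration Bank so the same example is never compressed twice, can be illustrated with a minimal sketch. The class below, the `compress_fn` placeholder, and the cosine-similarity selection are illustrative assumptions for exposition, not the paper's actual UniICL implementation.

```python
# Minimal sketch of a demonstration-bank cache (illustrative only):
# compressed representations are computed once per demonstration and reused
# for later selection, avoiding repeated compression of identical examples.
import hashlib
import numpy as np


class DemonstrationBank:
    def __init__(self, compress_fn):
        self._compress = compress_fn   # placeholder for an LLM-based compressor
        self._cache = {}               # demo hash -> compressed vector

    def get(self, demo: str) -> np.ndarray:
        key = hashlib.sha256(demo.encode("utf-8")).hexdigest()
        if key not in self._cache:     # compress only on a cache miss
            self._cache[key] = self._compress(demo)
        return self._cache[key]

    def select(self, query_vec: np.ndarray, demos: list[str], k: int = 4) -> list[str]:
        # Rank cached demonstration vectors by cosine similarity to the query.
        vecs = np.stack([self.get(d) for d in demos])
        sims = vecs @ query_vec / (
            np.linalg.norm(vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
        )
        top = np.argsort(-sims)[:k]
        return [demos[i] for i in top]


# Toy usage: a random "compressor" stands in for the real compression model.
rng = np.random.default_rng(0)
toy_compress = lambda text: rng.standard_normal(16)
bank = DemonstrationBank(toy_compress)
picked = bank.select(rng.standard_normal(16), ["demo A", "demo B", "demo C"], k=2)
```

The selected (already compressed) demonstrations would then be prepended to the query for generation; in the paper's framework, compression, selection, and generation are handled within one unified model rather than by the separate toy components shown here.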
