L-TUNING: Synchronized Label Tuning for Prompt and Prefix in LLMs (2402.01643v2)

Published 21 Dec 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Efficiently fine-tuning LLMs for specific tasks presents a considerable challenge in natural language processing. Traditional methods, like prompt or prefix tuning, typically rely on arbitrary tokens for training, leading to prolonged training times and generalized token use across various class labels. To address these issues, this paper introduces L-Tuning, an efficient fine-tuning approach designed for classification tasks within the Natural Language Inference (NLI) framework. Diverging from conventional methods, L-Tuning focuses on the fine-tuning of label tokens processed through a pre-trained LLM, thereby harnessing its pre-existing semantic knowledge. This technique not only improves the fine-tuning accuracy and efficiency but also facilitates the generation of distinct label embeddings for each class, enhancing the model's training nuance. Our experimental results indicate a significant improvement in training efficiency and classification accuracy with L-Tuning compared to traditional approaches, marking a promising advancement in fine-tuning LLMs for complex language tasks.
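The abstract describes the approach only at a high level. Below is a minimal sketch of what tuning label tokens through a frozen pre-trained model could look like, not the authors' implementation: the backbone choice (bert-base-uncased), the use of the [CLS] state, the class name LabelTuningClassifier, and the dot-product scoring of label embeddings are all illustrative assumptions.

```python
# Illustrative sketch of the label-tuning idea from the abstract (assumptions
# noted above; not the paper's exact method). A frozen encoder produces
# per-class "label embeddings" from the label text itself, and only those
# embeddings are trained for NLI-style classification.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class LabelTuningClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased",
                 labels=("entailment", "neutral", "contradiction")):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.encoder = AutoModel.from_pretrained(model_name)
        for p in self.encoder.parameters():      # backbone stays frozen
            p.requires_grad = False

        # Initialize one trainable embedding per class from the encoder's own
        # representation of the label text, reusing its semantic knowledge.
        with torch.no_grad():
            enc = self.tokenizer(list(labels), return_tensors="pt", padding=True)
            label_states = self.encoder(**enc).last_hidden_state[:, 0]  # [CLS]
        self.label_embeddings = nn.Parameter(label_states.clone())      # trainable

    def forward(self, premises, hypotheses):
        # Encode each (premise, hypothesis) pair with the frozen backbone.
        enc = self.tokenizer(premises, hypotheses, return_tensors="pt",
                             padding=True, truncation=True)
        cls = self.encoder(**enc).last_hidden_state[:, 0]               # (B, H)
        # Score each class against its distinct label embedding.
        return cls @ self.label_embeddings.T                            # (B, C)

# Usage: logits = LabelTuningClassifier()(["A man runs."], ["A person moves."])
# Train with cross-entropy; only `label_embeddings` receives gradients.
```

This sketch only captures the general idea of deriving distinct, trainable label representations from the pre-trained model rather than tuning arbitrary prompt or prefix tokens; the paper's actual prompt- and prefix-variant formulations may differ.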
