
Compressing Sentence Representation with maximum Coding Rate Reduction (2304.12674v1)

Published 25 Apr 2023 in cs.CL and cs.LG

Abstract: In most natural language inference problems, sentence representations are needed for semantic retrieval tasks. In recent years, pre-trained LLMs have been quite effective for computing such representations. These models produce high-dimensional sentence embeddings, and an evident performance gap between large and small models exists in practice. Hence, due to hardware limitations on space and time, there is a need to attain comparable results with the smaller model, which is usually a distilled version of the LLM. In this paper, we assess model distillation of the sentence representation model Sentence-BERT by augmenting the pre-trained distilled model with a projection layer additionally learned on the Maximum Coding Rate Reduction (MCR2) objective, a novel approach developed for general-purpose manifold clustering. We demonstrate that the new LLM, with reduced complexity and sentence embedding size, can achieve comparable results on semantic retrieval benchmarks.
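
The abstract describes learning a projection layer on top of a frozen distilled encoder with the MCR2 objective, which maximizes the rate reduction ΔR = R(Z) − R^c(Z | Π): the coding rate of all features minus the sum of per-class coding rates. Below is a minimal sketch of that idea under stated assumptions. The MCR2 formulas follow Yu et al. (2020); the 768-to-128 projection size, the single linear layer, and the use of a labeled toy batch are illustrative choices, not the paper's exact training setup.

```python
# Sketch: projection head on frozen Sentence-BERT embeddings, trained
# with the Maximal Coding Rate Reduction (MCR^2) objective.
# Assumptions (not from the paper): dims, head architecture, toy labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z Z^T), with Z of shape (d, n)."""
    d, n = Z.shape
    I = torch.eye(d, device=Z.device)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * Z @ Z.T)

def mcr2_loss(Z: torch.Tensor, labels: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Negative rate reduction -(R(Z) - R_c(Z | labels)).

    Minimizing this maximizes ΔR: features from different classes are
    pushed apart (expansion) while features within a class are
    compressed toward a low-dimensional subspace.
    """
    d, n = Z.shape
    expansion = coding_rate(Z, eps)
    compression = Z.new_zeros(())
    for c in labels.unique():
        Zc = Z[:, labels == c]          # columns belonging to class c
        nc = Zc.shape[1]
        compression = compression + (nc / (2 * n)) * torch.logdet(
            torch.eye(d, device=Z.device) + (d / (nc * eps ** 2)) * Zc @ Zc.T
        )
    return -(expansion - compression)

class ProjectionHead(nn.Module):
    """Maps encoder embeddings (e.g. 768-d) to a smaller space (e.g. 128-d)."""
    def __init__(self, in_dim: int = 768, out_dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # MCR^2 assumes features on the unit sphere, so normalize.
        return F.normalize(self.proj(x), dim=-1)

# Toy training step; random tensors stand in for frozen encoder output.
head = ProjectionHead()
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
emb = torch.randn(64, 768)              # (batch, dim) distilled embeddings
lbl = torch.randint(0, 4, (64,))        # hypothetical cluster labels
loss = mcr2_loss(head(emb).T, lbl)      # loss expects (dim, batch)
loss.backward()
opt.step()
```

At inference time only the 128-d projected embedding would be stored and compared, which is where the claimed reduction in embedding size and retrieval cost comes from.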

Citations (1)
