
An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs (2205.10661v1)

Published 21 May 2022 in cs.CL and cs.AI

Abstract: Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of LLMs in zero-shot evaluation on various downstream language reasoning tasks. Since these improvements are reported in aggregate, however, little is known about (i) how to select the appropriate knowledge for solid performance across tasks, (ii) how to combine this knowledge with neural LLMs, and (iii) how these pairings affect granular task performance. In this paper, we study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting LLMs. We study the effect of different synthetic datasets on LLMs with various architectures and sizes. The resulting models are evaluated against four task properties: domain overlap, answer similarity, vocabulary overlap, and answer length. Our experiments show that encoder-decoder models benefit from more data to learn from, whereas sampling strategies that balance across different aspects yield the best performance. Most of the improvement occurs on questions with short answers and dissimilar answer candidates, which corresponds to the characteristics of the data used for pre-training.
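
To make the abstract's pipeline concrete, here is a minimal sketch of what "knowledge sampling strategies" for synthetic-data generation could look like: sampling (head, relation, tail) triples from a knowledge graph either uniformly or balanced across relation types, then verbalizing each triple into a synthetic multiple-choice example. All names here (the toy TRIPLES list, sample_random, sample_balanced, triple_to_qa) are hypothetical illustrations and not taken from the paper; the paper's actual sampling strategies and verbalization templates are not specified in the abstract.

```python
import random
from collections import defaultdict

# Hypothetical toy knowledge graph of (head, relation, tail) triples.
# In the paper's setting these would come from a large commonsense KG.
TRIPLES = [
    ("ice", "HasProperty", "cold"),
    ("oven", "UsedFor", "baking"),
    ("dog", "CapableOf", "bark"),
    ("rain", "Causes", "wet ground"),
    ("knife", "UsedFor", "cutting"),
    ("sun", "HasProperty", "hot"),
]

def sample_random(triples, k, seed=0):
    """Uniform random sampling over all triples."""
    rng = random.Random(seed)
    return rng.sample(triples, k)

def sample_balanced(triples, k, seed=0):
    """Sampling balanced across relation types -- one 'aspect' a
    balanced strategy might cover, per the abstract's phrasing."""
    rng = random.Random(seed)
    by_relation = defaultdict(list)
    for triple in triples:
        by_relation[triple[1]].append(triple)
    relations = sorted(by_relation)
    sampled = []
    # Round-robin over relations so each relation type is represented.
    while len(sampled) < k:
        rel = relations[len(sampled) % len(relations)]
        sampled.append(rng.choice(by_relation[rel]))
    return sampled

def triple_to_qa(triple, distractors, rng):
    """Naively verbalize one triple into a multiple-choice example."""
    head, relation, tail = triple
    question = f"{head} {relation}?"
    choices = [tail] + list(distractors)
    rng.shuffle(choices)
    return {"question": question, "choices": choices, "answer": tail}

if __name__ == "__main__":
    rng = random.Random(0)
    for triple in sample_balanced(TRIPLES, 3):
        print(triple_to_qa(triple, distractors=["green", "fly"], rng=rng))
```

Synthetic examples like these would then be used to adapt an LLM before zero-shot evaluation; the abstract's finding is that how the triples are sampled (and how many) matters, with balanced strategies performing best.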

Citations (5)
