
GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models (2310.08487v1)

Published 12 Oct 2023 in cs.CL

Abstract: While multi-modal models have successfully integrated information from image, video, and audio modalities, integrating the graph modality into LLMs remains unexplored. This discrepancy largely stems from the inherent divergence between structured graph data and unstructured text data. Graph knowledge provides a reliable source of information and offers potential solutions to issues in text generation such as hallucination and lack of domain knowledge. Evaluating the integration of graph knowledge into LLMs requires a dedicated dataset, but there is currently no benchmark dataset specifically designed for multimodal graph-LLMs. To address this gap, we propose GraphextQA, a question answering dataset with paired subgraphs, retrieved from Wikidata, to facilitate the evaluation and future development of graph-LLMs. Additionally, we introduce a baseline model called CrossGNN, which conditions answer generation on the paired graphs by cross-attending to question-aware graph features at decoding. The proposed dataset is designed to evaluate graph-LLMs' ability to understand graphs and make use of them for answer generation. We perform experiments with language-only models and the proposed graph-LLM to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
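
The abstract describes CrossGNN only at a high level, so the following is a minimal sketch of the general idea of cross-attending from decoder states to graph node features, not the authors' implementation. The function name `cross_attend`, the single-head design, the residual connection, and all shapes are assumptions made for illustration, and the node features are assumed to have already been made question-aware by an upstream graph encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(decoder_states, node_features, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper).

    decoder_states: (T, d) hidden states of the answer decoder.
    node_features:  (N, d) graph node features, assumed question-aware.
    Returns the fused states (T, d) and the attention weights (T, N).
    """
    q = decoder_states @ w_q                      # queries from the decoder
    k = node_features @ w_k                       # keys from graph nodes
    v = node_features @ w_v                       # values from graph nodes
    scores = q @ k.T / np.sqrt(q.shape[-1])       # scaled dot-product scores
    weights = softmax(scores, axis=-1)            # each decoder step attends over nodes
    fused = decoder_states + weights @ v          # residual connection (an assumption)
    return fused, weights

# Toy example with random inputs: 4 decoder steps, 6 graph nodes, width 8.
rng = np.random.default_rng(0)
T, N, d = 4, 6, 8
decoder_states = rng.normal(size=(T, d))
node_features = rng.normal(size=(N, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

fused, weights = cross_attend(decoder_states, node_features, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` is a distribution over the nodes of the paired subgraph, which is one way a decoder could condition each generated token on graph content.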
