KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

Published 16 Jun 2024 in cs.CL and cs.AI | (2406.10802v1)

Abstract: Existing frameworks for assessing the robustness of large language models (LLMs) depend heavily on specific benchmarks, which increases costs and, because of dataset limitations, leaves the performance of LLMs in professional domains unevaluated. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from the triplets of knowledge graphs and creates adversarial prompts by poisoning, then assesses the robustness of LLMs from the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that the adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and that the robustness of LLMs is influenced by the professional domains in which they operate.
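
The pipeline described in the abstract (triplet to original prompt, original prompt to adversarial prompt, then a robustness score) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Triplet` type, the true/false prompt template, the character-swap perturbation standing in for the poisoning step, the attack-success-rate definition, and the `toy_model` stub that replaces a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Triplet:
    """A knowledge-graph fact (subject, predicate, object). Hypothetical type."""
    subject: str
    predicate: str
    obj: str


# Assumed prompt template; the paper's actual template is not given in the abstract.
TEMPLATE = "Is the following statement true or false?\nStatement: {statement}"


def triplet_to_statement(t: Triplet) -> str:
    """Render a triplet as a plain declarative sentence."""
    return f"{t.subject} {t.predicate} {t.obj}"


def perturb_word(word: str) -> str:
    """Swap the two middle characters of an alphabetic word of length >= 4."""
    if len(word) < 4 or not word.isalpha():
        return word
    mid = len(word) // 2
    return word[:mid - 1] + word[mid] + word[mid - 1] + word[mid + 1:]


def poison_statement(statement: str) -> str:
    """Stand-in for the poisoning step: a simple character-level perturbation."""
    return " ".join(perturb_word(w) for w in statement.split())


def attack_success_rate(
    model: Callable[[str], str],
    triplets: List[Triplet],
    label: str = "true",
) -> float:
    """Fraction of originally correct answers that the adversarial prompt flips.

    Assumed metric: only prompts the model answers correctly before the
    attack are counted in the denominator.
    """
    correct_before = 0
    flipped = 0
    for t in triplets:
        statement = triplet_to_statement(t)
        original = TEMPLATE.format(statement=statement)
        adversarial = TEMPLATE.format(statement=poison_statement(statement))
        if model(original) == label:
            correct_before += 1
            if model(adversarial) != label:
                flipped += 1
    return flipped / correct_before if correct_before else 0.0


def toy_model(prompt: str) -> str:
    """Hypothetical stub in place of an LLM: relies on exact entity spelling."""
    return "true" if ("Paris" in prompt or "Berlin" in prompt) else "false"


TRIPLETS = [
    Triplet("Paris", "is the capital of", "France"),
    Triplet("Berlin", "is the capital of", "Germany"),
]
```

Under these assumptions, a lower attack success rate indicates a more robust model. To compare real models, `toy_model` would be replaced by a function that sends the prompt to the model under test and normalizes its answer to "true" or "false".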