
Abstract

Recently, ChatGPT, a representative LLM, has gained considerable attention for its powerful emergent abilities. Some researchers suggest that LLMs could potentially replace structured knowledge bases like knowledge graphs (KGs) and function as parameterized knowledge bases. However, while LLMs are proficient at learning probabilistic language patterns from large corpora and engaging in conversations with humans, they, like previous smaller pre-trained language models (PLMs), still struggle to recall facts when generating knowledge-grounded content. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs, incorporating explicit factual knowledge so that the models generate text requiring factual knowledge more reliably and provide more informed responses to user queries. This paper reviews studies on enhancing PLMs with KGs, detailing existing knowledge graph-enhanced pre-trained language models (KGPLMs) and their applications. Building on this body of work, the paper proposes enhancing LLMs with KGs by developing knowledge graph-enhanced LLMs (KGLLMs), which offer a way to strengthen LLMs' factual reasoning ability and open new avenues for LLM research.

Overview

  • LLMs like ChatGPT are adept at generating human-like text but remain limited in factual accuracy and knowledge-grounded content generation.

  • Knowledge graphs (KGs) are structured databases containing real-world facts and relationships, aiding in tasks that require factual correctness.

  • The paper reviews knowledge graph-enhanced pre-trained language models (KGPLMs), which integrate KGs into language models to improve factual reasoning and response accuracy, and proposes extending this approach to LLMs.

  • Various approaches to integrating KGs into LLMs include before-training, during-training, and post-training enhancements.

  • KGPLMs could vastly benefit applications such as entity recognition, relation extraction, sentiment analysis, knowledge graph completion, question answering, and natural language generation.

Understanding the Intersection of Language Models and Knowledge Graphs

The Relationship Between LLMs and KGs

In the world of artificial intelligence, language models have become increasingly sophisticated. LLMs such as ChatGPT have made strides in understanding and generating human-like text. These models have been trained on vast amounts of data, allowing them to produce content that is often coherent and contextually relevant. However, when it comes to factual accuracy and knowledge-based content generation, LLMs appear to have limitations. They generally do well with the information they were trained on but struggle to recall, apply, or update knowledge that wasn't covered in their training sets.

This is where knowledge graphs (KGs) come into play. KGs are structured databases of real-world facts and relationships that offer an explicit representation of knowledge. They store and present information in a way that is not only accessible but also easy to update and maintain. As a result, KGs have inherent advantages for tasks requiring factual correctness and up-to-date information.
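To make the contrast concrete, a KG can be viewed as a collection of (head, relation, tail) triples that can be queried and edited directly. The short Python sketch below, using made-up entities and relations, illustrates how such explicit facts are stored and looked up, in contrast to knowledge buried in model weights:

    # A minimal sketch of a knowledge graph as (head, relation, tail) triples
    # with a simple lookup; all entities and relations here are illustrative.
    from collections import defaultdict

    triples = [
        ("Marie Curie", "born_in", "Warsaw"),
        ("Marie Curie", "field", "Physics"),
        ("Warsaw", "capital_of", "Poland"),
    ]

    # Index facts by head entity so they can be retrieved and updated explicitly.
    kg = defaultdict(list)
    for head, relation, tail in triples:
        kg[head].append((relation, tail))

    def facts_about(entity: str) -> list[tuple[str, str]]:
        """Return all stored (relation, tail) facts for an entity."""
        return kg[entity]

    print(facts_about("Marie Curie"))
    # [('born_in', 'Warsaw'), ('field', 'Physics')]

Because each fact is an explicit record, correcting or updating it is a simple edit to the data rather than a retraining step.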

Enhancing Language Models with Knowledge Graphs

Enhancing language models with knowledge from KGs, yielding what are termed here knowledge graph-enhanced pre-trained language models (KGPLMs), has been proposed to overcome the limitations of LLMs in factual content generation. The integration of KGs is expected to improve the models' ability to reason over facts and thus produce more informed and accurate responses.

Approaches to Integrating KGs into LLMs

KGPLMs integrate knowledge into language models in various ways.

  • Before-training Enhancement: This approach introduces KG data before model training begins, essentially preprocessing the text to include KG information or adjusting the input dataset to be more knowledge-rich (a small sketch of this idea follows the list).
  • During-training Enhancement: Here, the model architecture itself is altered or special knowledge-processing components are introduced during training to enable the model to learn directly from both the textual data and the KGs concurrently.
  • Post-training Enhancement: In this phase, language models are fine-tuned using domain-specific knowledge and tasks, which enhances the model's ability to perform in specific applications that need detailed expert knowledge.
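The before-training approach referenced above can be illustrated with a small, hypothetical preprocessing step: facts about entities mentioned in a sentence are verbalized and appended to it, so the pre-training corpus itself carries explicit knowledge. The toy KG, the entity names, and the [KNOWLEDGE] separator below are assumptions for illustration, not the exact method of any particular KGPLM:

    # Illustrative "before-training" enhancement: verbalize KG facts about
    # entities mentioned in a sentence and append them to the training text.
    kg = {  # toy KG: entity -> list of (relation, tail) facts
        "Marie Curie": [("born_in", "Warsaw"), ("field", "Physics")],
    }

    def verbalize(head: str, relation: str, tail: str) -> str:
        return f"{head} {relation.replace('_', ' ')} {tail}."

    def knowledge_enrich(sentence: str) -> str:
        """Append verbalized facts for every KG entity mentioned in the sentence."""
        facts = [
            verbalize(entity, rel, tail)
            for entity, rel_tails in kg.items() if entity in sentence
            for rel, tail in rel_tails
        ]
        return sentence if not facts else f"{sentence} [KNOWLEDGE] {' '.join(facts)}"

    print(knowledge_enrich("Marie Curie pioneered research on radioactivity."))
    # Marie Curie pioneered research on radioactivity. [KNOWLEDGE] Marie Curie born in Warsaw. Marie Curie field Physics.

During-training and post-training enhancements instead change the architecture or the fine-tuning stage, but the shared goal is the same: exposing the model to explicit facts rather than relying only on what it absorbs implicitly from raw text.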

Applications of Knowledge-Infused Language Models

Applications of KGPLMs are vast and range across several tasks:

  • Named Entity Recognition: KGPLMs can more effectively identify specific entities within the text, improving the extraction of usable data from unstructured sources.
  • Relation Extraction: They can better understand and categorize relationships between entities, which enhances comprehension and response accuracy.
  • Sentiment Analysis: Integrating sentiment-specific knowledge into the models significantly improves their ability to interpret emotions and opinions.
  • Knowledge Graph Completion: KGPLMs can help to fill in missing information in KGs, thereby expanding and refreshing the stored knowledge.
  • Question Answering: Utilizing KGs allows for structured reasoning and more precise answers to complex questions (a sketch of this retrieve-then-prompt pattern follows the list).
  • Natural Language Generation: Integrating KGs with LLMs can lead to more relevant, factual, and contextually grounded content.
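The retrieve-then-prompt pattern mentioned for question answering can be sketched in a few lines: facts about entities that appear in the question are pulled from the KG and placed in the prompt so the model can ground its answer in them. The toy KG and prompt template below are assumptions for illustration only:

    # Hedged sketch of KG-augmented question answering: retrieve facts about
    # entities mentioned in the question and prepend them to the prompt.
    kg = {
        "Warsaw": [("capital_of", "Poland"), ("population", "about 1.8 million")],
    }

    def retrieve_context(question: str) -> str:
        """Collect verbalized facts for every KG entity mentioned in the question."""
        lines = [
            f"{entity} {rel.replace('_', ' ')} {tail}."
            for entity, facts in kg.items() if entity in question
            for rel, tail in facts
        ]
        return "\n".join(lines)

    def build_prompt(question: str) -> str:
        return f"Known facts:\n{retrieve_context(question)}\n\nQuestion: {question}\nAnswer:"

    print(build_prompt("Which country is Warsaw the capital of?"))
    # The resulting prompt would then be passed to a language model of choice.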

The Future: LLMs Enhanced with Knowledge Graphs

While LLMs have shown potential as knowledge bases, their ability to serve as reliable sources of information has been questioned. They often default to the probabilistic language patterns acquired during training rather than solid factual recall. By incorporating KGs, however, we can begin to build more robust, factual, and reliable language models, leading to what are being referred to as knowledge graph-enhanced LLMs (KGLLMs).

Developing effective KGLLMs requires careful design choices, such as selecting which knowledge is most valuable to incorporate and adopting methods that avoid erasing previously learned information. It is also critical to explore ways to improve model interpretability and to assess performance on domain-specific tasks.
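One common way to add knowledge without overwriting what a model already knows is to freeze the pretrained weights and train only a small adapter module on knowledge-rich data. The PyTorch sketch below is a generic illustration of that idea under assumed sizes and a stand-in backbone; it is not the specific design of any KGPLM discussed here:

    # Generic adapter-style sketch: freeze the pretrained backbone and train
    # only a small bottleneck module, so prior knowledge is preserved.
    import torch
    import torch.nn as nn

    class KnowledgeAdapter(nn.Module):
        """Bottleneck adapter; the residual connection keeps the frozen
        model's original behavior as the default."""
        def __init__(self, hidden_size: int, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.up = nn.Linear(bottleneck, hidden_size)
            self.act = nn.GELU()

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            return hidden_states + self.up(self.act(self.down(hidden_states)))

    # Stand-in backbone; in practice this would be a pretrained language model.
    backbone = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
    for p in backbone.parameters():
        p.requires_grad = False          # pretrained knowledge stays untouched

    adapter = KnowledgeAdapter(hidden_size=256)
    optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

    x = torch.randn(2, 16, 256)          # (batch, sequence, hidden)
    out = adapter(backbone(x))
    loss = out.pow(2).mean()             # placeholder objective for illustration
    loss.backward()
    optimizer.step()                     # updates adapter weights only

Because gradients reach only the adapter parameters, the pretrained weights, and the general language ability they encode, remain intact.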

Conclusion: The Symbiosis of LLMs and KGs

As LLMs evolve, it becomes increasingly clear that their full potential can only be unlocked through the integration of structured knowledge from KGs. The marriage of LLMs and KGs does not suggest the obsolescence of either; rather, it heralds a new phase of symbiotic development. KGs provide the foundation of fact-based reasoning that LLMs need to become truly intelligent systems, while LLMs offer flexible, self-updating mechanisms that keep the knowledge within KGs fluid and accessible. Together, they will redefine the possibilities for artificial intelligence in processing and generating human language.

