
Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining (2403.04780v3)

Published 2 Mar 2024 in cs.CL and cs.AI

Abstract: Graphs with abundant attributes are essential in modeling interconnected entities and enhancing predictions across various real-world applications. Traditional Graph Neural Networks (GNNs) often require re-training for different graph tasks and datasets. Although the emergence of LLMs has introduced new paradigms in natural language processing, their potential for generic graph mining, training a single model to simultaneously handle diverse tasks and datasets, remains under-explored. To this end, our novel framework MuseGraph seamlessly integrates the strengths of GNNs and LLMs into one foundation model for graph mining across tasks and datasets. This framework first features a compact graph description to encapsulate key graph information within language token limitations. Then, we propose a diverse instruction generation mechanism with Chain-of-Thought (CoT)-based instruction packages to distill the reasoning capabilities from advanced LLMs like GPT-4. Finally, we design a graph-aware instruction tuning strategy to facilitate mutual enhancement across multiple tasks and datasets while preventing catastrophic forgetting of LLMs' generative abilities. Our experimental results demonstrate significant improvements in five graph tasks and ten datasets, showcasing the potential of our MuseGraph in enhancing the accuracy of graph-oriented downstream tasks while improving the generation abilities of LLMs.


Summary

  • The paper introduces MuseGraph, which employs graph-oriented instruction tuning to integrate LLMs with GNNs for effective graph mining.
  • It leverages a compact graph description mechanism combined with Chain-of-Thought-based instruction packages to efficiently encode graph structure and semantics.
  • MuseGraph demonstrates superior performance on node classification, link prediction, and graph-to-text generation across various datasets.

Graph-oriented Instruction Tuning of LLMs for Generic Graph Mining

The paper "Graph-oriented Instruction Tuning of LLMs for Generic Graph Mining" presents MuseGraph, a novel framework that effectively combines the capabilities of Graph Neural Networks (GNNs) and LLMs to tackle diverse graph mining tasks across various datasets. By leveraging instruction tuning specifically designed for graph-related data, MuseGraph introduces a unified approach to address the challenges of graph representation and reasoning in LLMs.

Introduction and Motivation

Graphs serve as fundamental structures for modeling interconnections among entities with rich attributes. Traditional GNNs excel at learning from these structures but face limitations when generalizing across different tasks and datasets without extensive retraining. LLMs, known for their proficiency in natural language tasks, hold potential for enhancing graph mining capabilities. However, the challenge lies in bridging the gap between graph-structured data and the text-centric interface of LLMs, given the constraints of language token limits and the diversity of graph-related tasks (Figure 1).

Figure 1: A toy example illustrating the need for a generic graph model that can be directly applied to various graph-related tasks and datasets.

MuseGraph Framework

Compact Graph Description

The compact graph description mechanism is central to MuseGraph, encoding key graph information within LLM-compatible token limits. This component combines neighbor- and walk-based strategies to capture both the semantic and structural aspects of a graph. The selection process relies on a "node energy" metric that considers token counts and node degrees to prioritize which graph information is kept within the token budget (Figure 2).

Figure 2: The overall framework of MuseGraph, which consists of Compact Graph Description, Diverse Instruction Generation, and Graph-aware Instruction Tuning.
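The exact energy function and selection algorithm are not reproduced in this summary, so the Python sketch below only illustrates the general idea under stated assumptions: node energy is approximated as a degree-to-token-count ratio, and neighbors are added greedily until a token budget is exhausted. The function names and the formula are illustrative, not the authors' implementation.

```python
def node_energy(node, graph, token_counts):
    """Assumed node-energy score: favor nodes that are well connected
    (high degree) but cheap to verbalize (few tokens)."""
    degree = len(graph[node])
    tokens = max(token_counts[node], 1)
    return degree / tokens

def compact_description(center, graph, node_text, token_counts, budget=512):
    """Greedy sketch: describe the target node, then add neighbors in
    decreasing node-energy order until the token budget is exhausted."""
    lines = [f"Target node: {node_text[center]}"]
    used = token_counts[center]
    neighbors = sorted(graph[center],
                       key=lambda n: node_energy(n, graph, token_counts),
                       reverse=True)
    for n in neighbors:
        if used + token_counts[n] > budget:
            break
        lines.append(f"Neighbor: {node_text[n]}")
        used += token_counts[n]
    return "\n".join(lines)

# Toy attributed graph; token counts are approximated by word counts.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
node_text = {"a": "Paper on generic graph mining",
             "b": "Survey of graph neural networks",
             "c": "Instruction tuning for LLMs"}
token_counts = {n: len(t.split()) for n, t in node_text.items()}
print(compact_description("a", graph, node_text, token_counts, budget=12))
```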

Diverse Instruction Generation

To harness the reasoning capabilities of advanced LLMs such as GPT-4, MuseGraph implements Chain-of-Thought (CoT)-based instruction packages (Figure 3). This approach prompts GPT-4 with task-specific instructions to extract step-by-step reasoning, which is then distilled into instruction packages that strengthen the tuned LLM's understanding of graph data. This methodology contrasts with existing techniques by constructing CoT-based instructions directly, rather than merely using CoT for prompting.

Figure 3: The process of constructing the Chain-of-Thought (CoT)-based instruction package for node classification. Reasoning ability distilled from advanced LLMs (e.g., GPT-4) is integrated with task-specific instructions at a 1:10 mix ratio.
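A minimal sketch of how such an instruction package might be assembled is shown below, assuming a generic `ask_teacher` callable that stands in for whatever API queries the teacher LLM; the prompt wording, record format, and sampling logic are illustrative rather than the paper's exact pipeline. The 0.1 default mirrors the 1:10 mix ratio mentioned in the Figure 3 caption.

```python
import random

COT_PROMPT = (
    "You are given a graph description and a classification question.\n"
    "Think step by step about the target node's attributes and neighbors,\n"
    "then state the predicted label.\n\n"
    "{description}\nQuestion: {question}"
)

def build_instruction_package(samples, ask_teacher, cot_ratio=0.1, seed=0):
    """Mix plain instruction/answer pairs with CoT-augmented ones at roughly
    a 1:10 ratio. `ask_teacher` is a placeholder for a call to a teacher
    LLM such as GPT-4; its step-by-step output is kept as the distilled
    reasoning."""
    rng = random.Random(seed)
    package = []
    for description, question, answer in samples:
        if rng.random() < cot_ratio:
            reasoning = ask_teacher(
                COT_PROMPT.format(description=description, question=question))
            output = f"{reasoning}\nAnswer: {answer}"
        else:
            output = f"Answer: {answer}"
        package.append({"instruction": question,
                        "input": description,
                        "output": output})
    return package
```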

Graph-aware Instruction Tuning

The graph-aware instruction tuning mechanism prevents catastrophic forgetting while accommodating task and dataset variability. MuseGraph employs dynamic instruction allocation strategies, balancing CoT-based instructions across tasks and datasets according to their complexity, ensuring comprehensive model training (Figure 4).

Figure 4: Comprehensive performance of different models on various tasks and datasets.
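As a rough illustration of dynamic instruction allocation, the sketch below distributes an overall tuning budget across (task, dataset) pairs in proportion to an assumed complexity score. The scores and the proportional rule are placeholders, since the paper's precise allocation strategy is not detailed in this summary.

```python
def allocate_instructions(complexity, total_budget=10_000):
    """Allocate instruction counts per (task, dataset) pair in proportion
    to an assumed complexity score; illustrative, not the paper's exact rule."""
    total = sum(complexity.values())
    return {key: max(1, round(total_budget * score / total))
            for key, score in complexity.items()}

# Hypothetical complexity scores; dataset names follow those mentioned above.
complexity = {
    ("node classification", "IMDB"): 1.0,
    ("link prediction", "IMDB"): 1.5,
    ("graph-to-text generation", "Freebase"): 3.0,
}
print(allocate_instructions(complexity, total_budget=1100))
# {('node classification', 'IMDB'): 200, ('link prediction', 'IMDB'): 300,
#  ('graph-to-text generation', 'Freebase'): 600}
```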

Experimental Results

MuseGraph demonstrates superior performance across multiple graph mining tasks, outperforming state-of-the-art GNN and LLM baselines on node classification, link prediction, and graph-to-text generation. Its effectiveness spans datasets such as IMDB and Freebase, highlighting its capacity for generalization and adaptability (Figure 5).

Figure 5: Accuracy results for the Reachability and Max Sum Path tasks on graphs with varying difficulty levels (i.e., D1 to D4), where "D" represents different degrees of complexity.

Conclusions and Future Work

MuseGraph offers a robust solution for generic graph mining by unifying GNNs' representational strengths with LLMs' generative abilities, opening the way to a single model that handles diverse graph-related tasks without extensive retraining. Future work could extend MuseGraph to a broader range of graph types and tasks, such as biological networks and knowledge graphs, further enhancing its versatility in real-world scenarios.
