Multi-hop Question Answering under Temporal Knowledge Editing (2404.00492v1)
Abstract: Multi-hop question answering (MQA) under knowledge editing (KE) has garnered significant attention in the era of LLMs. However, existing models for MQA under KE perform poorly on questions that contain explicit temporal contexts. To address this limitation, we propose a novel framework, namely TEMPoral knowLEdge augmented Multi-hop Question Answering (TEMPLE-MQA). Unlike previous methods, TEMPLE-MQA first constructs a time-aware graph (TAG) to store edit knowledge in a structured manner. Then, through our proposed inference path, structural retrieval, and joint reasoning stages, TEMPLE-MQA effectively discerns the temporal context within the query. Experiments on benchmark datasets demonstrate that TEMPLE-MQA significantly outperforms baseline models. Additionally, we contribute a new dataset, namely TKEMQA, the first benchmark tailored specifically to MQA with temporal scopes.
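To make the core idea concrete, below is a minimal sketch of what a time-aware graph (TAG) for edit knowledge might look like: edited (subject, relation, object) triples are stored together with a validity interval, and retrieval for one hop of a multi-hop question filters candidates by the question's temporal context. The class and method names (`TimeAwareGraph`, `add_edit`, `retrieve`) and the year-based interval representation are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EditFact:
    """One edited triple with an (optional) temporal scope. Illustrative only."""
    subject: str
    relation: str
    obj: str
    start: Optional[int] = None  # year the fact becomes valid, if known
    end: Optional[int] = None    # year the fact stops being valid, if known

class TimeAwareGraph:
    """Hypothetical TAG: edited triples indexed for structural retrieval."""

    def __init__(self) -> None:
        # Index edits by (subject, relation) so one hop of an MQA query
        # maps to a direct lookup rather than a scan over all edits.
        self._index: dict[tuple[str, str], list[EditFact]] = {}

    def add_edit(self, fact: EditFact) -> None:
        self._index.setdefault((fact.subject, fact.relation), []).append(fact)

    def retrieve(self, subject: str, relation: str,
                 year: Optional[int] = None) -> list[EditFact]:
        """Return edits for this hop, filtered by the query's temporal context."""
        candidates = self._index.get((subject, relation), [])
        if year is None:
            return candidates
        return [
            f for f in candidates
            if (f.start is None or f.start <= year)
            and (f.end is None or year <= f.end)
        ]

# Usage: resolving one hop of a temporally scoped question ("... in 2021")
# against edited knowledge returns only the fact valid at that time.
tag = TimeAwareGraph()
tag.add_edit(EditFact("United Kingdom", "head of government", "Boris Johnson", 2019, 2022))
tag.add_edit(EditFact("United Kingdom", "head of government", "Rishi Sunak", 2022, None))
print(tag.retrieve("United Kingdom", "head of government", year=2021))
```

The point of the structure is that temporal filtering happens at retrieval time, so an edit with a past validity interval does not overwrite history; this mirrors, at a toy scale, why storing temporal scopes alongside edits helps with questions carrying explicit temporal contexts.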