InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration (EMNLP 2024)
Publication Date: 11/13/2024
Event: The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024)
Reference: pp. 3675-3688, 2024
Authors: Fali Wang, The Pennsylvania State University; Runxue Bao, GE HealthCare; Suhang Wang, The Pennsylvania State University; Wenchao Yu, NEC Laboratories America, Inc.; Yanchi Liu, NEC Laboratories America, Inc.; Wei Cheng, NEC Laboratories America, Inc.; Haifeng Chen, NEC Laboratories America, Inc.
Abstract: Large Language Models (LLMs) have achieved exceptional capabilities in open generation across various domains, yet they encounter difficulties with tasks that require intensive knowledge. To address these challenges, methods for integrating knowledge have been developed, which augment LLMs with domain-specific knowledge graphs through external modules. These approaches, however, face data inefficiency issues as they necessitate the processing of both known and unknown knowledge for fine-tuning. Thus, our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap of known knowledge. A risk of introducing new knowledge is the potential forgetting of existing knowledge. To mitigate this risk, we propose the innovative InfuserKI framework. This framework employs transformer internal states to determine when to enrich LLM outputs with additional information, effectively preventing knowledge forgetting. Performance evaluations using the UMLS-2.5k and MetaQA domain knowledge graphs reveal that InfuserKI not only successfully integrates new knowledge but also outperforms state-of-the-art baselines, reducing knowledge forgetting by 9% and 6%, respectively.
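Illustrative sketch (not from the paper): the abstract says the framework uses transformer internal states to decide when to enrich LLM outputs. One way to picture that idea is a learned gate that reads a hidden state and scales how much of a knowledge adapter's output gets added back. Everything below is an assumption made for illustration: the names `gate_score` and `infuse`, the sigmoid gate, and the toy vectors are hypothetical and do not reproduce the authors' architecture or training procedure.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_score(hidden, gate_weights, gate_bias):
    """Hypothetical gate: a score in (0, 1) computed from an internal state.

    A score near 0 means "the model likely already knows this, leave it alone";
    a score near 1 means "infuse the extra knowledge".
    """
    logit = sum(h * w for h, w in zip(hidden, gate_weights)) + gate_bias
    return sigmoid(logit)


def infuse(hidden, adapter_output, gate_weights, gate_bias):
    """Add the adapter's contribution to the hidden state, scaled by the gate."""
    g = gate_score(hidden, gate_weights, gate_bias)
    return [h + g * a for h, a in zip(hidden, adapter_output)]


# Toy example with made-up numbers (assumed, not taken from the paper).
hidden = [0.5, -1.0]
adapter_output = [1.0, 1.0]

# Gate effectively closed: the hidden state passes through almost unchanged.
closed = infuse(hidden, adapter_output, [0.0, 0.0], -50.0)

# Gate effectively open: the adapter output is added almost in full.
opened = infuse(hidden, adapter_output, [0.0, 0.0], 50.0)
```

In this sketch, a closed gate leaves the original representation intact, which is the intuition behind avoiding interference with knowledge the model already has; an open gate lets the added module contribute.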
Publication Link: https://aclanthology.org/2024.findings-emnlp.209/