Noise-Robust Learning refers to training a model to remain effective when the training labels are noisy. In joint entity and relation extraction with distantly-labeled data, labels can be inaccurate because entity mentions are misaligned with their corresponding entity and relation tags from a knowledge base, and this noise degrades supervised learning. To address this, a noise-robust framework such as DENRL regularizes the model with meaningful patterns and dependencies, allowing it to iteratively focus on cleaner instances and mitigate the effect of noisy labels, so that performance holds up despite the noise.
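To illustrate the "iteratively focus on cleaner data" idea, below is a minimal sketch of a generic small-loss selection heuristic: instances the current model fits poorly are treated as likely mislabeled and dropped for the next round. The function name and the `keep_ratio` hyperparameter are illustrative assumptions, not DENRL's actual scoring, which additionally uses significant relation patterns and entity-relation dependencies.

```python
import torch
import torch.nn.functional as F

def small_loss_selection(logits, labels, keep_ratio=0.8):
    """Keep the instances the current model fits best (lowest loss),
    treating the highest-loss instances as likely mislabeled.

    logits: (num_instances, num_classes); labels: (num_instances,)
    Returns indices of the retained, presumably cleaner, instances.
    `keep_ratio` is a hypothetical hyperparameter for this sketch.
    """
    per_instance_loss = F.cross_entropy(logits, labels, reduction="none")
    k = max(1, int(keep_ratio * labels.numel()))
    # topk of the negated losses yields the k smallest losses
    _, clean_idx = torch.topk(-per_instance_loss, k)
    return clean_idx
```

Retraining on the retained subset and repeating the selection each round lets the model progressively adapt to the less noisy portion of the data.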

Distantly-Supervised Joint Extraction with Noise-Robust Learning

Joint entity and relation extraction is a process that identifies entity pairs and their relations using a single model. We focus on the problem of joint extraction in distantly-labeled data, whose labels are generated by aligning entity mentions with the corresponding entity and relation tags using a knowledge base (KB). One key challenge is the presence of noisy labels arising from both incorrect entity and relation annotations, which significantly impairs the quality of supervised learning. Existing approaches, either considering only one source of noise or making decisions using external knowledge, cannot fully utilize significant information in the training data. We propose DENRL, a generalizable framework that 1) incorporates a lightweight transformer backbone into a sequence labeling scheme for joint tagging, and 2) employs a noise-robust framework that regularizes the tagging model with significant relation patterns and entity-relation dependencies, then iteratively self-adapts to instances with less noise from both sources. Surprisingly, experiments on two benchmark datasets show that DENRL, using merely its own parametric distribution and simple data-driven heuristics, outperforms large language model-based baselines by a large margin with better interpretability.
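To make the joint tagging scheme concrete, here is a minimal sketch of a transformer encoder with a per-token classifier over a unified tag set (e.g., BIO-style tags that encode both an entity boundary and its relation role, so one labeling pass covers entity pairs and relations together). The class name, tag inventory, and all hyperparameters are illustrative assumptions; DENRL's actual lightweight backbone and tagging scheme may differ.

```python
import torch
import torch.nn as nn

class JointTagger(nn.Module):
    """Sketch of joint entity-relation tagging as sequence labeling:
    encode tokens with a small transformer, then score each token
    against a unified entity/relation tag set. Positional encodings
    are omitted for brevity; hyperparameters are illustrative."""

    def __init__(self, vocab_size, num_tags, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=512, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.classifier = nn.Linear(d_model, num_tags)

    def forward(self, token_ids):              # (batch, seq_len)
        h = self.encoder(self.embed(token_ids))
        return self.classifier(h)              # (batch, seq_len, num_tags)

# Usage: per-token tag scores for a toy batch of two 16-token sentences.
tagger = JointTagger(vocab_size=30000, num_tags=9)
scores = tagger(torch.randint(0, 30000, (2, 16)))  # -> (2, 16, 9)
```

Training such a tagger with a per-token cross-entropy loss is what the noise-robust selection above would wrap: each round scores the distantly-labeled instances and re-fits on the cleaner subset.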