Posts

Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation

Time series domain adaptation stands as a pivotal and intricate challenge with diverse applications, including but not limited to human activity recognition, sleep stage classification, and machine fault diagnosis. Despite the numerous domain adaptation techniques proposed to tackle this complex problem, their primary focus has been on the common representations of time series data. This concentration might inadvertently lead to the oversight of valuable domain-specific information originating from different source domains. To bridge this gap, we introduce POND, a novel prompt-based deep learning model designed explicitly for multi-source time series domain adaptation. POND is tailored to address significant challenges, notably: 1) The unavailability of a quantitative relationship between meta-data information and time series distributions, and 2) The dearth of exploration into extracting domain specific meta-data information. In this paper, we present an instance-level prompt generator and afidelity loss mechanism to facilitate the faithful learning of meta-data information. Additionally, we propose a domain discrimination technique to discern domain-specific meta-data information from multiple source domains. Our approach involves a simple yet effective meta-learning algorithm to optimize the objective efficiently. Furthermore, we augment the model’s performance by incorporating the Mixture of Expert (MoE) technique. The efficacy and robustness of our proposed POND model are extensively validated through experiments across 50 scenarios encompassing five datasets, which demonstrates that our proposed POND model outperforms the state-of the-art methods by up to 66% on the F1-score.

Open-ended Commonsense Reasoning with Unrestricted Answer Scope

Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based methods are less applicable in the open-ended setting due to an inherent challenge. Without pre-defining an answer scope or a few candidates, open-ended commonsense reasoning entails predicting answers by searching over an extremely large searching space. Moreover, most questions require implicit multi-hop reasoning, which presents even more challenges to our problem. In this work, we leverage pre-trained language models to iteratively retrieve reasoning paths on the external knowledge base, which does not require task-specific supervision. The reasoning paths can help to identify the most precise answer to the commonsense question. We conduct experiments on two commonsense benchmark datasets. Compared to other approaches, our proposed method achieves better performance both quantitatively and qualitatively.

Beyond One Model Fits All: A Survey of Domain Specialization for Large Language Models

Beyond One Model Fits All: A Survey of Domain Specialization for Large Language Models Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers motivated people to extend their functionality largely beyond just a “chatbot”, and use it as an assistant or even replacement for domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). To fill such a gap, explosively increase research, and practices have been conducted in very recent years on the domain specialization of LLMs, which, however, calls for a comprehensive and systematic review to better summarizes and guide this promising domain. In this survey paper, first, we propose a systematic taxonomy that categorizes the LLM domain specialization techniques based on the accessibility to LLMs and summarizes the framework for all the subcategories as well as their relations and differences to each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Furthermore, we offer insights into the current research status and future trends in this area.

DeepGAR: Deep Graph Learning for Analogical Reasoning

DeepGAR: Deep Graph Learning for Analogical Reasoning Analogical reasoning is the process of discovering and mapping correspondences from a target subject to a base subject. As the most well-known computational method of analogical reasoning, Structure-Mapping Theory (SMT) abstracts both target and base subjects into relational graphs and forms the cognitive process of analogical reasoning by finding a corresponding subgraph (i.e., correspondence) in the target graph that is aligned with the base graph. However, incorporating deep learning for SMT is still under-explored due to several obstacles: 1) the combinatorial complexity of searching for the correspondence in the target graph, 2) the correspondence mining is restricted by various cognitive theory-driven constraints. To address both challenges, we propose a novel framework for Analogical Reasoning (DeepGAR) that identifies the correspondence between source and target domains by assuring cognitive theory-driven constraints. Specifically, we design a geometric constraint embedding space to induce subgraph relation from node embeddings for efficient subgraph search. Furthermore, we develop novel learning and optimization strategies that could end-to-end identify correspondences that are strictly consistent with constraints driven by the cognitive theory. Extensive experiments are conducted on synthetic and real-world datasets to demonstrate the effectiveness of the proposed DeepGAR over existing methods. The code and data are available at: https://github.com/triplej0079/DeepGAR.

Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph

Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph We target the task of cross-lingual Machine Reading Comprehension (MRC) in the direct zero-shot setting, by incorporating syntactic features from Universal Dependencies (UD), and the key features we use are the syntactic relations within each sentence. While previous work has demonstrated effective syntax-guided MRC models, we propose to adopt the inter-sentence syntactic relations, in addition to the rudimentary intra-sentence relations, to further utilize the syntactic dependencies in the multi-sentence input of the MRC task. In our approach, we build the Inter-Sentence Dependency Graph (ISDG) connecting dependency trees to form global syntactic relations across sentences. We then propose the ISDG encoder that encodes the global dependency graph, addressing the inter-sentence relations via both one-hop and multi-hop dependency paths explicitly. Experiments on three multilingual MRC datasets (XQuAD, MLQA, TyDiQA-GoldP) show that our encoder that is only trained on English is able to improve the zero-shot performance on all 14 test sets covering 8 languages, with up to 3.8 F1 / 5.2 EM improvement on-average, and 5.2 F1 / 11.2 EM on certain languages. Further analysis shows the improvement can be attributed to the attention on the cross-linguistically consistent syntactic path. Our code is available at https://github.com/lxucs/multilingual-mrc-isdg.