Publication Date: 3/1/2022
Event: Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-2022)
Reference: 1-9. 2022 – Virtual Conference
Authors: Liyan Xu, Emory University; Xuchao Zhang, NEC Laboratories America, Inc.; Bo Zong, Salesforce; Yanchi Liu, NEC Laboratories America, Inc.; Wei Cheng, NEC Laboratories America, Inc.; Jingchao Ni, NEC Laboratories America, Inc.; Haifeng Chen, NEC Laboratories America, Inc.; Liang Zhao, Emory University; Jinho D. Choi, Emory University
Abstract: We target the task of cross-lingual Machine Reading Comprehension (MRC) in the direct zero-shot setting by incorporating syntactic features from Universal Dependencies (UD); the key features we use are the syntactic relations within each sentence. While previous work has demonstrated effective syntax-guided MRC models, we propose to adopt the inter-sentence syntactic relations, in addition to the rudimentary intra-sentence relations, to further utilize the syntactic dependencies in the multi-sentence input of the MRC task. In our approach, we build the Inter-Sentence Dependency Graph (ISDG), which connects dependency trees to form global syntactic relations across sentences. We then propose the ISDG encoder, which encodes the global dependency graph and explicitly addresses the inter-sentence relations via both one-hop and multi-hop dependency paths. Experiments on three multilingual MRC datasets (XQuAD, MLQA, TyDiQA-GoldP) show that our encoder, trained only on English, improves the zero-shot performance on all 14 test sets covering 8 languages, with up to 3.8 F1 / 5.2 EM improvement on average, and 5.2 F1 / 11.2 EM on certain languages. Further analysis shows the improvement can be attributed to the attention on the cross-linguistically consistent syntactic path. Our code is available at https://github.com/lxucs/multilingual-mrc-isdg.
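Illustrative sketch (not taken from the paper or its repository): the abstract describes joining per-sentence dependency trees into one global graph and then following one-hop and multi-hop dependency paths across sentences. The toy Python below shows one way such a graph could be assembled and queried. The names `build_isdg` and `dependency_path`, the `cross-sent` edge label, and in particular the rule of linking the roots of consecutive sentences are assumptions made for illustration; the abstract does not state how the authors connect the trees.

```python
from collections import deque


def build_isdg(sentences):
    """Merge per-sentence dependency trees into one global graph.

    Each sentence is a list of (token, head, deprel) triples, where `head`
    is the index of the head token inside the same sentence, or -1 for
    the root. Returns the flat token list and an adjacency map
    node -> [(neighbor, label)].
    """
    tokens, edges, roots = [], {}, []
    offset = 0
    for sent in sentences:
        for i, (tok, head, rel) in enumerate(sent):
            node = offset + i
            tokens.append(tok)
            edges.setdefault(node, [])
            if head < 0:
                roots.append(node)
            else:
                h = offset + head
                # Intra-sentence edge, stored in both directions.
                edges.setdefault(h, []).append((node, rel))
                edges[node].append((h, rel + ":rev"))
        offset += len(sent)
    # ASSUMPTION: link the roots of consecutive sentences so the trees
    # form a single connected graph. This rule is hypothetical.
    for a, b in zip(roots, roots[1:]):
        edges[a].append((b, "cross-sent"))
        edges[b].append((a, "cross-sent"))
    return tokens, edges


def dependency_path(edges, src, dst):
    """Shortest path between two nodes (breadth-first search), or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            break
        for nxt, _label in edges[cur]:
            if nxt not in prev:
                prev[nxt] = cur
                queue.append(nxt)
    if dst not in prev:
        return None
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


# Two toy sentences with hand-written UD-style annotations.
sentences = [
    [("Paris", 2, "nsubj"), ("is", 2, "cop"), ("big", -1, "root")],
    [("It", 1, "nsubj"), ("has", -1, "root"), ("museums", 1, "obj")],
]
tokens, edges = build_isdg(sentences)
path = dependency_path(edges, 0, 5)  # from "Paris" to "museums"
print([tokens[i] for i in path])
```

Here the path from "Paris" to "museums" crosses the sentence boundary through the two roots, which is the kind of multi-hop inter-sentence path the abstract refers to. The actual ISDG construction and encoder are in the authors' repository.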