Multi-Task Recurrent Modular Networks

Publication Date: 3/9/2021

Event: AAAI 2021 – 35th AAAI Conference on Artificial Intelligence

Reference: 1-9, 2021 (Virtual Conference)

Authors: Dongkuan Xu, Pennsylvania State University; Wei Cheng, NEC Laboratories America, Inc.; Xin Dong, Rutgers University; Bo Zong, NEC Laboratories America, Inc.; Wenchao Yu, NEC Laboratories America, Inc.; Jingchao Ni, NEC Laboratories America, Inc.; Dongjin Song, University of Connecticut; Xuchao Zhang, NEC Laboratories America, Inc.; Xiang Zhang, Pennsylvania State University; Haifeng Chen, NEC Laboratories America, Inc.

Abstract: We consider deep multi-task learning models with recurrent architectures that exploit regularities across tasks to improve the performance of multiple sequence processing tasks jointly. Most existing architectures are painstakingly customized to learn task relationships for specific problems, which makes them too inflexible to model dynamic task relationships and limits their generalization to novel test-time scenarios. We propose multi-task recurrent modular networks (MT-RMN), which can be incorporated into any multi-task recurrent model to address these drawbacks. MT-RMN consists of a shared encoder and multiple task-specific decoders, and operates recurrently over time. For flexibility, it modularizes the encoder into multiple layers of sub-networks and dynamically controls the connections between these sub-networks and the decoders at different time steps, providing the recurrent networks with varying degrees of parameter sharing for tasks with dynamic relatedness. For generalization, MT-RMN aims to discover a set of generalizable sub-networks in the encoder that can be assembled in different ways for different tasks. Policy networks augmented with differentiable routers make the binary connection decisions between the sub-networks. Experimental results on three multi-task sequence processing datasets consistently demonstrate the effectiveness of MT-RMN.
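To make the routing mechanism in the abstract concrete, below is a minimal PyTorch sketch of a modular recurrent encoder whose sub-network connections are gated per time step by a small policy network. All names, dimensions, and the use of a straight-through Gumbel-Softmax to make the binary gates differentiable are illustrative assumptions, not the paper's exact design; the hard forward-pass gates match the abstract's "binary connection decisions" while gradients still reach the policy network.

```python
# Minimal sketch, assuming PyTorch; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Router(nn.Module):
    """Policy network emitting a differentiable binary gate per connection."""
    def __init__(self, state_dim, n_choices):
        super().__init__()
        self.policy = nn.Linear(state_dim, n_choices * 2)  # on/off logits

    def forward(self, state):
        logits = self.policy(state).view(-1, 2)            # (n_choices, 2)
        # Straight-through Gumbel-Softmax: hard 0/1 gates forward,
        # soft gradients backward.
        return F.gumbel_softmax(logits, tau=1.0, hard=True)[:, 0]


class ModularRecurrentEncoder(nn.Module):
    """Shared encoder: layers of GRU-cell sub-networks, re-wired each step."""
    def __init__(self, in_dim, hid_dim, n_layers=2, n_modules=3):
        super().__init__()
        self.cells = nn.ModuleList([
            nn.ModuleList([
                nn.GRUCell(in_dim if l == 0 else hid_dim, hid_dim)
                for _ in range(n_modules)
            ]) for l in range(n_layers)
        ])
        self.router = Router(hid_dim, n_layers * n_modules)
        self.n_layers, self.n_modules, self.hid_dim = n_layers, n_modules, hid_dim

    def forward(self, x_seq, task_state):
        # x_seq: (seq_len, batch, in_dim); task_state: (batch, hid_dim)
        batch = x_seq.size(1)
        h = [[x_seq.new_zeros(batch, self.hid_dim) for _ in range(self.n_modules)]
             for _ in range(self.n_layers)]
        out = None
        for x_t in x_seq:
            # Re-route at every time step: dynamic task relatedness.
            gates = self.router(task_state.mean(0, keepdim=True))
            inp = x_t
            for l in range(self.n_layers):
                outs = []
                for m, cell in enumerate(self.cells[l]):
                    g = gates[l * self.n_modules + m]
                    outs.append(g * cell(inp, h[l][m]))    # gate each sub-network
                    h[l][m] = outs[-1] + (1 - g) * h[l][m] # keep state if gated off
                inp = torch.stack(outs).sum(0)             # aggregate active modules
            out = inp
        return out                                          # encoding for a decoder


# Example: encode a batch of sequences for one task's decoder.
enc = ModularRecurrentEncoder(in_dim=16, hid_dim=32)
x = torch.randn(10, 4, 16)          # (seq_len, batch, in_dim)
task_state = torch.zeros(4, 32)     # per-task routing state (assumed input)
code = enc(x, task_state)           # (batch, hid_dim), fed to a task decoder
```

In this sketch, each task would hold its own routing state and decoder while sharing the encoder's sub-networks, so the gates determine how much parameter sharing tasks receive at each step.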

Publication Link: