Wenchao Yu | NEC Labs America

Wenchao Yu

Researcher

Data Science and System Security

Posts

Skill Disentanglement for Imitation Learning from Suboptimal Demonstrations

Imitation learning has achieved great success in many sequential decision-making tasks, in which a neural agent is trained by imitating collected human demonstrations. However, existing algorithms typically require a large number of high-quality demonstrations, which are difficult and expensive to collect. In practice, a trade-off usually has to be made between demonstration quality and quantity. Targeting this problem, in this work we consider imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set. Some pioneering works have addressed this setting, but they suffer from many limitations, e.g., assuming a demonstration to be of the same optimality throughout all time steps and failing to provide any interpretation w.r.t. the knowledge learned from the noisy set. Addressing these problems, we propose SDIL, which evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills. Concretely, SDIL consists of a high-level controller that discovers skills and a skill-conditioned module that captures action-taking policies, and it is trained in a two-phase pipeline: first discovering skills with all demonstrations, then adapting the controller to only the clean set. A mutual-information-based regularization and a dynamic sub-demonstration optimality estimator are designed to promote disentanglement in the skill space. Extensive experiments on two gym environments and a real-world healthcare dataset demonstrate the superiority of SDIL in learning from sub-optimal demonstrations and its improved interpretability, as revealed by examining the learned skills.
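
To make the two-level design concrete, below is a minimal PyTorch sketch of a skill-conditioned imitator in the spirit of SDIL: a controller picks a discrete skill via a straight-through Gumbel-softmax, and a policy head conditions on the selected skill embedding. All module names, dimensions, and the behavior-cloning loss are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkillConditionedAgent(nn.Module):
    """Toy two-level imitator: a controller picks a discrete skill,
    and a skill-conditioned policy maps (state, skill) to action logits."""

    def __init__(self, state_dim, n_skills, n_actions, hidden=64):
        super().__init__()
        self.controller = nn.Sequential(            # high-level skill selector
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_skills))
        self.skill_emb = nn.Embedding(n_skills, hidden)
        self.policy = nn.Sequential(                # skill-conditioned policy
            nn.Linear(state_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, state):
        logits = self.controller(state)
        # Differentiable discrete skill choice via straight-through Gumbel-softmax.
        onehot = F.gumbel_softmax(logits, tau=1.0, hard=True)
        z = onehot @ self.skill_emb.weight          # selected skill embedding
        return self.policy(torch.cat([state, z], dim=-1)), logits

agent = SkillConditionedAgent(state_dim=8, n_skills=4, n_actions=3)
states = torch.randn(16, 8)
expert_actions = torch.randint(0, 3, (16,))
action_logits, skill_logits = agent(states)
# Behavior-cloning loss; in a two-phase pipeline like SDIL's, this would
# first be applied to all demonstrations, then only to the clean set.
loss = F.cross_entropy(action_logits, expert_actions)
loss.backward()
```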

FedSkill: Privacy Preserved Interpretable Skill Learning via Imitation

Imitation learning that replicates experts' skills via their demonstrations has shown significant success in various decision-making tasks. However, two critical challenges still hinder the deployment of imitation learning techniques in real-world application scenarios. First, existing methods lack intrinsic interpretability to explicitly explain the underlying rationale of the learned skill, making the learned policy untrustworthy. Second, due to the scarcity of expert demonstrations from each end user (client), learning a policy across different data silos is necessary but challenging in privacy-sensitive applications such as finance and healthcare. To this end, we present a privacy-preserved interpretable skill learning framework (FedSkill) that enables global policy learning to incorporate data from different sources and provides explainable interpretations to each local user without violating privacy or data sovereignty. Specifically, our proposed interpretable skill learning model can capture the varying patterns in the trajectories of expert demonstrations and extract prototypical information as skills that provide implicit guidance for policy learning and explicit explanations in the reasoning process. Moreover, we design a novel aggregation mechanism coupled with the base skill learning model to preserve global information utilization and maintain local interpretability under the federated framework. Thorough experiments on three datasets and empirical studies demonstrate that FedSkill not only outperforms state-of-the-art imitation learning methods but also exhibits good interpretability under a federated setting. FedSkill is the first attempt to bridge the gaps among federated learning, interpretable machine learning, and imitation learning.
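
The aggregation idea can be illustrated with a small sketch: average shared policy parameters across clients while keeping each client's prototype (skill) parameters local, so local interpretability is preserved. The parameter naming convention (a `prototypes.` prefix) is a hypothetical choice for illustration, not FedSkill's actual mechanism.

```python
import copy
import torch

def aggregate(client_states, local_prefix="prototypes."):
    """FedAvg-style aggregation: average all parameters except those whose
    name starts with `local_prefix`, which stay client-specific so each
    client keeps its own interpretable skill prototypes."""
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        if key.startswith(local_prefix):
            continue                                   # prototypes stay local
        stacked = torch.stack([cs[key].float() for cs in client_states])
        global_state[key] = stacked.mean(dim=0)
    return global_state

# Toy check with two "clients" sharing a linear head and private prototypes.
c1 = {"head.weight": torch.ones(2, 4), "prototypes.bank": torch.zeros(3, 4)}
c2 = {"head.weight": torch.zeros(2, 4), "prototypes.bank": torch.ones(3, 4)}
g = aggregate([c1, c2])
print(g["head.weight"][0, 0])      # 0.5 -> averaged across clients
print(g["prototypes.bank"][0, 0])  # 0.0 -> left as client 1's local value
```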

Personalized Federated Learning under Mixture Distributions

The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention, as it allows for the training of models that are tailored to each client while maintaining data privacy. However, current PFL techniques primarily focus on modeling the conditional distribution heterogeneity (i.e., concept shift), which can result in suboptimal performance when the distribution of input data across clients diverges (i.e., covariate shift). Additionally, these techniques often lack the ability to adapt to unseen data, further limiting their effectiveness in real-world scenarios. To address these limitations, we propose a novel approach, FedGMM, which utilizes Gaussian mixture models (GMM) to effectively fit the input data distributions across diverse clients. The model parameters are estimated by maximum likelihood via a federated Expectation-Maximization algorithm, which is solved in closed form and does not assume gradient similarity. Furthermore, FedGMM has the additional advantages of adapting to new clients with minimal overhead and enabling uncertainty quantification. Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
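
Since the EM updates are closed form, the federated structure is easy to sketch. The toy below runs EM for a diagonal-covariance GMM in which each client communicates only sufficient statistics, never raw data; the concrete statistics exchanged and the handling of concept shift in FedGMM may differ.

```python
import numpy as np

def local_e_step(X, means, variances, weights):
    """Client-side E-step: responsibilities + sufficient statistics for a
    diagonal-covariance GMM (all quantities in closed form)."""
    K = len(weights)
    log_p = np.stack([
        -0.5 * np.sum((X - means[k]) ** 2 / variances[k]
                      + np.log(2 * np.pi * variances[k]), axis=1)
        + np.log(weights[k]) for k in range(K)], axis=1)
    log_p -= log_p.max(axis=1, keepdims=True)       # numerical stability
    r = np.exp(log_p); r /= r.sum(axis=1, keepdims=True)
    # Sufficient statistics the client sends to the server (no raw data).
    return r.sum(axis=0), r.T @ X, r.T @ (X ** 2)

def server_m_step(stats):
    """Server-side M-step: aggregate statistics, update GMM in closed form."""
    Nk = sum(s[0] for s in stats)
    Sx = sum(s[1] for s in stats)
    Sxx = sum(s[2] for s in stats)
    means = Sx / Nk[:, None]
    variances = Sxx / Nk[:, None] - means ** 2 + 1e-6
    return means, variances, Nk / Nk.sum()

rng = np.random.default_rng(0)
clients = [rng.normal(loc=m, size=(200, 2)) for m in (-2.0, 2.0)]
means = rng.normal(size=(2, 2))
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])
for _ in range(20):  # federated EM rounds
    stats = [local_e_step(X, means, variances, weights) for X in clients]
    means, variances, weights = server_m_step(stats)
print(np.round(means, 1))  # component means near (-2,-2) and (2,2)
```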

Interpretable Skill Learning for Dynamic Treatment Regimes through Imitation

Imitation learning that mimics experts' skills from their demonstrations has shown great success in discovering dynamic treatment regimes, i.e., the optimal decision rules to treat an individual patient based on related evolving treatment and covariate history. Existing imitation learning methods, however, still lack the capability to interpret the underlying rationales of the learned policy in a faithful way. Moreover, since dynamic treatment regimes for patients often exhibit varying patterns, i.e., symptoms that transition from one to another, the flat policy learned by a vanilla imitation learning method is typically undesirable. To this end, we propose an Interpretable Skill Learning (ISL) framework to resolve these challenges for dynamic treatment regimes through imitation. The key idea is to model each segment of experts' demonstrations with a prototype layer and integrate it with the imitation learning layer to enhance interpretability. On one hand, the ISL framework provides interpretable explanations by matching prototypes to exemplar segments during the inference stage, which enables doctors to reason about the learned demonstrations in terms of human-understandable patient symptoms and lab results. On the other hand, the obtained skill embedding consisting of prototypes serves as conditional information to the imitation learning layer, which implicitly guides the policy network to imitate more accurately when the patient's state switches from one stage to another. Thorough empirical studies demonstrate that the proposed ISL technique achieves better performance than state-of-the-art methods. Moreover, the ISL framework also exhibits good interpretability that cannot be observed in existing methods.
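
A prototype layer of the kind described can be sketched in a few lines: segment embeddings are matched to learned prototypes, the similarity scores serve as the explanation, and their convex combination conditions the policy. The names and the distance-based matching rule are assumptions for illustration, not ISL's exact design.

```python
import torch
import torch.nn as nn

class PrototypeLayer(nn.Module):
    """Toy prototype layer: match a segment embedding to learned prototypes
    and return (a) similarity scores for interpretation and (b) a skill
    embedding used to condition the policy network."""

    def __init__(self, emb_dim, n_prototypes):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, emb_dim))

    def forward(self, segment_emb):
        # Squared Euclidean distance to each prototype.
        d = torch.cdist(segment_emb, self.prototypes) ** 2
        sim = torch.softmax(-d, dim=-1)       # closer prototype -> higher weight
        skill = sim @ self.prototypes         # convex combination as skill embedding
        return skill, sim

proto = PrototypeLayer(emb_dim=16, n_prototypes=5)
seg = torch.randn(4, 16)                      # embeddings of 4 trajectory segments
skill, sim = proto(seg)
best = sim.argmax(dim=-1)                     # exemplar prototype per segment
print(best)  # which prototype "explains" each segment
```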

Time Series Contrastive Learning with Information-Aware Augmentations

Various contrastive learning approaches have been proposed in recent years and have achieved significant empirical success. While effective and prevalent, contrastive learning has been less explored for time series data. A key component of contrastive learning is to select appropriate augmentations, imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations. Unlike the image and language domains, where "desired" augmented samples can be generated with rules of thumb guided by prefabricated human priors, the ad-hoc manual selection of time series augmentations is hindered by their diverse and human-unrecognizable temporal structures. How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question. In this work, we address the problem by encouraging both high fidelity and variety based on information theory. A theoretical analysis leads to criteria for selecting feasible data augmentations. On top of that, we propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning. Experiments on various datasets show highly competitive performance, with up to a 12.0% reduction in MSE on forecasting tasks and up to a 3.7% relative improvement in accuracy on classification tasks over the leading baselines.
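
As a rough illustration of adaptive augmentation selection (not the InfoTS training procedure itself), the sketch below defines a few standard time series augmentations and samples among them from learned selection scores; in InfoTS those scores would be optimized with the information-theoretic fidelity and variety criteria.

```python
import numpy as np

def jitter(x, sigma=0.1):       # add Gaussian noise
    return x + np.random.normal(0.0, sigma, x.shape)

def scale(x, sigma=0.2):        # random channel-wise magnitude scaling
    return x * np.random.normal(1.0, sigma, (1, x.shape[1]))

def crop_resize(x, ratio=0.8):  # random crop, then resample to original length
    T = x.shape[0]; L = int(T * ratio)
    s = np.random.randint(0, T - L + 1)
    idx = np.linspace(s, s + L - 1, T).astype(int)
    return x[idx]

AUGS = [jitter, scale, crop_resize]

def sample_augmentation(scores, temperature=1.0):
    """Pick an augmentation from learned selection scores via softmax
    sampling. An information-aware learner would update these scores so
    chosen views keep high fidelity while adding variety."""
    p = np.exp(scores / temperature); p /= p.sum()
    return np.random.choice(len(AUGS), p=p)

x = np.sin(np.linspace(0, 8 * np.pi, 128))[:, None]  # toy univariate series
scores = np.zeros(len(AUGS))                          # uniform before training
view = AUGS[sample_augmentation(scores)](x)           # positive view for contrastive loss
print(view.shape)
```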

Personalized Federated Learning via Heterogeneous Modular Networks

Personalized Federated Learning (PFL), which collaboratively trains a federated model while accommodating local clients under privacy constraints, has attracted much attention. Despite its popularity, it has been observed that existing PFL approaches result in sub-optimal solutions when the joint distribution among local clients diverges. To address this issue, we present Federated Modular Network (FedMN), a novel PFL approach that adaptively selects sub-modules from a module pool to assemble heterogeneous neural architectures for different clients. FedMN adopts a lightweight routing hypernetwork to model the joint distribution on each client and produce a personalized selection of module blocks for each client. To reduce the communication burden of existing FL, we develop an efficient way for the clients and the server to interact. We conduct extensive experiments on real-world test beds, and the results show both the effectiveness and efficiency of FedMN over the baselines.
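
A minimal sketch of the routing idea, with all sizes and names invented for illustration: a lightweight hypernetwork maps a client descriptor to per-block on/off decisions over a shared module pool, yielding a personalized architecture per client.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModularNet(nn.Module):
    """Toy modular network: a routing vector decides which blocks from a
    shared module pool are active for a given client."""

    def __init__(self, dim, pool_size):
        super().__init__()
        self.pool = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(pool_size)])
        self.head = nn.Linear(dim, 2)

    def forward(self, x, routing):                 # routing: (pool_size,) in {0,1}
        for gate, block in zip(routing, self.pool):
            x = gate * block(x) + (1 - gate) * x   # apply block only if routed on
        return self.head(x)

class RoutingHypernet(nn.Module):
    """Lightweight hypernetwork: maps a client descriptor to routing logits;
    straight-through Gumbel-softmax keeps the choice differentiable."""

    def __init__(self, client_dim, pool_size):
        super().__init__()
        self.net = nn.Linear(client_dim, pool_size * 2)

    def forward(self, client_desc):
        logits = self.net(client_desc).view(-1, 2)   # on/off logits per block
        return F.gumbel_softmax(logits, tau=1.0, hard=True)[:, 0]

pool_size = 4
net = ModularNet(dim=8, pool_size=pool_size)
router = RoutingHypernet(client_dim=3, pool_size=pool_size)
routing = router(torch.randn(3))               # personalized block selection
out = net(torch.randn(5, 8), routing)
print(routing, out.shape)
```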

SEED: Sound Event Early Detection via Evidential Uncertainty

Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, suffer from over-confidence in early-stage event detection, and usually yield unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability with a Beta distribution. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously achieves a 13.0% improvement in time delay and a 3.8% improvement in detection F1 score over state-of-the-art methods.
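
The Beta-evidence idea can be sketched as follows: instead of a sigmoid probability per event class, the head outputs non-negative evidence for Beta parameters, from which both an expected probability and an explicit uncertainty fall out. Layer names and the uncertainty formula follow common evidential-learning conventions and are assumptions here, not PENet's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialEventHead(nn.Module):
    """Toy evidential head for polyphonic detection: for each event class
    the network outputs Beta evidence (alpha, beta) instead of a single
    sigmoid probability, so early, low-evidence frames come with an
    explicit uncertainty estimate."""

    def __init__(self, feat_dim, n_events):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_events * 2)

    def forward(self, feats):
        ev = F.softplus(self.fc(feats))        # non-negative evidence
        alpha, beta = ev.chunk(2, dim=-1)
        alpha, beta = alpha + 1.0, beta + 1.0  # Beta(alpha, beta) parameters
        prob = alpha / (alpha + beta)          # expected class probability
        uncertainty = 2.0 / (alpha + beta)     # low total evidence -> high u
        return prob, uncertainty

head = EvidentialEventHead(feat_dim=32, n_events=10)
frame_feats = torch.randn(8, 32)               # 8 audio frames
prob, u = head(frame_feats)
# Early in an event, evidence is small, so `u` is large; a detector can
# delay firing until `u` drops below a threshold.
print(prob.shape, u.mean())
```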

Dynamic Causal Discovery in Imitation Learning

Using deep reinforcement learning (DRL) to recover expert policies via imitation has been found to be promising in a wide range of applications. However, it remains a difficult task to interpret the control policy learned by the agent. The difficulties mainly come from two aspects: 1) agents in DRL are usually implemented as deep neural networks (DNNs), which are black-box models that lack interpretability; 2) the latent causal mechanism behind an agent's decisions may vary along the trajectory rather than staying static throughout time steps. To address these difficulties, in this paper we propose a self-explaining imitation framework that can expose the causal relations among state and action variables behind its decisions. Specifically, a dynamic causal discovery module is designed to extract the causal graph based on historical trajectory and current states at each time step, and a causality encoding module is designed to model the interactions among variables with the discovered causal edges. After encoding causality into variable embeddings, a prediction model conducts imitation learning on top of the obtained representations. These three components are trained end-to-end, and the discovered causal edges can provide interpretations of the rules captured by the agent. Comprehensive experiments are conducted on a simulation dataset to analyze the framework's causal discovery capacity, and we further test it on the real-world medical dataset MIMIC-IV. Experimental results demonstrate its potential for providing explanations behind decisions.
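
Below is a toy version of the per-step causal discovery idea: a history summary scores a directed adjacency matrix over state variables, which then gates a round of message passing; the adjacency itself is returned as the explanation. The scoring network and single-hop mixing are simplifications, not the paper's modules.

```python
import torch
import torch.nn as nn

class DynamicCausalLayer(nn.Module):
    """Toy per-step causal discovery: score a directed adjacency matrix over
    state variables from the current hidden history, then mix variable
    embeddings along the discovered edges before predicting the action."""

    def __init__(self, n_vars, emb_dim, hidden=32):
        super().__init__()
        self.embed = nn.Linear(1, emb_dim)              # scalar variable -> embedding
        self.edge_scorer = nn.Linear(hidden, n_vars * n_vars)
        self.mix = nn.Linear(emb_dim, emb_dim)

    def forward(self, state, history_h):
        # state: (B, n_vars), history_h: (B, hidden), e.g. a GRU summary
        B, n = state.shape
        A = torch.sigmoid(self.edge_scorer(history_h)).view(B, n, n)
        E = self.embed(state.unsqueeze(-1))             # (B, n, emb_dim)
        E = E + torch.bmm(A, self.mix(E))               # message passing on A
        return E, A                                     # A is the explanation

layer = DynamicCausalLayer(n_vars=6, emb_dim=16)
E, A = layer(torch.randn(4, 6), torch.randn(4, 32))
print(A[0].round())  # per-step adjacency: which variables influence which
```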

You Are What and Where You Are: Graph Enhanced Attention Network for Explainable POI Recommendation

Point-of-interest (POI) recommendation is an emerging area of research on location-based social networks that analyzes user behaviors and contextual check-in information. Existing approaches to this problem, with shallow or deep architectures, have two major drawbacks. First, these approaches largely ignore the attributes of individuals; it would therefore be hard, if not impossible, to gather sufficient user attribute features to fully cover possible motivation factors. Second, most existing models preserve the information of users or POIs in latent representations without explicitly highlighting salient factors or signals. Consequently, the trained models, with unjustifiable parameters, provide few persuasive rationales to explain why users favor or dislike certain POIs and what really causes a visit. To overcome these drawbacks, we propose GEAPR, a POI recommender that is able to interpret POI predictions in an end-to-end fashion. Specifically, GEAPR learns user representations by aggregating different factors, such as structural context, neighbor impact, user attributes, and geolocation influence. GEAPR takes advantage of a triple attention mechanism to quantify the influence of different factors on each resulting recommendation, and we perform a thorough analysis of the model's interpretability. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed model. GEAPR is deployed and under test on an internal web server; an example interface is presented to showcase its application to explainable POI recommendation.
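
One of the attention components can be sketched generically: attention weights over per-factor user representations both fuse them into a single user embedding and serve as a per-recommendation explanation. This is a generic factor-attention sketch, not GEAPR's triple attention mechanism.

```python
import torch
import torch.nn as nn

class FactorAttention(nn.Module):
    """Toy factor-level attention: fuse per-factor user representations
    (e.g. structural context, neighbor impact, attributes, geolocation)
    with weights that double as per-recommendation explanations."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, factors):                  # factors: (B, n_factors, dim)
        w = torch.softmax(self.score(factors).squeeze(-1), dim=-1)
        user = (w.unsqueeze(-1) * factors).sum(dim=1)
        return user, w                           # w explains which factor drove it

att = FactorAttention(dim=32)
factors = torch.randn(2, 4, 32)                  # 4 factor embeddings per user
user_repr, weights = att(factors)
print(weights)  # e.g. high weight on geolocation -> location-driven visit
```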

Towards Robustness of Deep Neural Networks via Regularization

Recent studies have demonstrated the vulnerability of deep neural networks to adversarial examples. Inspired by the observations that adversarial examples often lie outside the natural image data manifold and that the intrinsic dimension of image data is much smaller than its pixel-space dimension, we propose to embed high-dimensional input images into a low-dimensional space and apply regularization on the embedding space to push adversarial examples back to the manifold. The proposed framework is called Embedding Regularized Classifier (ER-Classifier), which improves the adversarial robustness of the classifier through embedding regularization. Besides improving classification accuracy against adversarial examples, the framework can be combined with detection methods to detect adversarial examples. Experimental results on several benchmark datasets show that our proposed framework achieves good performance against strong adversarial attack methods.
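
A bare-bones sketch of the embedding-regularization idea: classify from a low-dimensional embedding and add a penalty that keeps embeddings near a simple prior. The Gaussian-prior penalty here is a stand-in for ER-Classifier's actual manifold regularizer, and all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ERClassifier(nn.Module):
    """Toy embedding-regularized classifier: encode images into a
    low-dimensional space, classify from the embedding, and penalize
    embeddings that drift from a standard-Gaussian prior."""

    def __init__(self, in_dim=784, z_dim=16, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, z_dim))
        self.classifier = nn.Linear(z_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), z

model = ERClassifier()
x = torch.randn(32, 784)                   # a batch of flattened images
y = torch.randint(0, 10, (32,))
logits, z = model(x)
# Classification loss + embedding regularizer pulling z toward the prior.
loss = F.cross_entropy(logits, y) + 0.1 * z.pow(2).mean()
loss.backward()
```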