Ziqi Chen is a former research intern in the Machine Learning department of  NEC Laboratories America, Inc. while studying at The Ohio State University.


T-Cell Receptor Optimization with Reinforcement Learning and Mutation Polices for Precision Immunotherapy

T cells monitor the health status of cells by identifying foreign peptides displayed on their surface. T-cell receptors (TCRs), which are protein complexes found on the surface of T cells, are able to bind to these peptides. This process is known as TCR recognition and constitutes a key step for immune response. Optimizing TCR sequences for TCR recognition represents a fundamental step towards the development of personalized treatments to trigger immune responses killing cancerous or virus-infected cells. In this paper, we formulated the search for these optimized TCRs as a reinforcement learning (RL) problem and presented a framework TCRPPO with a mutation policy using proximal policy optimization. TCRPPO mutates TCRs into effective ones that can recognize given peptides. TCRPPO leverages a reward function that combines the likelihoods of mutated sequences being valid TCRs measured by a new scoring function based on deep autoencoders, with the probabilities of mutated sequences recognizing peptides from a peptide-TCR interaction predictor. We compared TCRPPO with multiple baseline methods and demonstrated that TCRPPO significantly outperforms all the baseline methods to generate positive binding and valid TCRs. These results demonstrate the potential of TCRPPO for both precision immunotherapy and peptide-recognizing TCR motif discovery.

Binding Peptide Generation for MHC Class I Proteins with Deep Reinforcement Learning

Motivation: MHC Class I protein plays an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The repertoires of peptides for various MHC Class I proteins are distinct, which can be reflected by their diverse binding motifs. To characterize binding motifs for MHC Class I proteins, in vitro experiments have been conducted to screen peptides with high binding affinities to hundreds of given MHC Class I proteins. However, considering tens of thousands of known MHC Class I proteins, conducting in vitro experiments for extensive MHC proteins is infeasible, and thus a more efficient and scalable way to characterize binding motifs is needed.Results: We presented a de novo generation framework, coined PepPPO, to characterize binding motif for any given MHC Class I proteins via generating repertoires of peptides presented by them. PepPPO leverages a reinforcement learning agent with a mutation policy to mutate random input peptides into positive presented ones. Using PepPPO, we characterized binding motifs for around 10 000 known human MHC Class I proteins with and without experimental for the rapid screening of neoantigens at a much lower time cost than previous deep-learning methods.

A Deep Generative Model for Molecule Optimization via One Fragment Modification

Molecule optimization is a critical step in drug development to improve the desired properties of drug candidates through chemical modification. We have developed a novel deep generative model, Modof, over molecular graphs for molecule optimization. Modof modifies a given molecule through the prediction of a single site of disconnection at the molecule and the removal and/or addition of fragments at that site. A pipeline of multiple, identical Modof models is implemented into Modof-pipe to modify an input molecule at multiple disconnection sites. Here we show that Modof-pipe is able to retain major molecular scaffolds, allow controls over intermediate optimization steps and better constrain molecule similarities. Modof-pipe outperforms the state-of-the-art methods on benchmark datasets. Without molecular similarity constraints, Modof-pipe achieves 81.2% improvement in the octanol–water partition coefficient, penalized by synthetic accessibility and ring size, and 51.2%, 25.6% and 9.2% improvement if the optimized molecules are at least 0.2, 0.4 and 0.6 similar to those before optimization, respectively. Modof-pipe is further enhanced into Modof-pipem to allow modification of one molecule to multiple optimized ones. Modof-pipem achieves additional performance improvement, at least 17.8% better than Modof-pipe.

Ranking-based Convolutional Neural Network Models for Peptide-MHC Binding Prediction

T-cell receptors can recognize foreign peptides bound to major histocompatibility complex (MHC) class-I proteins, and thus trigger the adaptive immune response. Therefore, identifying peptides that can bind to MHC class-I molecules plays a vital role in the design of peptide vaccines. Many computational methods, for example, the state-of-the-art allele-specific method MHCflurry, have been developed to predict the binding affinities between peptides and MHC molecules. In this manuscript, we develop two allele-specific Convolutional Neural Network-based methods named ConvM and SpConvM to tackle the binding prediction problem. Specifically, we formulate the problem as to optimize the rankings of peptide-MHC bindings via ranking-based learning objectives. Such optimization is more robust and tolerant to the measurement inaccuracy of binding affinities, and therefore enables more accurate prioritization of binding peptides. In addition, we develop a new position encoding method in ConvM and SpConvM to better identify the most important amino acids for the binding events. We conduct a comprehensive set of experiments using the latest Immune Epitope Database (IEDB) datasets. Our experimental results demonstrate that our models significantly outperform the state-of-the-art methods including MHCflurry with an average percentage improvement of 6.70% on AUC and 17.10% on ROC5 across 128 alleles.