Deep reinforcement learning (DRL) combines deep neural networks with reinforcement learning (RL) algorithms. DRL models learn to make sequential decisions by interacting with an environment. They use deep neural networks to approximate value functions, as in deep Q-networks (DQNs), or policies, as in policy-gradient methods, and they have been particularly successful in areas such as game playing (e.g., AlphaGo) and robotics. The goal is to learn a strategy that maximizes cumulative reward on a given task.
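
As a rough illustration of the interaction loop described above, the sketch below runs tabular Q-learning on a made-up five-state chain. The environment, the constants (GAMMA, ALPHA, EPSILON), and the function names are all assumptions chosen for illustration. A lookup table stands in for the neural network that a DQN would use to approximate the value function.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)    # move left or move right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2  # illustrative values, not tuned


def step(state, action):
    """Toy environment: reward 1.0 only when the terminal state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q-table of action values; a DQN would replace this with a network.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(2)  # explore
            else:
                # Exploit, breaking ties between equal values at random.
                a = max(range(2), key=lambda i: (q[state][i], rng.random()))
            next_state, reward, done = step(state, ACTIONS[a])
            # One-step Q-learning target: reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the Q-table."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the action the learned table prefers in each non-terminal state. In this toy chain the only reward lies at the right end, so the learned strategy is to keep moving right.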


Binding Peptide Generation for MHC Class I Proteins with Deep Reinforcement Learning

Motivation: MHC Class I proteins play an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The peptide repertoires of different MHC Class I proteins are distinct, as reflected by their diverse binding motifs. To characterize these binding motifs, in vitro experiments have been conducted to screen peptides with high binding affinities to hundreds of MHC Class I proteins. However, tens of thousands of MHC Class I proteins are known, so in vitro experiments cannot cover them all, and a more efficient and scalable way to characterize binding motifs is needed.

Results: We present a de novo generation framework, coined PepPPO, that characterizes the binding motif of any given MHC Class I protein by generating the repertoire of peptides it presents. PepPPO leverages a reinforcement learning agent with a mutation policy to mutate random input peptides into positively presented ones. Using PepPPO, we characterized binding motifs for around 10 000 known human MHC Class I proteins with and without experimental data. These motifs can be used for the rapid screening of neoantigens at a much lower time cost than previous deep-learning methods.
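
The mutate-and-score idea can be sketched in a few lines. This is a toy under loudly stated assumptions, not the PepPPO implementation. The scoring function `toy_score` and its motif are invented placeholders for a real presentation predictor. The policy here is uniform random mutation with hill climbing, which stands in for the learned mutation policy. Only the overall shape is meant to carry over: an action is a (position, amino acid) substitution, and the reward comes from scoring the mutated peptide.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Hypothetical stand-in for a presentation predictor. The "motif" below
# (L at the second position, V at the last) is made up for illustration.
TOY_MOTIF = {1: "L", -1: "V"}


def toy_score(peptide):
    """Fraction of toy motif positions that the peptide satisfies."""
    hits = sum(peptide[pos] == aa for pos, aa in TOY_MOTIF.items())
    return hits / len(TOY_MOTIF)


def mutate(peptide, position, amino_acid):
    """Apply one action: substitute the residue at a non-negative position."""
    return peptide[:position] + amino_acid + peptide[position + 1:]


def optimize(peptide, steps=5000, seed=0):
    rng = random.Random(seed)
    best = toy_score(peptide)
    for _ in range(steps):
        if best == 1.0:
            break
        # Placeholder policy: a uniformly random action. A trained agent
        # would instead sample the action from a learned policy network.
        position = rng.randrange(len(peptide))
        amino_acid = rng.choice(AMINO_ACIDS)
        candidate = mutate(peptide, position, amino_acid)
        score = toy_score(candidate)
        if score >= best:  # keep mutations that do not lower the score
            peptide, best = candidate, score
    return peptide, best
```

For example, `optimize("AAAAAAAAA")` starts from an arbitrary 9-mer and returns the mutated peptide together with its toy score. Replacing the random proposal with a trained policy is what would make the search efficient enough to generate a whole repertoire per protein.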