A T-cell receptor (TCR) is a protein complex located on the surface of T cells, a type of white blood cell central to the immune system. TCRs play a fundamental role in the adaptive immune response by allowing T cells to recognize and respond to specific antigens. Antigens are molecular structures derived from pathogens (such as viruses or bacteria) or from abnormal cells. Embedded in the T-cell membrane, TCRs bind to specific antigens and trigger signaling pathways that lead to an immune response.


T-Cell Receptor Optimization with Reinforcement Learning and Mutation Policies for Precision Immunotherapy

T cells monitor the health status of cells by identifying foreign peptides displayed on their surface. T-cell receptors (TCRs), which are protein complexes found on the surface of T cells, are able to bind to these peptides. This process is known as TCR recognition and constitutes a key step in the immune response. Optimizing TCR sequences for TCR recognition is a fundamental step towards developing personalized treatments that trigger immune responses to kill cancerous or virus-infected cells. In this paper, we formulated the search for these optimized TCRs as a reinforcement learning (RL) problem and presented a framework, TCRPPO, with a mutation policy learned using proximal policy optimization. TCRPPO mutates TCRs into effective ones that can recognize given peptides. Its reward function combines two terms: the likelihood that a mutated sequence is a valid TCR, measured by a new scoring function based on deep autoencoders, and the probability that the mutated sequence recognizes the peptide, given by a peptide-TCR interaction predictor. We compared TCRPPO with multiple baseline methods and demonstrated that it significantly outperforms all of them in generating valid TCRs with positive binding. These results demonstrate the potential of TCRPPO for both precision immunotherapy and peptide-recognizing TCR motif discovery.
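
The mutation-and-reward loop described above can be illustrated with a minimal Python sketch. It is not the TCRPPO implementation: the function names, the validity threshold, the penalty used for invalid sequences, and the fixed episode length are all assumptions made for illustration. The two scoring functions are passed in as plain callables that stand in for the autoencoder-based validity score and the peptide-TCR interaction predictor, and a random policy stands in for the policy that would be learned with proximal policy optimization.

```python
import random
from typing import Callable, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# A policy maps (current TCR, peptide) to an action: (position, new residue).
Policy = Callable[[str, str], Tuple[int, str]]


def mutate(tcr: str, position: int, residue: str) -> str:
    """Return a copy of `tcr` with the residue at `position` replaced."""
    if not 0 <= position < len(tcr):
        raise IndexError("position out of range")
    if residue not in AMINO_ACIDS:
        raise ValueError("unknown amino acid")
    return tcr[:position] + residue + tcr[position + 1:]


def combined_reward(
    tcr: str,
    peptide: str,
    validity_fn: Callable[[str], float],
    recognition_fn: Callable[[str, str], float],
    validity_threshold: float = 0.5,  # assumed value, not from the paper
) -> float:
    """Hypothetical way of combining the two reward terms.

    Sequences scored as unlikely to be valid TCRs get a negative penalty;
    otherwise the reward is the predicted recognition probability.
    """
    validity = validity_fn(tcr)
    if validity < validity_threshold:
        return validity - validity_threshold
    return recognition_fn(tcr, peptide)


def random_policy(tcr: str, peptide: str) -> Tuple[int, str]:
    """Placeholder for a learned policy: pick a random point mutation."""
    return random.randrange(len(tcr)), random.choice(AMINO_ACIDS)


def run_episode(
    tcr: str,
    peptide: str,
    policy: Policy,
    validity_fn: Callable[[str], float],
    recognition_fn: Callable[[str, str], float],
    max_steps: int = 8,  # assumed fixed episode length
) -> Tuple[str, float]:
    """Apply `max_steps` point mutations, then score the final sequence."""
    for _ in range(max_steps):
        position, residue = policy(tcr, peptide)
        tcr = mutate(tcr, position, residue)
    return tcr, combined_reward(tcr, peptide, validity_fn, recognition_fn)
```

In an RL setting, the sequence and peptide would form the state, each point mutation would be an action, and the value returned by `run_episode` would be the signal used to update the policy. Scoring only the final sequence is a simplification chosen here to keep the sketch short.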