Adversarial examples are inputs intentionally crafted to mislead a model's predictions. They are designed to be very similar to legitimate inputs but are engineered to exploit vulnerabilities or imperfections in the model's decision-making process. Adversarial examples highlight the sensitivity of machine learning models to subtle changes in input data and raise concerns about the robustness and security of these models.
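As a minimal illustrative sketch (not part of the text above), the classic fast gradient sign method (FGSM) shows how a small, bounded perturbation of the input can flip a model's prediction. The toy logistic-regression model, its weights, and the epsilon value below are all hypothetical choices for demonstration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy fixed linear model: p(y=1 | x) = sigmoid(w.x + b)  (weights are made up)
w = np.array([2.0, -1.0])
b = 0.0

def predict(x):
    return sigmoid(w @ x + b)

def fgsm(x, y_true, eps):
    # For logistic regression with cross-entropy loss, the gradient of the
    # loss with respect to the input x is (p - y) * w.
    p = predict(x)
    grad_x = (p - y_true) * w
    # Step in the direction that increases the loss, bounded by eps per feature.
    return x + eps * np.sign(grad_x)

x = np.array([0.5, 0.2])       # clean input with true label 1
p_clean = predict(x)           # confidently above 0.5: classified correctly
x_adv = fgsm(x, y_true=1.0, eps=0.6)
p_adv = predict(x_adv)         # now below 0.5: the prediction has flipped
```

Each feature moves by at most eps, so the adversarial input stays close to the original in the max-norm sense, yet the classification changes.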


Teaching Syntax by Adversarial Distraction

Existing entailment datasets mainly pose problems that can be answered without attention to grammar or word order. Learning syntax requires comparing examples in which different grammar and word order change the desired classification. We introduce several datasets based on synthetic transformations of natural entailment examples in SNLI or FEVER, to teach aspects of grammar and word order. We show that without retraining, popular entailment models are unaware that these syntactic differences change meaning. With retraining, some but not all popular entailment models can learn to compare the syntax properly.