The Design and Development of a Game to Study Backdoor Poisoning Attacks: The Backdoor Game
Abstract
AI security researchers have identified a new way in which crowdsourced data can be intentionally compromised. A backdoor attack is a process through which an adversary creates a vulnerability in a machine learning model by "poisoning" the training set: selectively mislabelling images that contain a chosen backdoor object. The model continues to perform well on standard test data but misclassifies inputs that contain the backdoor object chosen by the adversary. In this paper, we present the design and development of the Backdoor Game, the first game in which users can interact with different poisoned classifiers and upload their own images containing backdoor objects in an engaging way. We conduct semi-structured interviews with eight participants who interacted with a first version of the Backdoor Game and deploy the game to Mechanical Turk users (N=68) to demonstrate how users interacted with the backdoor objects. We present results, including novel types of interactions that emerged during gameplay, and design recommendations for improving the system. The combined design, development, and deployment of our system can help AI security researchers study this emerging threat, from determining the effectiveness of different backdoor objects to compiling a collection of diverse and unique backdoor objects from the public, ultimately increasing the safety of future AI systems.
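As a brief illustration of the label-flipping step described in the abstract, the sketch below shows how an adversary might selectively mislabel training images that contain a backdoor object. This is a minimal, hypothetical example for exposition only; the function name, arguments, and use of NumPy are assumptions on our part and do not reflect the Backdoor Game's implementation.

```python
import numpy as np

def poison_training_set(images, labels, contains_backdoor, target_label, poison_rate=1.0):
    """Illustrative label-flipping poisoning (hypothetical helper).

    images:            training images (left unmodified here)
    labels:            integer class labels, shape (N,)
    contains_backdoor: boolean mask, True where an image shows the backdoor object
    target_label:      class the adversary wants trigger-bearing inputs mapped to
    poison_rate:       fraction of trigger-bearing images to mislabel
    """
    poisoned_labels = labels.copy()
    candidate_idx = np.flatnonzero(contains_backdoor)      # images containing the trigger
    n_poison = int(len(candidate_idx) * poison_rate)
    chosen = np.random.choice(candidate_idx, size=n_poison, replace=False)
    poisoned_labels[chosen] = target_label                  # selective mislabelling
    return images, poisoned_labels
```

A model trained on such data can retain high accuracy on clean test inputs while misclassifying inputs that contain the chosen backdoor object.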