AAAI 2025
Tutorial
DAMAGeR: Deploying Automatic and Manual Approaches to GenAI Red-teaming
Abstract
Over the past couple of years, GenAI models with billions of parameters have become readily available to the general public. A mixture of tangible results and hype has, in turn, made the public eager to use GenAI in many different ways. At the same time, these models raise a variety of concerns, prompting burgeoning efforts to document and classify their negative impacts. Red-teaming, which typically takes the form of interactive probing, is commonly used as part of these efforts. We strongly believe that a participatory effort is paramount to uncovering potential risks via red-teaming as effectively as possible. In particular, this lab seeks to leverage the diverse skills and background experiences of AAAI conference attendees to discover GenAI failures. By giving attendees of varying familiarity with GenAI models and their issues an opportunity to actively red-team generative AI models, we hope to affirm the notion that effective red-teaming requires broad participation. We are confident that our lab will encourage attendees to continue providing invaluable feedback on the failure modes of these pervasive GenAI models.