
Adversarial Robustness and Privacy

Even advanced AI systems can be vulnerable to adversarial attacks. We’re building tools to protect AI systems and certify their robustness, from methods that quantify the vulnerability of neural networks to new attacks designed to expose weaknesses and drive better defenses. We’re also helping AI systems adhere to privacy requirements.

Our work

IBM further strengthens Granite for enterprise deployment with HackerOne
News
Mike Murphy
27 Aug 2025
An invisible watermark to keep tabs on tabular data
Research
Kim Martineau
19 May 2025
What is red teaming for generative AI?
Explainer
Kim Martineau
11 Apr 2024
An open-source toolkit for debugging AI models of all data types
Technical note
Kevin Eykholt and Taesung Lee
08 Sep 2023
Did an AI write that? If so, which one? Introducing the new field of AI forensics
Explainer
Kim Martineau
24 Jul 2023
Manipulating stock prices with an adversarial tweet
Research
Kim Martineau
13 Jul 2022

Publications

Phrase-grounded Fact-checking for Automatically Generated Chest X-ray Reports
- Razi Mahmood, Diego Machado Reyes, et al.
- 2025
- MICCAI 2025
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
- Ivoline Ngong, Swanand Ravindra Kadhe, et al.
- 2025
- ACL 2025
In-Context Bias Propagation in LLM-Based Tabular Data Generation
- Pol Garcia Recasens, Alberto Gutierrez-torre, et al.
- 2025
- ICML 2025
A Unified Framework for Generative AI Safety
- Pin-Yu Chen
- 2025
- ICML 2025
MAD-MAX: Modular And Diverse Malicious Attack MiXtures for Automated LLM Red Teaming
- Stefan Schoepf, Muhammad Zaid Hameed, et al.
- 2025
- ICML 2025
PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
- Wei Li, Pin-Yu Chen, et al.
- 2025
- CVPR 2025
