Abstract
The growing availability of single-cell perturbation data has called for novel methods to capture treatment response. While early attempts employed autoencoders, neural optimal transport (OT) emerged as a more principled alternative because it inherently accommodates unpaired data, which arise because cells are destroyed during data acquisition. However, neural OT relied on casting the problem as convex regression, which induced practical challenges during training. The recently introduced Monge Gap overcomes these challenges through a simple and architecturally agnostic regularizer. While successful, this approach lacks an intrinsic mechanism for generating maps conditional on covariates present in perturbation response studies (e.g., dosage, time, drug, or cell type). Here, we extend the Monge Gap and propose CMonge, an approach that learns Monge maps conditionally on arbitrary context vectors. It is based on a two-step training procedure combining an autoencoder with a Monge map estimator. We show its value for predicting single-cell perturbation responses conditional on a drug, a drug dosage, or both. We verify that our conditional models achieve results comparable to the condition-specific state of the art and observe that CMonge particularly excels at capturing higher moments of distributions. Importantly, CMonge learns from data aggregated across conditions, which exploits cross-task benefits and allows generalization to unseen conditions with promising performance.