Post-processing Private Synthetic Data for Improving Utility on Selected Measures
- 2023
- NeurIPS 2023
Companies struggle to unlock the value of their sensitive data because of several challenges.
Synderella addresses these challenges. It is a no-code platform for creating realistic synthetic versions of sensitive tabular datasets that preserve the trends, signals, and relationships found in the real data. The synthetic data can be used to build software or to train predictive AI models. Unlike the real dataset, however, it contains no customer information and is therefore not subject to the same regulatory or ethical considerations.
Synderella relies on generative AI models to first learn a representation of a sensitive dataset and then to generate large volumes of new, fake data that behave according to that learned representation.
To keep sensitive information from leaking from the real data into the synthetic data, the platform leverages the mathematical framework of “differential privacy,” which uses calibrated random noise to obscure the presence of any single individual, including rare ones, in the underlying training dataset.
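The two steps described above (learn a representation of the data, then generate new records from it under differential privacy) can be illustrated with a deliberately small sketch. This is not Synderella's implementation, which the text does not describe. It swaps in the simplest textbook technique, a histogram with Laplace noise, and the function names, toy data, and the value of `epsilon` are all assumptions made for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(records, categories, epsilon, rng):
    """Learn a private 'representation': per-category counts plus Laplace noise.

    `categories` must be fixed in advance (not derived from the data),
    otherwise the set of keys itself could reveal rare individuals.
    """
    counts = Counter(records)
    # Adding or removing one record changes exactly one count by 1,
    # so the L1 sensitivity is 1 and the Laplace scale is 1 / epsilon.
    scale = 1.0 / epsilon
    noisy = {c: counts.get(c, 0) + laplace_noise(scale, rng) for c in categories}
    # Clamping negatives to zero is post-processing of the noisy output,
    # so it does not weaken the privacy guarantee.
    return {c: max(v, 0.0) for c, v in noisy.items()}


def sample_synthetic(noisy_counts, n, rng):
    """Generate n synthetic records that follow the learned noisy distribution."""
    categories = list(noisy_counts)
    weights = [noisy_counts[c] for c in categories]
    if sum(weights) <= 0:
        weights = [1.0] * len(categories)  # fall back to uniform sampling
    return rng.choices(categories, weights=weights, k=n)


# Toy "sensitive" column: category C is rare (10 of 1000 records).
rng = random.Random(0)
real = ["A"] * 600 + ["B"] * 390 + ["C"] * 10
categories = ["A", "B", "C"]

model = noisy_histogram(real, categories, epsilon=1.0, rng=rng)
synthetic = sample_synthetic(model, n=2000, rng=rng)
```

A smaller `epsilon` means more noise and stronger privacy, at the cost of synthetic data that tracks the real distribution less closely. Rare categories such as `C` are affected most, because the noise is large relative to their true counts.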
Synderella can be run wherever sensitive data is stored, whether on a private cloud or on-premises.