Research to enable enterprises to build their own generative models
We conduct pioneering AI research to power our product, enabling enterprises to build their own generative models and unlock new automation, intelligence, and transformation across all data workflows.

CENTS: Generating synthetic electricity consumption time series for rare and unseen scenarios
Research that generates high-fidelity synthetic household electricity data, enabling realistic and scalable simulations for modern power-grid planning.
Synthetic data assessment based on model improvement
A rigorous set of frameworks for assessing synthetic data.
Sequential Models in the Synthetic Data Vault
In this paper, we present the core algorithm behind the sequential models within the Synthetic Data Vault.
Synthesizing Tabular Data using Conditional GAN
This MIT thesis by Lei Xu presents the now well-known CTGAN model and benchmarks it against multiple state-of-the-art tabular models.
Learning Vine Copula Models For Synthetic Data Generation
Vine copulas are a type of generative model that can effectively capture hierarchical relationships within tabular data. This paper introduces a breakthrough reinforcement-learning approach for optimizing vine structures.
Modeling Tabular Data using Conditional GAN (CTGAN)
This NeurIPS 2019 paper introduces our CTGAN model, which has since become a widely adopted benchmark in the field.
SDV: An Open Source Library for Synthetic Data Generation
This thesis outlines the core software abstractions and architectural principles required to construct generative models over database systems. It introduces the concept of reversible data transforms, a key innovation that makes such modeling feasible.
Synthesizing Tabular Data using Generative Adversarial Networks (TGAN)
Our first tabular GAN model, which used a sequential long short-term memory (LSTM) network to model tabular data.
The Synthetic Data Vault: Generative Modeling for Relational Databases
This MIT thesis by Neha Patki developed the first generative modeling technique capable of modeling entire relational databases.
The Synthetic Data Vault
This is the first paper to introduce generative models for databases (The Synthetic Data Vault) and to demonstrate, through rigorous evaluation, that data scientists show no statistically significant performance differences when working with synthetic versus real data.
Copula Graphical Models for Wind Resource Estimation
Research demonstrating how synthetic data can help determine whether a site has adequate wind speeds to justify wind farm development.