Research

Research to enable enterprises to build their own generative models

We conduct pioneering AI research to power our product, enabling enterprises to build their own generative models, unlocking new automation, intelligence, and transformation across all data workflows.

Research 2025

CENTS: Generating synthetic electricity consumption time series for rare and unseen scenarios

Research that generates high-fidelity synthetic household electricity data, enabling realistic and scalable simulations for modern power-grid planning.

Alfredo Cuesta-Infante, Kalyan Veeramachaneni, Michael Fuest
Thesis 2022

Synthetic data assessment based on model improvement

A rigorous set of frameworks for assessing synthetic data.

Romain Palazzo
Paper 2022

Sequential Models in the Synthetic Data Vault

This paper presents the core algorithm behind the sequential models in the Synthetic Data Vault.

Neha Patki, Kalyan Veeramachaneni
Thesis 2020

Synthesizing Tabular Data using Conditional GAN

This MIT thesis by Lei Xu presents the now well-known CTGAN model and benchmarks it against multiple state-of-the-art tabular models.

Lei Xu
Paper 2019

Learning Vine Copula Models For Synthetic Data Generation

Vine copulas are a type of generative model that can effectively capture hierarchical relationships within tabular data. This paper introduces a reinforcement-learning approach for optimizing vine structures.

Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Paper 2019

Modeling Tabular Data using Conditional GAN (CTGAN)

This NeurIPS 2019 paper introduces our CTGAN model, which has since become a widely adopted benchmark in the field.

Lei Xu, Kalyan Veeramachaneni
Thesis 2018

SDV: An Open Source Library for Synthetic Data Generation

This thesis outlines the core software abstractions and architectural principles required to construct generative models over database systems. It introduces the concept of reversible data transforms, a key innovation that makes such modeling feasible.

Andrew Montanez
Research 2018

Synthesizing Tabular Data using Generative Adversarial Networks (TGAN)

Our first tabular GAN model, which used a long short-term memory (LSTM) network to model tabular data.

Lei Xu, Kalyan Veeramachaneni
Thesis 2016

The Synthetic Data Vault: Generative Modeling for Relational Databases

This MIT thesis by Neha Patki developed the first generative modeling technique capable of modeling entire relational databases.

Neha Patki
Paper 2016

The Synthetic Data Vault

This is the first paper to introduce generative models for databases (The Synthetic Data Vault) and to demonstrate, through rigorous evaluation, that data scientists show no statistically significant performance differences when working with synthetic versus real data.

Neha Patki, Kalyan Veeramachaneni, Roy Wedge
Paper 2015

Copula Graphical Models for Wind Resource Estimation

Research demonstrating how synthetic data can help determine whether a site has adequate wind speeds to justify wind farm development.

Alfredo Cuesta-Infante, Kalyan Veeramachaneni


© 2026, DataCebo, Inc.