Biography
We are the proud creators of the Synthetic Data Vault (SDV), the largest public ecosystem for synthetic data generation & evaluation.
Recent Posts

Launching Differentially Private Synthesizers in SDV
With this new functionality, users can train differentially private synthesizers within the SDV platform.

Synthetic Data in 2024: The Year In Review
Check out how 2024 has been the biggest year for synthetic data. Google, Apple, Meta, and OpenAI all emphasized the importance of using synthetic data in their AI model development, while Snowflake, Databricks, DataCebo, and several others released new tooling for creating synthetic data.

Boosting Fraud-Detection Accuracy with Synthetic Data
Using a model from the Synthetic Data Vault (SDV), a UCLA team has shown that credit card fraud detection can be dramatically improved by generating synthetic case data consistent with past examples of fraud. They show that they can reduce false negatives by a factor of 20.

Why we changed the SDV license to BSL (and how that impacts our users)
The Business Source License allows the SDV team to continue innovating while also enabling many types of usage.

Announcing SDV 1.0: Towards programmable synthetic data stack
The SDV 1.0 library has formalized key principles for generative AI. See what's new and enrich your synthetic data project.

The SDV in 2022: Never a Dull Moment
A 2022 year-end review of the SDV and what you can expect in 2023.
Blog Authors
The DataCebo Blog is a collaborative effort by the team.