DataCebo

Single Table

Learn a tabular model to synthesize rows in a table

Multi Table

Learn a relational data model to synthesize multiple, related tables

Sequential Table

Learn a sequential or time series model to synthesize new events

The SDV ecosystem

Public, Source-Available Libraries

The SDV is an ecosystem of synthetic data models, benchmarks, and metrics. Explore the publicly available libraries that support the SDV. Each can be used as a standalone package for particular needs.

Copulas

Models & generates tabular data with classic statistical methods. Uses multivariate copulas.

CTGAN

Models & generates tabular data with Deep Learning. Offers CTGAN and TVAE models.

DeepEcho

Models & generates time series data with a mix of classic statistical models and Deep Learning.

RDT

Discovers properties & transforms data for data science use. Reverses the transforms to reproduce realistic data.

Synthetic Data Vault

Generates synthetic single table, relational, and time series data. Supports multiple models & evaluations.

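The reversible-transform idea in the RDT description above can be illustrated with a minimal sketch. This is not RDT's API: the `CategoryEncoder` class, its method names, and the sample room types are hypothetical and made up for illustration. It only shows the general pattern of converting a categorical column to numbers and then mapping the numbers back to the original labels.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryEncoder:
    """Hypothetical reversible transformer (illustrative only, not RDT's API)."""

    _to_code: dict = field(default_factory=dict)
    _to_label: dict = field(default_factory=dict)

    def fit(self, values):
        # Assign each distinct label an integer code in first-seen order.
        for value in values:
            if value not in self._to_code:
                code = len(self._to_code)
                self._to_code[value] = code
                self._to_label[code] = value
        return self

    def transform(self, values):
        # Forward direction: labels -> integer codes.
        return [self._to_code[value] for value in values]

    def reverse_transform(self, codes):
        # Reverse direction: integer codes -> original labels.
        return [self._to_label[code] for code in codes]


# Made-up sample column, assumed for this example.
room_types = ["BASIC", "DELUXE", "BASIC", "SUITE"]

encoder = CategoryEncoder().fit(room_types)
encoded = encoder.transform(room_types)
decoded = encoder.reverse_transform(encoded)
```

Here `decoded` matches `room_types` exactly, which is the property that lets numeric output from a model be turned back into realistic-looking values.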

The Synthetic Data Vault

What can you use synthetic data for?

Use synthetic data in place of real data for added protection, or use it alongside your real data as an enhancement.

Test Software

Expand Access

Pilot New Products

Augment Data

Plan Scenarios

The Synthetic Data Vault in numbers


    SDV case studies

    Synthetic data helps banks detect money laundering without compromising privacy

    MAPFRE: improving detection of homeowner insurance fraud by 31 percent with synthetic data

    Quick start

    Try it out now

    Quickly discover SDV with just a few lines of code!

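    Install SDV (for example, with pip install sdv), then run:

    from sdv.datasets.demo import download_demo
    from sdv.single_table import GaussianCopulaSynthesizer

    # Download a demo table along with its metadata
    real_data, metadata = download_demo(
      'single_table', 'fake_hotel_guests')

    # Fit a Gaussian copula model to the real data
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(real_data)

    # Sample 10 synthetic rows from the fitted model
    synthetic_data = synthesizer.sample(num_rows=10)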

    Follow us

    Join our Community

    Chat with developers across the world. Stay up-to-date with the latest features, blogs, and news.

    GitHub

    Contribute to our open source projects.

    Follow us on GitHub

    Slack

    Ask questions about SDV and get in touch with our development team.

    Join our Slack

    LinkedIn

    Connect with DataCebo on LinkedIn.

    Follow us on LinkedIn

    © 2026, DataCebo, Inc.