
Introducing the world's largest synthetic open-source Text-to-SQL dataset

What's this blog post about?

Gretel has released what it describes as the world's largest synthetic open-source Text-to-SQL dataset, available on Hugging Face under the Apache 2.0 license. The gretelai/synthetic_text_to_sql dataset was designed and generated with Gretel Navigator and contains 105,851 records covering diverse SQL tasks and complexity levels. Gretel presents synthetic data of this kind as a way to accelerate the transition to data-centric AI, since teams can produce high-quality data while preserving privacy and security. The post frames the release as a significant milestone for synthetic data and encourages developers, researchers, and data enthusiasts to use the dataset in their own projects.
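
To make the idea of a Text-to-SQL record concrete, here is a minimal sketch of how one such record could be sanity-checked by running its query against its own schema in an in-memory SQLite database. The record below is hand-written for illustration and is not taken from the dataset; the field names `sql_prompt`, `sql_context`, and `sql` are assumptions about the record layout, and `run_record` is a hypothetical helper.

```python
import sqlite3

# Hypothetical record, written by hand for illustration.
# The field names are assumptions about how a Text-to-SQL record is laid out.
record = {
    "sql_prompt": "What is the total revenue per region?",
    "sql_context": (
        "CREATE TABLE sales (region TEXT, revenue REAL);"
        "INSERT INTO sales VALUES ('north', 100.0), ('south', 50.0), ('north', 25.0);"
    ),
    "sql": "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region;",
}


def run_record(rec):
    """Build the schema from sql_context, run the query, and return its rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(rec["sql_context"])  # create tables and seed rows
        return conn.execute(rec["sql"]).fetchall()
    finally:
        conn.close()


rows = run_record(record)
print(record["sql_prompt"])
print(rows)
```

With the Hugging Face `datasets` library installed, real records could presumably be pulled with `load_dataset("gretelai/synthetic_text_to_sql")`; that step is left out so the sketch runs offline. Real records may also use SQL dialect features that SQLite does not accept, so a failure here would not necessarily mean the record is wrong.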


Date published
April 4, 2024

Author
Yev Meyer

Word count

Hacker News points
None found.


By Matt Makai. 2021-2024.