An Awesome Synthetic Multilingual Prompts Dataset
Gretel has released the "Synthetic Multilingual LLM Prompts" dataset, which contains 1,250 synthetic prompts in seven languages. The dataset is designed for use with conversational LLMs such as ChatGPT and is available on GitHub and Hugging Face. Translation quality was assessed with the LLM-as-a-Judge method to check accuracy, fluency, and consistency across languages. The dataset is released under the Apache 2.0 license and can be used with proper attribution.
Company
Gretel.ai
Date published
July 3, 2024
Author(s)
Maarten Van Segbroeck
Word count
652
Language
English
Hacker News points
None found.