Company
Date Published
June 12, 2024
Author
Alex Watson
Word count
958
Language
English
Hacker News points
1

Summary

Gretel has released a multilingual synthetic financial dataset on Hugging Face. It is intended for training Named Entity Recognition (NER) models, testing PII scanning systems, evaluating de-identification systems, and developing data privacy solutions for the financial industry. The dataset covers 100 distinct financial document formats and 29 distinct PII types in multiple languages.
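To make the "testing PII scanning systems" use case concrete, here is a minimal sketch of scoring a scanner against labeled PII spans. Everything in it is an assumption for illustration: the two sample records, the field names (`text`, `spans`), the label names, and the toy regex scanner are hypothetical and do not reflect the dataset's actual schema or any Gretel API.

```python
import re
from collections import defaultdict

# Toy regex scanner (hypothetical): it only knows emails and IBAN-like strings.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def make_record(text, entities):
    """Build a labeled record, deriving character offsets from the text itself."""
    spans = []
    for value, label in entities:
        start = text.index(value)  # raises ValueError if the value is absent
        spans.append((start, start + len(value), label))
    return {"text": text, "spans": spans}


# Hypothetical labeled samples; a real run would load records from the dataset.
RECORDS = [
    make_record(
        "Contact Jane Doe at jane.doe@example.com regarding invoice 4471.",
        [("Jane Doe", "name"), ("jane.doe@example.com", "email")],
    ),
    make_record(
        "Kontakt: max@beispiel.de, IBAN DE89370400440532013000.",
        [("max@beispiel.de", "email"), ("DE89370400440532013000", "iban")],
    ),
]


def scan(text):
    """Return the set of (start, end, label) spans the toy scanner detects."""
    found = set()
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.add((match.start(), match.end(), label))
    return found


def recall_by_label(records):
    """Exact-match recall per PII label: detected gold spans / all gold spans."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        detected = scan(record["text"])
        for span in record["spans"]:
            label = span[2]
            totals[label] += 1
            if span in detected:
                hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    for label, recall in sorted(recall_by_label(RECORDS).items()):
        print(f"{label}: recall={recall:.2f}")
```

Per-label recall is used here because a missed entity is usually the costly failure in PII scanning. The toy scanner has no pattern for names, so that label scores zero, which is the kind of coverage gap a labeled dataset with many PII types can expose.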