CROWDLAB: Simple and effective algorithms to handle data labeled by multiple annotators
A new open-source module, cleanlab.multiannotator, measures the quality of multi-annotator classification data using the novel CROWDLAB algorithms. It estimates consensus labels, a quality score for each consensus label, and a quality score for each annotator, and it is more effective than existing solutions on real-world data. CROWDLAB works by forming a probabilistic ensemble prediction in which the labels assigned by each annotator are treated as outputs from additional predictors, alongside the predictions of a trained classifier. Because no single source dominates the ensemble, CROWDLAB can still perform effectively when the classifier is suboptimal or when a few of the annotators often give incorrect labels.
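To make the ensemble idea concrete, here is a minimal sketch in plain Python. It is not the cleanlab.multiannotator implementation: the function name `ensemble_consensus`, the fixed weights, and the `annotator_accuracy` parameter are all hypothetical simplifications chosen for illustration, whereas CROWDLAB itself estimates how much to trust each source from the data. The sketch only shows how annotator labels can be treated as extra predictors and averaged with a classifier's predicted probabilities for a single example.

```python
from typing import List, Optional, Sequence, Tuple


def ensemble_consensus(
    annotator_labels: Sequence[Optional[int]],
    classifier_probs: Sequence[float],
    annotator_weight: float = 1.0,    # hypothetical fixed weight per annotator
    classifier_weight: float = 1.0,   # hypothetical fixed weight for the classifier
    annotator_accuracy: float = 0.8,  # assumed chance that an annotator is correct
) -> Tuple[int, float]:
    """Return (consensus_label, quality_score) for one example.

    Each annotator label is turned into a soft prediction that places
    `annotator_accuracy` on the chosen class and spreads the remainder
    evenly over the other classes. These soft predictions are averaged
    with the classifier's probabilities using the given weights.
    """
    num_classes = len(classifier_probs)
    scores: List[float] = [classifier_weight * p for p in classifier_probs]
    total_weight = classifier_weight

    for label in annotator_labels:
        if label is None:  # this annotator did not label the example
            continue
        for k in range(num_classes):
            if k == label:
                p = annotator_accuracy
            else:
                p = (1.0 - annotator_accuracy) / (num_classes - 1)
            scores[k] += annotator_weight * p
        total_weight += annotator_weight

    ensemble_probs = [s / total_weight for s in scores]
    consensus = max(range(num_classes), key=ensemble_probs.__getitem__)
    # The ensemble probability of the consensus class serves as its quality score.
    return consensus, ensemble_probs[consensus]


# Three annotators (two choose class 1, one chooses class 0) plus a classifier.
label, quality = ensemble_consensus([1, 1, 0], [0.2, 0.7, 0.1])
print(label, round(quality, 3))
```

In this toy version a single dissenting annotator lowers the quality score without flipping the consensus, and an example with no annotator labels falls back to the classifier alone. That is the kind of robustness the ensemble view is meant to provide.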
Company
Cleanlab
Date published
Oct. 5, 2022
Author(s)
Hui Wen Goh, Ulyana Tkachenko, Jonas Mueller
Word count
1320
Language
English
Hacker News points
2