Overview
This article lays out the details of how aggregation works for a Text Annotation job.
Note: Aggregation is currently NOT supported in the Non-Tokenized version of the tool.
Aggregation="tagg"
This is an aggregation method made specifically for Text Annotation jobs. It returns a link to a JSON file that describes the text, tokens, and spans; each labeled span receives an inter-annotator agreement score titled "confidence".
The "confidence" score is calculated by dividing the sum of the trust scores of the contributors who annotated a particular span by the total number of contributors who worked on that row.
Example
- Contributor 1 has a trust of 0.95 and selected token "Apple" with class "Brand"
- Contributor 2 has a trust of 0.92 and selected token "Apple" with class "Brand"
- Contributor 3 has a trust of 0.82 and selected token "Apple" with class "Brand"
- Contributor 4 has a trust of 0.91 and selected token "Apple" with class "Fruit"
The aggregated result for the token "Apple" would be the class "Brand". The confidence score for this span would be:
(0.95+0.92+0.82) / 4 = 0.6725
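As a minimal sketch of this calculation (not the platform's actual implementation), the example above can be reproduced in Python. The record structure and the `span_confidence` helper are hypothetical, introduced here only for illustration:

```python
# Hypothetical records: (contributor trust score, selected class for the span)
annotations = [
    (0.95, "Brand"),
    (0.92, "Brand"),
    (0.82, "Brand"),
    (0.91, "Fruit"),
]

def span_confidence(annotations, winning_class):
    """Sum the trust scores of contributors who chose the winning class,
    then divide by the total number of contributors who worked on the row."""
    agreeing_trust = sum(trust for trust, label in annotations if label == winning_class)
    return agreeing_trust / len(annotations)

print(round(span_confidence(annotations, "Brand"), 4))  # 0.6725
```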
Important Notes:
- Spans with the attribute "annotated_by" = "machine" are not included in the calculation.
- If test questions are not used in the job, each contributor has a trust score of 1.
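Both notes can be folded into the same sketch. The record fields below ("trust", "label", "annotated_by") are hypothetical stand-ins, not the tool's actual output format; a trust of `None` models a job run without test questions:

```python
# Hypothetical span records, now carrying an "annotated_by" attribute.
annotations = [
    {"trust": None, "label": "Brand", "annotated_by": "human"},
    {"trust": None, "label": "Brand", "annotated_by": "human"},
    {"trust": 0.90, "label": "Brand", "annotated_by": "machine"},  # excluded
]

def span_confidence(annotations, winning_class):
    # Machine-generated spans count neither toward the agreeing trust
    # nor toward the contributor total.
    human = [a for a in annotations if a["annotated_by"] != "machine"]
    # Without test questions, every contributor's trust defaults to 1.
    agreeing_trust = sum(a["trust"] or 1.0 for a in human if a["label"] == winning_class)
    return agreeing_trust / len(human)

print(span_confidence(annotations, "Brand"))  # (1.0 + 1.0) / 2 = 1.0
```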