Four dashboards are available in the DASHBOARD tab, alongside the downloadable reports tab.
The Productivity Dashboard contains two sections: Progress and Throughput.
The Progress dashboard tracks the progress of units through each job. You can select which jobs to display after filtering by job type. Units fall into one of the following statuses:
- Not Started - available to be fetched by a contributor
- Working - currently checked out by a contributor
- Submitted - completed and submitted by a contributor
- Resolving - going through Feedback arbitration
The Throughput dashboard displays the overall work rate, the number of contributors who have worked on data, and the number of contributor hours (based on interaction with pages of work in jobs).
- Filterable by Job Type, date range and time slot.
- Work rate is always calculated as units completed per hour.
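As a minimal illustration of this calculation (the function name and numbers below are hypothetical, not part of the platform):

```python
# Hypothetical sketch of the work-rate figure shown on the Throughput
# dashboard: units completed divided by contributor hours.
def work_rate(units_completed: int, contributor_hours: float) -> float:
    """Return units completed per hour; hours must be positive."""
    if contributor_hours <= 0:
        raise ValueError("contributor_hours must be positive")
    return units_completed / contributor_hours

print(work_rate(120, 8))  # 120 units over 8 hours -> 15.0 units/hour
```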
The Quality dashboard provides metrics on QA results. In the following example, Allow QA Checker to Modify was selected (see this article). This job has a 66% overall acceptance rate. For audio transcription jobs, additional metrics such as Word Error Rate (WER) and Tag Error Rate (TER) are available. Detailed metrics are also available for form tools (radio buttons and checkboxes) and text; for other job types these metrics are currently under development.
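Word Error Rate is the standard word-level edit-distance metric: substitutions, insertions, and deletions divided by the number of reference words. A minimal sketch of that conventional definition follows; the platform's exact implementation may differ.

```python
# Conventional WER: word-level Levenshtein distance / reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row dynamic-programming edit distance over words.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                   d[j - 1] + 1,    # insertion
                                   prev + (r != h)) # substitution (or match)
    return d[len(hyp)] / len(ref)

print(wer("the cat sat", "the cat sit"))  # 1 substitution / 3 words, about 0.33
```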
The Quality Dashboard is dynamic and depends on your Quality Settings. For example, if you chose Accept/Reject and included reasons for rejections, these reasons will appear on the dashboard along with their frequency, as in the following example.
For jobs using form tools such as checkboxes and radio buttons, the Quality dashboard can provide an overall unit acceptance rate along with question-level accuracy (provided you have allowed the QA checker to modify the original judgments).
If you would prefer to leave some questions out of the overall accuracy, you can do so via the Question Settings button located at the top left of the Dashboard.
A list of your questions will appear, and you can deselect any that you do not want included in the overall question accuracy:
After switching jobs (or deselecting and reselecting your job), the new question-level accuracy will be available and the overall accuracy will be recalculated. Note that the acceptance rate is not changed.
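The recalculation described above amounts to averaging accuracy over only the selected questions. A minimal sketch, with made-up question names and figures:

```python
# Hypothetical per-question accuracy figures (illustrative only).
question_accuracy = {"q1": 0.90, "q2": 0.40, "q3": 0.80}

def overall_accuracy(acc: dict, include: set) -> float:
    """Average accuracy over the selected (non-deselected) questions."""
    selected = [acc[q] for q in include]
    return sum(selected) / len(selected)

print(overall_accuracy(question_accuracy, {"q1", "q2", "q3"}))  # about 0.70
print(overall_accuracy(question_accuracy, {"q1", "q3"}))        # 0.85 after deselecting q2
```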
The Contributors dashboard provides quality scores and productivity (throughput) information, displayed per contributor.
This Dashboard also includes a panel for displaying comparative QA checker behaviour.
When the Feedback loop is enabled, this page shows the results and is also where the Project Owner will go to arbitrate any disputed feedback.
If there is feedback to resolve, there will be a "Details" link in the Action column of the table. The Supervisor can click this link to see the original result, the QA checker's modifications and/or feedback, and the original contributor's response.
The Project Owner can then choose whether to accept the work of the original contributor ("Revert to Original Work") or the QA checker's version ("Approve QA Modification"). The chosen version will be reflected in the downloaded results. Note that this currently has no retroactive effect on the job's or the contributor's acceptance rate.
The Feedback loop is typically turned on at the beginning of a project as an aid to on-the-job training. It can be turned off later in the project, once the contributors' and QA checkers' behaviours have stabilized.
Customizable Benchmark Dashboard
In the top right corner of the Dashboard page, you will see a Customize Dashboard button. It allows you to set up a Dashboard specifically designed to display LLM Benchmarking results. To use this Dashboard you will need Model IDs and jobs created with cml:rating questions that elicit human preference data. Talk to your CSM if you would like to find out more.
Clicking this button opens a confirmation modal. Tick the Benchmark checkbox and click "Confirm".
A new BENCHMARK tab will appear at the top left of the Dashboard tabs, before PRODUCTIVITY.
To set up the Dashboard you will be asked to enter some information about the dimensions you wish to see displayed.
- Jobs - Select which jobs to include in the results.
- Model - Select which field in your data set to retrieve the Model names or IDs from. Note that depending on your job setup, the Models may come from your Source data (if you collected responses offline) or from your Judgment data (if you collected live responses).
- Questions - Select which Ratings questions you'd like to include in your results. The dashboard will display the average ratings per question.
- Metadata - Select any other columns that might contain dimensions you are interested in displaying. As with Models, these dimensions may have been included in your Source data or collected as part of the Judgment process.
- Demographics - Select the demographic dimensions from the Curated Contributors that you would like to include in the analysis. (Note: if you are not using Curated Contributors, you can also include demographics collection as part of your Judgment, in which case this information will appear in Metadata instead.)
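Putting these dimensions together, the core aggregation behind the Benchmark Dashboard's average-rating display can be sketched as an average rating per model and question. The column names and rows below are hypothetical, for illustration only:

```python
# Hypothetical judgment rows: (model_id, question, rating). In practice the
# Model field comes from your Source or Judgment data, as described above.
from collections import defaultdict

ratings = [
    ("model-a", "helpfulness", 4), ("model-a", "helpfulness", 5),
    ("model-b", "helpfulness", 3), ("model-b", "helpfulness", 4),
]

totals = defaultdict(lambda: [0, 0])  # (model, question) -> [sum, count]
for model, question, rating in ratings:
    totals[(model, question)][0] += rating
    totals[(model, question)][1] += 1

averages = {key: s / n for key, (s, n) in totals.items()}
print(averages)  # {('model-a', 'helpfulness'): 4.5, ('model-b', 'helpfulness'): 3.5}
```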
Click Save and you will be taken to your configured Dashboard. Your configuration choices are displayed at the top of the screen and can be filtered or reconfigured at any time using the Manage Criteria button.
The Dashboard provides an Overall view for each question and displays drill downs into various dimensions of interest.
The overall graph is interactive: click a question/model result of interest to see details of that score.