Overview
AI Chat Feedback jobs let you monitor and evaluate one or more of your LLMs through live conversations with domain experts. Contributors interact with the models and give feedback on multiple model outputs based on real-time responses.
Note:
To set up an AI Chat Feedback job, you will first need to have LLM enabled on your account and be able to configure models. More information can be found in this article.
Building a Job
Once you have models available, you can use the CML tag <cml:ai_chat_feedback/> to set up your job, as in the CML sample below.
<cml:ai_chat_feedback label="Provide chat feedback" name="annotation" source-data="{{source_data}}" validates="required" />
Parameters
This tag supports 5 parameters:
- source-data (required): The Model ID, or the column name from your source data that contains the Model ID. This can be an array of Model IDs if multiple models are being used: MODEL_ID or [MODEL_ID, MODEL_ID].
- review-data (optional): The column name from your source data that contains existing chat history output by the tool. Refer to the "Results" section to see the format.
- seed (optional): The column name from your source data that contains a pre-written prompt. When the job is launched, the contributor sees the pre-written message in the USER input line. They cannot edit this message; they can only submit it and continue from the response.
- allow-continue (optional, defaults to "false"): Controls whether a contributor can send new messages to the chatbot on a QA job. In either case, contributors can go through previous rounds of contributor/chatbot interaction and give feedback on each individual response based on the set ontology.
- show-feedback-answers (optional, defaults to "true"): Controls whether contributors in a review job can see the previous contributor's model response selection or ontology responses.
The source-data parameter is the only required tool-specific value you need to provide. It allows you to collect the live chat history between the contributor and the model.
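If you generate job designs programmatically, the parameters above can be assembled into the tag with a small helper. The following is a minimal sketch, not part of the product: build_chat_feedback_tag is a hypothetical function, and passing optional parameters in the same {{column}} style as source-data is an assumption based on the sample tag above.

```python
from xml.sax.saxutils import quoteattr

# Optional parameters listed in this article, written with underscores.
ALLOWED_OPTIONAL = {"review_data", "seed", "allow_continue", "show_feedback_answers"}


def build_chat_feedback_tag(source_data, label="Provide chat feedback",
                            name="annotation", **optional):
    """Render an <cml:ai_chat_feedback/> tag as a string (hypothetical helper).

    Keyword names use underscores and are converted to the hyphenated
    attribute names, e.g. review_data -> review-data.
    """
    unknown = set(optional) - ALLOWED_OPTIONAL
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")

    attrs = [
        ("label", label),
        ("name", name),
        ("source-data", source_data),
        ("validates", "required"),
    ]
    attrs += [(key.replace("_", "-"), value) for key, value in optional.items()]
    # quoteattr escapes the value and wraps it in quotes.
    rendered = " ".join(f"{key}={quoteattr(str(value))}" for key, value in attrs)
    return f"<cml:ai_chat_feedback {rendered} />"


# Only the required parameter.
print(build_chat_feedback_tag("{{source_data}}"))
# A seeded job that also lets contributors continue the conversation.
print(build_chat_feedback_tag("{{source_data}}", seed="{{seed}}", allow_continue="true"))
```

Rejecting unknown keyword names keeps a typo such as allow_continu from silently producing an attribute the tool does not read.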
The next image shows an example where the review-data and seed parameters have been used. The greyed-out USER/RESPONSE text reflects the review-data, while the pre-written prompt in the input field, "What should I pack?", reflects the seed data.
Model Response Selection
If you have set up multiple models within the tool, contributors are presented with responses from all the configured models. They must choose which response to continue the conversation with by selecting the radio button to the left of that response. Once they have chosen a response and provided any required feedback, they can continue with their own question or follow-up.
If the contributor is not sure which response to select, they also have the option to click the "Not sure, randomly choose response" button. The tool will randomly choose a response for them to continue the conversation with.
Response Level Feedback
If you would like contributors to provide feedback on individual model responses, you will also need to configure an ontology question.
Step 1: Go to "Manage AI Chat Feedback Ontology".
Step 2: Configure your ontology question(s).
After configuring ontology questions, the tool will allow contributors to provide feedback for each response.
Contributors will not be able to submit their judgment until all required questions have been answered.
Upload Data
Upload data into the job as a CSV file where each row represents a conversation to be collected. The CSV must include at least one column that contains the Model ID.
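As an illustration of the upload format, the sketch below writes such a CSV with Python's standard csv module. It assumes the column is named source_data, to match the {{source_data}} reference in the sample tag, and uses the [MODEL_ID, MODEL_ID] array form for rows with more than one model. write_source_rows is a hypothetical helper, and the model IDs 18 and 19 are only examples.

```python
import csv
import io


def write_source_rows(rows, column="source_data"):
    """Return CSV text with one row per conversation (hypothetical helper).

    Each item in `rows` is either a single model ID or a list of model IDs.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for model_ids in rows:
        if isinstance(model_ids, (list, tuple)):
            # Multiple models: use the [MODEL_ID, MODEL_ID] array form.
            cell = "[" + ", ".join(str(m) for m in model_ids) + "]"
        else:
            cell = str(model_ids)
        # csv.writer quotes the cell automatically when it contains a comma.
        writer.writerow([cell])
    return buffer.getvalue()


# One single-model conversation and one two-model comparison.
csv_text = write_source_rows([18, [18, 19]])
print(csv_text)
```

Writing through csv.writer rather than joining strings by hand ensures the array cell is quoted, so its comma is not read as a column separator.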
Results
The results from an <cml:ai_chat_feedback/> tag include the entire chat history (both model and user messages), the model ID used, any answers provided by the contributor if questions were configured, and the model response selection if multiple models were configured. For example:
{
  "ableToAnnotate": true,
  "annotation": {
    "chat": {
      "0": {
        "id": "0",
        "prompt": "hi there",
        "completion": [{
          "modelId": 18,
          "completion": "Hello! How can I assist you today?",
          "selection": true
        }, {
          "modelId": 19,
          "completion": "Hello there! How can I assist you today?",
          "selection": false
        }],
        "feedback": {
          "18": [{
            "questionId": "d8ba7201-ee5b-4ef8-8e5c-894711b48e2b",
            "type": "Multiple Choice",
            "name": "is_this_response_factually_correct",
            "answer": {"values": "yes"}
          }],
          "19": [{
            "questionId": "d8ba7201-ee5b-4ef8-8e5c-894711b48e2b",
            "type": "Multiple Choice",
            "name": "is_this_response_factually_correct",
            "answer": {"values": "no"}
          }]
        }
      },
      "enableFeedback": true
    },
    "model": "[19,18]"
  }
}
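To work with these results downstream, a short script can reduce each chat turn to the selected model and the feedback answers per model. This is a minimal sketch under assumptions: it takes the feedback for each model ID to be a list of answer objects, identifies turns as entries under "chat" that carry a "prompt", and uses an abbreviated copy of the example result. summarize_chat is a hypothetical helper, not part of the tool.

```python
import json

# Abbreviated copy of the example result (questionId and type omitted).
sample = json.loads("""
{
  "ableToAnnotate": true,
  "annotation": {
    "chat": {
      "0": {
        "id": "0",
        "prompt": "hi there",
        "completion": [
          {"modelId": 18, "completion": "Hello! How can I assist you today?", "selection": true},
          {"modelId": 19, "completion": "Hello there! How can I assist you today?", "selection": false}
        ],
        "feedback": {
          "18": [{"name": "is_this_response_factually_correct", "answer": {"values": "yes"}}],
          "19": [{"name": "is_this_response_factually_correct", "answer": {"values": "no"}}]
        }
      }
    },
    "model": "[19,18]"
  }
}
""")


def summarize_chat(result):
    """Return one summary dict per chat turn (hypothetical helper)."""
    turns = []
    for turn in result["annotation"]["chat"].values():
        # Skip anything under "chat" that is not a turn (assumed: turns have a prompt).
        if not isinstance(turn, dict) or "prompt" not in turn:
            continue
        selected = [c["modelId"] for c in turn.get("completion", []) if c.get("selection")]
        feedback = {
            model_id: {item["name"]: item["answer"]["values"] for item in items}
            for model_id, items in turn.get("feedback", {}).items()
        }
        turns.append({
            "id": turn["id"],
            "prompt": turn["prompt"],
            "selected_model": selected[0] if selected else None,
            "feedback": feedback,
        })
    return turns


for row in summarize_chat(sample):
    print(row["id"], row["selected_model"], row["feedback"])
```

Note that the keys under "feedback" are strings ("18") while "modelId" is a number (18), so convert one of them before joining the two.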