
Guide to: Running an AI chat feedback job

Overview

AI Chat Feedback jobs allow you to monitor and evaluate one or more of your LLMs through live conversations with domain experts. Contributors interact with the model(s) in real time and give feedback on the responses each model returns.

 

Screenshot 2023-08-28 at 11.11.15 AM.png

Note:

To set up an AI Chat Feedback job, you will first need to have LLM enabled on your account and be able to configure models. More information can be found in this article.

Building a Job

Once you have models available, you can set up your job using either the code editor or the graphical editor.

The Code Editor

The CML tag for this tool is <cml:ai_chat_feedback/>, as in the sample below.

<cml:ai_chat_feedback label="Provide chat feedback" name="annotation" source-data="{{source_data}}" validates="required" />

The Graphical Editor

Step 1: Click on the AI Chat Feedback icon in the graphical editor:

Screenshot 2023-11-17 at 2.09.15 PM.png

 

Step 2: Select the column from your source data that contains the IDs assigned to your model(s). Note that the source data is an obligatory parameter for this tool; you will need to upload your data and choose the source column before you can continue designing your job.

Step 3: Configure additional settings such as enhanced responses, model response rewrite and/or disable pasting (see below for more information on these options) and click Save & Close.

Step 4: Configure any Smart Validators, such as Regex and/or Spelling & Grammar (see this article for more on Smart Validators).

 

 

Step 5: Click on "Manage Language Models" to enable the models in your Job.

Screenshot 2023-11-17 at 2.16.01 PM.png

Step 6: Select the models you'll be using in your Job and click "Save".

Screenshot 2023-11-17 at 2.16.09 PM.png

Parameters

This tag supports the parameters listed below in standard single-model jobs; source-data is the only required parameter. Additional parameters for QA or review jobs and for multi-model jobs are described after the main list.

  • source-data (required)
    • This is the Model ID that was assigned to the model when you configured it for your Team (see this article), or the column name from your source data that contains the model ID. This can be an array of Model IDs if multiple models are being used: e.g. MODEL_ID or [MODEL_ID, MODEL_ID]. 

  • preamble (optional)
    • This is context that your model(s) will take in to prime the conversation. Contributors will not be able to give feedback or answer any ontology questions directly related to the preamble.

    • For the preamble to function, your model's endpoint must be able to ingest a preamble (or an equivalent context input). If your model supports this input type, navigate to the Models configuration in your Team Account and add something like the following to your input schema (see this article), replacing "preamble_override" with the model-specific parameter expected by the model endpoint.

"preamble_override": "${dynamic_attrs.preamble}"
  • live-preamble (optional)
    • This parameter allows contributors to create their own preamble or context (contributors will not be able to give feedback or answer ontology questions related to the live-preamble). When a contributor loads a task, they will see a pop-up prompting them to supply context or preamble for the model. This context is fed to the model each time a message is sent to it.
    • Syntax: live-preamble='[true]', as in the screenshot below, where the default instruction for contributors is "Please provide the model with the necessary context for the conversation.", or live-preamble="[true,'Your text here']", where you can customize your instruction to the contributors.

  • review-data (optional)
    • This is the column name from your source data that contains any existing chat history output by the tool. Refer to the "Results" section to see the format.
    • You can add review-data by using the column selector in the graphical editor or the CML attribute.
    • The purpose of review-data is to provide contributors (and the model) with any previous turns so that the contributor and model can carry on the interaction. This is an example of what a job with review-data looks like:

single-model-review-data.png

  • seed (optional)
    • This is the column name from your source data that contains a pre-written prompt. When the job is launched, the contributor will see the pre-written message in the box where they would normally type their prompt. They will not be able to edit this message; they can only send it and then proceed to evaluating the response(s).

The next image shows an example where review-data and seed parameters have been used. The greyed-out USER/RESPONSE text reflects the review-data, while the pre-written prompt in the input field "What should I pack?" reflects the seed data.

eb6e77f2-7ee2-41cc-8bb1-66cbaf40de57.png

  • min-turns (optional, defaults to 0)
    • This attribute sets the minimum number of turns a contributor must complete before submitting their judgment. If the contributor has not completed the specified number of turns, they will receive an error message (as in the screenshot below) and their submission will be blocked.
    • Syntax: min-turns="3"

  • max-turns (optional)
    • This attribute sets the maximum number of turns a contributor can complete before submitting their judgment. Once the contributor reaches the specified number of turns, they will be prompted to submit their work (as in the screenshot below) and any further submission will be blocked.
    • Syntax: max-turns="5"

  • rich (optional, defaults to "false" )
    • This attribute enables contributors to use rich text in the main tool input box. Custom response and live preamble inputs always include rich text; this attribute lets you choose whether the main input box does as well.

In QA or review jobs you can optionally also use the following parameters:

  • allow-continue (optional, defaults to "false")
    • In QA jobs, contributors can access previous rounds of contributor/chatbot interaction and give feedback on each individual response based on the configured ontology. This attribute controls whether the QA contributor can also send new messages to the chatbot.
  • show-feedback-answers (optional, defaults to "true")
    • This attribute controls whether QA contributors can see the previous contributor's model response selections and ontology answers.

Finally, when using multiple models (see Model Response Selection, below) there are two additional parameters:

  • enhanced (optional, defaults to "false")
    • In use cases where contributors are asked to choose between responses from multiple models, this attribute allows them to provide additional detail about their choice.
    • Enable through CML or through the graphical editor, by checking "Enhanced Responses" under Custom Configurations.
  • custom-response (optional, defaults to "false")
    • This attribute can be enabled when enhanced="true".
    • It enables contributors to provide their own response to the prompt in cases where the model response(s) are not optimal, so that they do not have to accept mediocre or subpar replies.
    • Where the contributor provides a custom response, this is the response that will be used to continue the interaction.
    • Enable through CML or through the graphical editor, by checking "Model Response Rewrite" under Custom Configurations.
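
As a rough illustration of how the parameters above combine, here is a minimal Python sketch that renders a <cml:ai_chat_feedback/> tag from a dictionary of attributes. The helper name build_chat_feedback_tag is hypothetical and not part of the platform. The custom-response check follows the note above about enhanced="true"; the check that min-turns does not exceed max-turns is an assumption, not something the platform is documented to enforce.

```python
from xml.sax.saxutils import quoteattr


def build_chat_feedback_tag(attrs):
    """Render a <cml:ai_chat_feedback/> tag from a dict of attribute values.

    Hypothetical helper for illustration only.
    """
    # From the parameter list: custom-response is enabled together with enhanced="true".
    if attrs.get("custom-response") == "true" and attrs.get("enhanced") != "true":
        raise ValueError('custom-response="true" requires enhanced="true"')
    # Assumption (not stated in this article): min-turns should not exceed max-turns.
    if "min-turns" in attrs and "max-turns" in attrs:
        if int(attrs["min-turns"]) > int(attrs["max-turns"]):
            raise ValueError("min-turns must not exceed max-turns")
    # quoteattr escapes each value and chooses suitable surrounding quotes.
    rendered = " ".join(
        f"{name}={quoteattr(str(value))}" for name, value in attrs.items()
    )
    return f"<cml:ai_chat_feedback {rendered} />"


tag = build_chat_feedback_tag({
    "label": "Provide chat feedback",
    "name": "annotation",
    "source-data": "{{source_data}}",
    "validates": "required",
    "min-turns": "3",
    "max-turns": "5",
    "enhanced": "true",
    "custom-response": "true",
})
print(tag)
```

The printed tag can be pasted into the code editor; attribute names and values are taken from the parameter descriptions above.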

Response Level Feedback

If you would like contributors to provide feedback on individual model responses, you can configure ontology questions.

Step 1: Go to "Manage AI Chat Feedback Ontology".

bcb6b934-cfb7-4a37-9422-54a037a7206d.png

 

Step 2: Configure your ontology question(s). 

2f64d27a-d1d5-47dd-8ae5-500f6e6f93a0.png

Once an ontology is configured, the tool will require contributors to provide feedback for each response.

Contributors will not be able to submit their judgment until all required questions have been answered.

c34f5ba6-49c2-41c3-9049-858eaabd6d19.png

 

Model Response Selection

If you have set up multiple models within the tool, contributors will be presented with responses from all the configured models. After answering any ontology questions for each response, they must choose which response to continue the conversation with by selecting the radio button located to the left of that response.

If the responses are very similar and the contributor can't choose between them, they can select the "Responses are near identical" button. The tool will then randomly choose a response for them to continue the conversation with.

multi-basic.png

Once the contributor has chosen the response they prefer, if enhanced="true" they will then be asked an additional question about how the responses compare.

MicrosoftTeams-image (10).png

 

 

Upload Data

Upload data into the job as a CSV where each row represents a conversation to be collected. Your .csv must contain at least one column containing the Model ID.
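
A minimal sketch of building such a file with Python's csv module is shown below. The column name source_data is an assumption chosen to match the {{source_data}} placeholder in the CML sample, and the model IDs 18 and 19 are illustrative. The bracketed form for multiple models follows the [MODEL_ID, MODEL_ID] notation described for the source-data parameter; check it against your own job before uploading.

```python
import csv
import io


def model_cell(model_ids):
    """Format one or more model IDs as a single CSV cell value."""
    if len(model_ids) == 1:
        return str(model_ids[0])
    # Multiple models: bracketed list, e.g. [18, 19]
    return "[" + ", ".join(str(m) for m in model_ids) + "]"


# One row per conversation to be collected.
rows = [
    {"source_data": model_cell([18])},
    {"source_data": model_cell([18, 19])},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["source_data"])
writer.writeheader()
writer.writerows(rows)  # cells containing commas are quoted automatically
csv_text = buffer.getvalue()
print(csv_text)
```

To write a file instead of an in-memory string, open it with open("upload.csv", "w", newline="") and pass that handle to csv.DictWriter.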

Results

The results from a <cml:ai_chat_feedback/> tag include the entire chat history (both model and user messages), the model ID(s) used, any answers provided by the contributor if ontology questions were configured, and the model response selection if multiple models were configured. If enhanced="true", there will also be a "confidence" line. For example:

{
  "ableToAnnotate": true,
  "annotation": {
    "chat": {
      "0": {
        "id": "0",
        "prompt": "hi there",
        "completion": [{
          "modelId": 18,
          "completion": "Hello! How can I assist you today?",
          "selection": true,
          "confidence": "much-better"
        }, {
          "modelId": 19,
          "completion": "Hello there! How can I assist you today?",
          "selection": false
        }],
        "feedback": {
          "18": [{
            "questionId": "d8ba7201-ee5b-4ef8-8e5c-894711b48e2b",
            "type": "Multiple Choice",
            "name": "is_this_response_factually_correct",
            "answer": {"values": "yes"}
          }],
          "19": [{
            "questionId": "d8ba7201-ee5b-4ef8-8e5c-894711b48e2b",
            "type": "Multiple Choice",
            "name": "is_this_response_factually_correct",
            "answer": {"values": "no"}
          }]
        }
      },
      "enableFeedback": true
    },
    "model": "[19,18]"
  }
}
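
To show how output of this shape might be consumed, here is a minimal Python sketch that extracts, for each turn, the prompt, the selected model, and the ontology answers per model. The function name summarize_chat is hypothetical. The sketch assumes that feedback for each model ID is a list of question objects (it also accepts a single object) and that non-turn entries such as "enableFeedback" can sit alongside the turns; verify both against your own job output.

```python
import json

# Trimmed sample in the shape of the example above (structure partly assumed).
RESULT_JSON = """
{
  "ableToAnnotate": true,
  "annotation": {
    "chat": {
      "0": {
        "id": "0",
        "prompt": "hi there",
        "completion": [
          {"modelId": 18, "completion": "Hello! How can I assist you today?",
           "selection": true, "confidence": "much-better"},
          {"modelId": 19, "completion": "Hello there! How can I assist you today?",
           "selection": false}
        ],
        "feedback": {
          "18": [{"name": "is_this_response_factually_correct", "answer": {"values": "yes"}}],
          "19": [{"name": "is_this_response_factually_correct", "answer": {"values": "no"}}]
        }
      },
      "enableFeedback": true
    },
    "model": "[19,18]"
  }
}
"""


def summarize_chat(result):
    """Return one summary dict per chat turn."""
    summary = []
    for key, turn in result["annotation"]["chat"].items():
        if not isinstance(turn, dict):
            continue  # skip flags such as "enableFeedback"
        completions = turn.get("completion", [])
        selected = [c["modelId"] for c in completions if c.get("selection")]
        confidence = next(
            (c.get("confidence") for c in completions if c.get("selection")), None
        )
        answers = {}
        for model_id, entries in turn.get("feedback", {}).items():
            if isinstance(entries, dict):  # tolerate a single question object
                entries = [entries]
            answers[model_id] = {e["name"]: e["answer"]["values"] for e in entries}
        summary.append({
            "turn": key,
            "prompt": turn["prompt"],
            "selected": selected,
            "confidence": confidence,
            "answers": answers,
        })
    return summary


result = json.loads(RESULT_JSON)
turns = summarize_chat(result)
# The "model" field is a string holding a JSON list, so it is parsed separately.
model_ids = json.loads(result["annotation"]["model"])
print(turns)
print(model_ids)
```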
