
Machine Learning Assisted Text Utterance Collection - Graphical Editor

This product is designed to help our chatbot and conversational AI customers scale their text utterance collection use cases. The CML tags cml:text and cml:textarea can be used for applications such as transcription and translation.

Note:  Machine Learning Assisted Text Utterance is an add-on feature. Please reach out to your Customer Success Manager if you’re interested in purchasing access.

 

Glossary 

  • Utterance - The piece of text to be collected from the contributor 
  • Coherence - The degree to which a piece of text is logical and consistent
  • Prompt - Data provided to the contributors to give them guidance on what utterances to collect

 

Job Design 


Fig 1. Machine Learning Assisted Text Validators

 

Smart Validators 

Language Detector 

  • This validator is used to ensure contributors are submitting text in the correct language. Learn more about our Language Detector model here.
  • The currently supported languages are:
    • English
    • German
    • French
    • Spanish
    • Japanese
    • Portuguese
    • Italian
    • Dutch
  • You will need to provide a threshold that will be used to evaluate each contributor's submission. The lower the threshold, the more lenient the evaluation will be.
  • Note: you may only validate for one target language per field.  


Fig 2. Language Detection Validator
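
To illustrate how the threshold affects leniency, here is a minimal sketch. It assumes the detector returns a score between 0 and 1 for each candidate language; the function name, the language codes, and the scores are hypothetical and are not the product's actual API.

```python
from typing import Dict


def passes_language_check(scores: Dict[str, float], target: str, threshold: float) -> bool:
    """Accept a submission when the target language's score meets the threshold.

    `scores` stands in for the detector's output (an assumption for this sketch).
    Only one target language is checked, matching the one-language-per-field rule.
    """
    return scores.get(target, 0.0) >= threshold


# Hypothetical detector output for a single submission
scores = {"en": 0.72, "nl": 0.20, "de": 0.08}

strict = passes_language_check(scores, "en", threshold=0.9)   # high threshold: rejected
lenient = passes_language_check(scores, "en", threshold=0.6)  # lower threshold: accepted
```

With the same scores, lowering the threshold from 0.9 to 0.6 turns a rejection into an acceptance, which is the sense in which a lower threshold is more lenient.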

 

Coherence Detector 

  • This validator is used to ensure contributors are submitting text that is cohesive and coherent. The model auto-detects the language they are typing in and then evaluates the probability that what they’re typing is valid text in that language.  Learn more about our Coherence Detector model here.
  • This validator works best on text longer than 10 words.
  • The currently supported languages are:
    • English
    • German
    • French
    • Spanish
    • Japanese
    • Portuguese 
    • Italian
  • Note: You can use this in conjunction with other validators, but you may only set one threshold per individual field.


Fig 3. Coherence Detection Validator
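
The sketch below shows one way to reason about the coherence threshold together with the 10-word guidance. It assumes the model yields a single probability per submission; the names check_coherence and CoherenceResult are hypothetical, and the word count uses whitespace splitting, which is only a rough proxy and does not suit languages such as Japanese.

```python
from dataclasses import dataclass

# The article notes the validator works best on text longer than 10 words.
MIN_RELIABLE_WORDS = 10


@dataclass
class CoherenceResult:
    accepted: bool  # probability met the configured threshold
    reliable: bool  # False when the text is short, so the score deserves less trust


def check_coherence(text: str, probability: float, threshold: float) -> CoherenceResult:
    """Compare a (hypothetical) coherence probability against the field's threshold."""
    word_count = len(text.split())  # whitespace split: an approximation only
    return CoherenceResult(
        accepted=probability >= threshold,
        reliable=word_count > MIN_RELIABLE_WORDS,
    )


short = check_coherence("Book a table", probability=0.55, threshold=0.5)
longer = check_coherence(
    "I would like to book a table for four people tomorrow evening at seven",
    probability=0.91,
    threshold=0.5,
)
```

Both submissions pass the threshold here, but only the longer one has enough words for the score to be treated as dependable.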

Duplicate Detection

The following validators give you the option of enforcing only unique submissions of text. This is helpful if you need many diverse examples of the same utterance.

  • In this job, across all contributors
    • If your job is collecting many judgments per prompt, you'll want to use this option
  • In this job, across all contributors, across a unique prompt value
    • If your job is collecting utterances for multiple prompts, you can use this validator to specify the column to validate against. We will enforce unique utterances within each row, but the same utterance may still appear under different rows.
  • In this job, across the unique contributor's submissions
    • This validator ensures contributors do not submit duplicate answers within a job. You can use this if you’d like to get a sense for the variance and frequency of utterances. 
  • In multiple jobs, across all contributors
    • This validator allows you to compare new submissions against data collected in one or more completed jobs. You can use this validator if you want to collect additional unique references.
    • Note: You must input the job IDs and the CML value that match the previous jobs. Best practice is to copy the job with rows and order additional judgments without changing the CML.



Fig 4. Duplicate Detection Validator
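
The four scopes differ only in what counts as "the same place" for an utterance. The sketch below models each scope as a lookup key. It is an illustration under stated assumptions, not the product's implementation: the Submission fields, the scope names, and the case and whitespace normalization are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Submission(NamedTuple):
    job_id: int
    contributor_id: str
    prompt: str  # value of the prompt column for the row
    text: str    # the utterance the contributor typed


def normalize(text: str) -> str:
    # Assumption: duplicates are compared ignoring case and extra whitespace.
    return " ".join(text.lower().split())


def dedup_key(sub: Submission, scope: str) -> Tuple:
    text = normalize(sub.text)
    if scope == "job":          # in this job, across all contributors
        return (sub.job_id, text)
    if scope == "job_prompt":   # ...across a unique prompt value
        return (sub.job_id, sub.prompt, text)
    if scope == "contributor":  # ...across one contributor's submissions
        return (sub.job_id, sub.contributor_id, text)
    if scope == "multi_job":    # across jobs; `seen` holds data from the listed job IDs
        return (text,)
    raise ValueError(f"unknown scope: {scope}")


def is_duplicate(sub: Submission, seen: Set[Tuple], scope: str) -> bool:
    """Return True if the key was already seen; otherwise record it."""
    key = dedup_key(sub, scope)
    if key in seen:
        return True
    seen.add(key)
    return False


a = Submission(1, "c1", "order pizza", "I want a pizza")
b = Submission(1, "c2", "order food", "i want  a pizza")

seen_job: Set[Tuple] = set()
job_results = [is_duplicate(a, seen_job, "job"), is_duplicate(b, seen_job, "job")]

seen_prompt: Set[Tuple] = set()
prompt_results = [is_duplicate(a, seen_prompt, "job_prompt"), is_duplicate(b, seen_prompt, "job_prompt")]
```

Under the job-wide scope the second submission is rejected as a duplicate, while under the per-prompt scope both are accepted because they belong to different prompt values.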

 

For more information on the model, check out this article.

