
Annotation Tools Job Report

For tools such as ai_chat_feedback, smart_text, text_annotation, audio_tx, and shapes, results are delivered through CDS as a secure link to a JSON file containing the annotations. See this article if secure data access (SDA) has also been set up for your team.

By default, the column in the downloaded report that contains each annotation tool's output is named "annotation" (except for audio_tx, where it is named "audio transcription"). The job's designer can change this name: the output column name matches the value of the tool's name attribute, which is visible and editable in the code editor on the job's design page.

<cml:ai_chat_feedback label="Provide chat feedback" name="annotation" source-data="{{source_data}}" validates="required" />

If you build a job with multiple tools, or multiple instances of the same tool, using the graphical editor or by adding CML snippets in the code editor, a number is appended to "annotation" by default for each instance, e.g. "annotation_1", "annotation_2", etc. If you use the tool inside a Quality Flow project, the output column names for the most recent version of the output are prefixed with "latest.judgment." in the dataset report.
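As a rough illustration of the default naming described above, the following Python sketch picks out columns that follow the default pattern from a list of report headers. The function name and the sample headers are hypothetical, and the pattern only applies if the job's designer has not renamed the columns.

```python
import re

# Assumed default naming: "annotation", "annotation_1", "annotation_2", ...
# optionally prefixed with "latest.judgment." in Quality Flow dataset reports.
# Custom column names chosen by the job's designer will not match.
_DEFAULT_COLUMN = re.compile(r"^(latest\.judgment\.)?annotation(_\d+)?$")


def find_annotation_columns(headers):
    """Return the headers that look like default annotation output columns."""
    return [h for h in headers if _DEFAULT_COLUMN.match(h)]


# Hypothetical header row for demonstration only.
sample_headers = ["unit_id", "annotation_1", "annotation_2", "notes"]
print(find_annotation_columns(sample_headers))
```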

The output columns contain JSON objects that hold the URLs along with a few tool-specific metadata fields; see each tool's guide page for details. Below is an example of the shapes output. Its metadata object contains the key number_of_shapes, which is the total number of shape objects annotated with the tool for that row.

{
type: "shapes",
valueRef:
"jobs/2129786/annotations/040ef579-b460-41bf-938c-8b20f9111f77.json",
metadata: { number_of_shapes: 4 },
url: "https://requestor-proxy.appen.com/v1/redeem_token?token=+5ipEHeSkd4NNl0tVZoXVj820j+JExjuUiH7uc7nv8UNltxjGvv+oOfU+gNOKAUr9BOjvuR1Cmfw9xqTiYTrbRRRkQkR8p1I3YV7Body3f9Bj9DeF4SX2busY9F1CeswmyYbM/J6cyF5hddhAMJeZtLF0Uy+yDzmOxPgpWpkvWDo6d8wjg71543BZmu/KqDuQ9AaEb5VYUn8xOh+XJBfNWyfcmY+gafDi7nO5dzY9DBEjIEjM7knt1FSAERPbEvAxUevP1Z9T3egRnp6OJytX9rO/8oty8JCAeH/NTXmDzFiEJTJ5Agq8oX5wSq6YaM1gB4AieWPDkSmdiVK+w==--lfj3HKqXERhEosWl--J+rdIzb4LRCXUbqeQFzSAw==&version=1025",
urlExpiresAt: "2023-04-24T19:40:08Z",
} 
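If you process the report with a script, a minimal Python sketch for reading these objects might look like the following. It assumes the report is a CSV file and that each cell in the output column holds valid JSON text with the keys shown above; the function name is hypothetical, and the token in the sample URL is shortened.

```python
import csv
import io
import json


def parse_annotation_cells(report_csv_text, column="annotation"):
    """Yield the decoded JSON object from `column` for each non-empty row."""
    reader = csv.DictReader(io.StringIO(report_csv_text))
    for row in reader:
        cell = row.get(column)
        if cell:
            yield json.loads(cell)


# Build a tiny in-memory report so the example runs without a real download.
sample_cell = json.dumps({
    "type": "shapes",
    "valueRef": "jobs/2129786/annotations/040ef579-b460-41bf-938c-8b20f9111f77.json",
    "metadata": {"number_of_shapes": 4},
    "url": "https://requestor-proxy.appen.com/v1/redeem_token?token=...",
    "urlExpiresAt": "2023-04-24T19:40:08Z",
})
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["annotation"])
writer.writerow([sample_cell])

for annotation in parse_annotation_cells(buffer.getvalue()):
    print(annotation["type"], annotation["metadata"]["number_of_shapes"])
```

If the output column has been renamed, pass that name as the column argument.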

 

URL Expiry

The job result links expire 15 days after generation; the urlExpiresAt key in the JSON output tells you when a link will expire. If your links expire, regenerate and download the job report again to get updated links. You can do this either through the API (see the developer docs) or through the UI.
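A script can check urlExpiresAt before trying to use a link. The sketch below assumes the timestamp always has the UTC format shown in the example output ("2023-04-24T19:40:08Z"); the function name is hypothetical.

```python
from datetime import datetime, timezone


def link_is_expired(annotation, now=None):
    """Return True if the annotation's urlExpiresAt time has passed.

    Assumes the timestamp is UTC and formatted like "2023-04-24T19:40:08Z".
    """
    expires_at = datetime.strptime(
        annotation["urlExpiresAt"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at


example = {"urlExpiresAt": "2023-04-24T19:40:08Z"}
if link_is_expired(example):
    print("Link expired; regenerate the report to get a fresh link.")
```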

Authentication to Access URLs

URLs in the report require authentication. 

If you have scripts that operate on the output URLs, follow the example below.

To call a URL, first replace the requestor-proxy hostname with the API gateway hostname. For example, start from a report URL such as:

https://requestor-proxy.appen.com/v1/redeem_token?token=gSz/nU4ULsQjpetkA5l5V1ZXgt04exBEVr2DsgiqIsyvAd1QvnkiG9MQVhQXgGZmEpSS1E0EJd17QB4DNju1w8js45FOZq2xiV8zPdmKkgdjphrDRl3E8topXMp0PydXUkV32zre1HJnp4dN0M8ZyDu5V+hlVJ2wbuz83qh3+raRWU510imK53HVn/RTlplqld4lDdlWeR8aMljf/88RQpvuFhJk6ckixhZclOSi21Iq5POEn1xeJKcxzjY4TzCpQvFQYEGsKS4VEq+J9eVUxvDRUHoZxDRwv02Nc9Jg/ggbXEH7sxXPnA6zf3RVYlb8ZQ==--s5tHd/9d1AG1dASD--FtLuDsqPS8pa2CH7/p66Bg==&version=285
 

Once the hostname is replaced (api-beta.appen.com in this example), append your API key as the key parameter at the end:

https://api-beta.appen.com/v1/redeem_token?token=gSz/nU4ULsQjpetkA5l5V1ZXgt04exBEVr2DsgiqIsyvAd1QvnkiG9MQVhQXgGZmEpSS1E0EJd17QB4DNju1w8js45FOZq2xiV8zPdmKkgdjphrDRl3E8topXMp0PydXUkV32zre1HJnp4dN0M8ZyDu5V+hlVJ2wbuz83qh3+raRWU510imK53HVn/RTlplqld4lDdlWeR8aMljf/88RQpvuFhJk6ckixhZclOSi21Iq5POEn1xeJKcxzjY4TzCpQvFQYEGsKS4VEq+J9eVUxvDRUHoZxDRwv02Nc9Jg/ggbXEH7sxXPnA6zf3RVYlb8ZQ==--s5tHd/9d1AG1dASD--FtLuDsqPS8pa2CH7/p66Bg==&version=285&key=YOUR_API_KEY
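The two manual steps above (swapping the hostname and appending the key) can be sketched in Python as follows. The function name is hypothetical, and the default hostname is taken from the example URL above, so adjust it if your account uses a different one. The query string is left untouched so the token is passed exactly as it appears in the report.

```python
from urllib.parse import urlsplit


def to_authenticated_url(report_url, api_key, api_host="api-beta.appen.com"):
    """Swap the hostname of a report URL and append the API key.

    The existing query string (including the token) is not re-encoded.
    """
    parts = urlsplit(report_url)
    swapped = parts._replace(netloc=api_host).geturl()
    separator = "&" if parts.query else "?"
    return f"{swapped}{separator}key={api_key}"


# Shortened, made-up token for demonstration only.
url = "https://requestor-proxy.appen.com/v1/redeem_token?token=abc+d/e==&version=285"
print(to_authenticated_url(url, "YOUR_API_KEY"))
```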
 

Using ref_to_url to translate CDS refs into URLs 

If units are stored in CDS but the job design requires a URL (for example, a second job needs to reference the file for quality assurance or to collect additional annotations), you can use a Liquid filter to translate the CDS reference into a URL. The expression {{ data_column | ref_to_url }} turns the CDS ref in data_column into a short-lived S3 link. You can also use the /link endpoint in CDS to generate short-lived S3 links.

Example: 

Unit data 

script,audio_file 
"Hey Siri, what time is it?", ""{ "type":"file_upload",
"valueRef":"/jobs/123456/annotations/00112233-4455-6677-8899-aabbccddeeff00.wav"
}""
 

CML code 

<audio src="{{ audio_file | ref_to_url }}" /> 

HTML on the assignment page 

<audio src="https://s3.amazon.com/cds/teams/00112233-4455-6677-8899-aabbccddeeff00/jobs/123456/annotations/00112233-4455-6677-8899-aabbccddeeff00.wav?sig=xyz" />

For example, in Job 1 you can use the file upload tool on its own, or in combination with other CML elements, to let contributors upload a file (in this case an image) named "agent_image":

 <cml:file_upload allowed-extensions="['JPEG','PNG', 'JPG']" min-size="0.01" max-size="5" name="agent_image" label="Upload an Image:" validates="required"/> 

The second job can then refer to this image by passing the value of "agent_image" through "ref_to_url":

<img src="{{agent_image['value'] | ref_to_url}}" width="100%"/>

This will display the uploaded image, and you can add radio, text or checkbox elements to collect additional information or have contributors perform rating tasks.

Regenerating Expired Result Links

As mentioned above, job result URLs for annotation tools expire 15 days after generation. The steps below are an alternative way to regenerate expired links.

  1. Retrieve the valueRef from the expired JSON result and the team ID of the job the results came from. You can get your team ID by clicking “Team Jobs” in your account and copying the id in the URL after https://client.appen.com/jobs?scope=

  2. Retrieve your account API key.

  3. Use 3600 for expiration_ms; this value determines how long the new token is valid.

  4. Build the request link by concatenating the following parts:

    1. "https://api-beta.appen.com/v1/request_token?team_id=" +

    2. team_id +

    3. "&path=" +

    4. value_ref +

    5. "&expires_in=" +

    6. expiration_ms +

    7. "&key=" + api_key

  5. Send a GET request to this link to get a new token.

  6. Build the new result link: new_link = "https://api-beta.appen.com/v1/redeem_token?" + new_token

Python:

import requests

def regenerate_url(expired_result, team_id, api_key, expiration_ms=3600):
    # valueRef identifies the stored annotation file in CDS
    value_ref = expired_result['valueRef']
    # Build the request_token link; expiration_ms sets the new token's lifetime
    end_value = str(team_id) + "&path=" + value_ref + "&expires_in=" + str(expiration_ms)
    anno = "https://api-beta.appen.com/v1/request_token?team_id=" + end_value
    anno = anno + "&key=" + api_key
    # The response body is the new token string ("token=...&version=...")
    response = requests.get(anno)
    new_token = response.text.strip()
    new_link = "https://api-beta.appen.com/v1/redeem_token?" + new_token
    return new_link

Alternative with curl:

curl -H "Authorization: Token token=MYTOKEN" "https://api-beta.appen.com/v1/request_token?team_id=MYTEAMID&path={valueRef}&expires_in=3600"

Output:

token=JIA8XvqULSmgqR+pmABQ7hnWUO31u2FdLd5hXuT/BysnVFQfZAXOAq1vUMjUoQXzUFwtelMtMnH/SYMLgC0pR/yyf5Ti7ysBrH2sMpkZCt1U7ZZ2sJllrykjwA4LiQcH0zJq+Cki0I0sZpyneje5E/HhjYLdbN2P41PW0N/UxKW5XH9kKdU51bhlbfVhV3rCiS09JmKgMfMShOsxrYos3a3hk/U+dUEU7B+iejmAMHOPudZ6yw==--XWCIICf3jSKcTu47--tpT8DMuAagAnkn2/cyRzgQ==&version=590

Then you can pass that output to the redeem_token endpoint to get the file:

curl -H "Authorization: Token token=MYTOKEN" "https://api-beta.appen.com/v1/redeem_token?token=JIA8XvqULSmgqR+pmABQ7hnWUO31u2FdLd5hXuT/BysnVFQfZAXOAq1vUMjUoQXzUFwtelMtMnH/SYMLgC0pR/yyf5Ti7ysBrH2sMpkZCt1U7ZZ2sJllrykjwA4LiQcH0zJq+Cki0I0sZpyneje5E/HhjYLdbN2P41PW0N/UxKW5XH9kKdU51bhlbfVhV3rCiS09JmKgMfMShOsxrYos3a3hk/U+dUEU7B+iejmAMHOPudZ6yw==--XWCIICf3jSKcTu47--tpT8DMuAagAnkn2/cyRzgQ==&version=590"
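For scripts that prefer header-based authentication, the two curl calls can be approximated in Python with the standard library. This is a sketch under the assumptions visible in the curl examples (the api-beta.appen.com host, the Authorization header format, and a response body of the form "token=...&version=..."); the function names are hypothetical, and the code has not been verified against the live service.

```python
import urllib.request

# Host taken from the curl examples; adjust if your account differs.
API_BASE = "https://api-beta.appen.com/v1"


def auth_headers(api_key):
    """Build the Authorization header used in the curl examples."""
    return {"Authorization": f"Token token={api_key}"}


def request_token_url(team_id, value_ref, expires_in=3600):
    """Build the request_token link for a given valueRef."""
    return (
        f"{API_BASE}/request_token?team_id={team_id}"
        f"&path={value_ref}&expires_in={expires_in}"
    )


def redeem_token_url(token_response):
    """Build the redeem_token link from the raw request_token output."""
    return f"{API_BASE}/redeem_token?{token_response.strip()}"


def _get(url, api_key):
    """Send an authenticated GET request and return the raw response body."""
    request = urllib.request.Request(url, headers=auth_headers(api_key))
    with urllib.request.urlopen(request) as response:
        return response.read()


def download_annotation(team_id, value_ref, api_key):
    """Request a fresh token, then redeem it to fetch the annotation file."""
    token_response = _get(request_token_url(team_id, value_ref), api_key)
    return _get(redeem_token_url(token_response.decode("utf-8")), api_key)
```

Calling download_annotation("MYTEAMID", value_ref, "MYTOKEN") would return the file contents as bytes, which you can decode and parse as JSON for annotation results.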

 

Benefits of using CDS (Customer Data Service):

  • Better Integration: CDS URLs are designed to work with the output data of all annotation tools, making it easier to integrate data from different tools.

  • Secure: CDS URLs have a built-in expiration time for added security.

  • Flexible: A CDS URL is a flexible way to store sizable annotation data securely. Any kind of data, not only JSON, can be referenced by a CDS URL.

  • Access control: Data in CDS has a 1:1 relationship with a team ID. Only jobs under a permitted team ID are authorized to access data in CDS using a short-lived link.

  • No Limit: There is no restriction on the size of annotation data.
