This article applies to anyone using annotation tools, whether or not SDA has been set up for your team.
Output Columns
The output column of an annotation tool contains JSON objects that hold the result URL as well as a few tool-specific metadata fields. Below is an example of the updated output:
{
  "type": "shapes",
  "valueRef": "jobs/2129786/annotations/040ef579-b460-41bf-938c-8b20f9111f77.json",
  "metadata": { "number_of_shapes": 4 },
  "url": "https://requestor-proxy.appen.com/v1/redeem_token?token=+5ipEHeSkd4NNl0tVZoXVj820j+JExjuUiH7uc7nv8UNltxjGvv+oOfU+gNOKAUr9BOjvuR1Cmfw9xqTiYTrbRRRkQkR8p1I3YV7Body3f9Bj9DeF4SX2busY9F1CeswmyYbM/J6cyF5hddhAMJeZtLF0Uy+yDzmOxPgpWpkvWDo6d8wjg71543BZmu/KqDuQ9AaEb5VYUn8xOh+XJBfNWyfcmY+gafDi7nO5dzY9DBEjIEjM7knt1FSAERPbEvAxUevP1Z9T3egRnp6OJytX9rO/8oty8JCAeH/NTXmDzFiEJTJ5Agq8oX5wSq6YaM1gB4AieWPDkSmdiVK+w==--lfj3HKqXERhEosWl--J+rdIzb4LRCXUbqeQFzSAw==&version=1025",
  "urlExpiresAt": "2023-04-24T19:40:08Z"
}
As mentioned above, the JSON output can also contain a number of metadata fields, depending on which annotation tool(s) you are using. The example above is a Shapes tool output whose metadata object has the key number_of_shapes, which is the total number of shape objects annotated with the tool for that row.
The updated JSON output now includes the urlExpiresAt key. Job result links expire 15 days after generation; to receive non-expired result links, regenerate the result report and download the job report again. You can regenerate and download the report through the API, as described in the developer docs, or from the UI.
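As a minimal sketch of how a script might use the urlExpiresAt key, the Python below parses one output cell and checks whether its link has expired. It assumes the cell holds valid JSON and that the timestamp is in UTC with the same format as the example above; the helper name url_is_expired and the shortened cell value are hypothetical.

```python
import json
from datetime import datetime, timezone

def url_is_expired(result, now=None):
    """Return True if the result's urlExpiresAt timestamp has passed."""
    # Assumption: timestamps look like 2023-04-24T19:40:08Z and are in UTC.
    expires_at = datetime.strptime(
        result["urlExpiresAt"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now >= expires_at

# A shortened, hypothetical cell value from the output column.
cell = (
    '{"type": "shapes", "metadata": {"number_of_shapes": 4}, '
    '"urlExpiresAt": "2023-04-24T19:40:08Z"}'
)
result = json.loads(cell)
print(result["metadata"]["number_of_shapes"], url_is_expired(result))
```

If the check returns True, the link needs to be regenerated before the annotation file can be downloaded.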
Authentication to Access URLs
URLs in the report require authentication.
If you have scripts that operate on the output URLs, see the example below.
To call a URL, first replace the requestor-proxy hostname with the api-gateway hostname (api-beta.appen.com in this example). For a report URL such as:
https://requestor-proxy.appen.com/v1/redeem_token?token=gSz/nU4ULsQjpetkA5l5V1ZXgt04exBEVr2DsgiqIsyvAd1QvnkiG9MQVhQXgGZmEpSS1E0EJd17QB4DNju1w8js45FOZq2xiV8zPdmKkgdjphrDRl3E8topXMp0PydXUkV32zre1HJnp4dN0M8ZyDu5V+hlVJ2wbuz83qh3+raRWU510imK53HVn/RTlplqld4lDdlWeR8aMljf/88RQpvuFhJk6ckixhZclOSi21Iq5POEn1xeJKcxzjY4TzCpQvFQYEGsKS4VEq+J9eVUxvDRUHoZxDRwv02Nc9Jg/ggbXEH7sxXPnA6zf3RVYlb8ZQ==--s5tHd/9d1AG1dASD--FtLuDsqPS8pa2CH7/p66Bg==&version=285
Once the hostname is replaced, append your API key as the key query parameter:
https://api-beta.appen.com/v1/redeem_token?token=gSz/nU4ULsQjpetkA5l5V1ZXgt04exBEVr2DsgiqIsyvAd1QvnkiG9MQVhQXgGZmEpSS1E0EJd17QB4DNju1w8js45FOZq2xiV8zPdmKkgdjphrDRl3E8topXMp0PydXUkV32zre1HJnp4dN0M8ZyDu5V+hlVJ2wbuz83qh3+raRWU510imK53HVn/RTlplqld4lDdlWeR8aMljf/88RQpvuFhJk6ckixhZclOSi21Iq5POEn1xeJKcxzjY4TzCpQvFQYEGsKS4VEq+J9eVUxvDRUHoZxDRwv02Nc9Jg/ggbXEH7sxXPnA6zf3RVYlb8ZQ==--s5tHd/9d1AG1dASD--FtLuDsqPS8pa2CH7/p66Bg==&version=285&key=YOUR_API_KEY
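The two URL changes above could be scripted roughly as follows. This is a sketch only: the function name authenticated_url is hypothetical, and the default gateway hostname is taken from the example URL rather than from any documented setting. The query string is left untouched because the token contains characters such as + and / that would change if it were parsed and re-encoded.

```python
from urllib.parse import urlsplit, urlunsplit

def authenticated_url(report_url, api_key, gateway_host="api-beta.appen.com"):
    """Swap the hostname of a report URL and append the API key."""
    parts = urlsplit(report_url)
    # Keep the original query string as-is and only add the key parameter.
    query = parts.query + "&key=" + api_key
    return urlunsplit(
        (parts.scheme, gateway_host, parts.path, query, parts.fragment)
    )

# Hypothetical, shortened report URL for illustration.
print(authenticated_url(
    "https://requestor-proxy.appen.com/v1/redeem_token?token=ab+c/d==--e&version=285",
    "YOUR_API_KEY",
))
```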
Using ref_to_url to translate CDS refs into URLs
If units are stored in CDS but the job design requires a URL, you can use a Liquid filter to translate the CDS reference into a URL. The expression {{ data_column | ref_to_url }} turns a CDS ref from data_column into a short-lived S3 link. You can also use the /link endpoint in CDS to generate short-lived S3 links.
Example:
Unit data
script,audio_file
"Hey Siri, what time is it?","{""type"":""file_upload"",""valueRef"":""/jobs/123456/annotations/00112233-4455-6677-8899-aabbccddeeff00.wav""}"
CML code
<audio src="{{ audio_file | ref_to_url }}" />
HTML on the assignment page
<audio src="https://s3.amazon.com/cds/teams/00112233-4455-6677-8899-aabbccddeeff00/jobs/123456/annotations/00112233-4455-6677-8899-aabbccddeeff00.wav?sig=xyz" />
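The audio_file cell in the unit data above holds a JSON object inside a CSV field, so its inner quotes have to be escaped. One way to avoid hand-escaping is to let Python's csv module write the file, as in this sketch. The helper name unit_rows_to_csv is hypothetical, and the cell layout (type and valueRef keys) is assumed from the example rather than from a documented schema.

```python
import csv
import io
import json

def unit_rows_to_csv(rows):
    """Write (script, value_ref) pairs as unit data with a CDS ref column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["script", "audio_file"])
    for script, value_ref in rows:
        # The CDS ref is stored as a JSON object in a single CSV cell;
        # csv.writer doubles the inner quotes automatically.
        ref = json.dumps({"type": "file_upload", "valueRef": value_ref})
        writer.writerow([script, ref])
    return buffer.getvalue()

print(unit_rows_to_csv([
    ("Hey Siri, what time is it?",
     "/jobs/123456/annotations/00112233-4455-6677-8899-aabbccddeeff00.wav"),
]))
```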
Regenerating Expired Result Links
As mentioned above, job result URLs for annotation tools expire 15 days after generation. The steps below are an alternative way to regenerate expired links.
- Retrieve the valueRef from the expired JSON result and the team ID of the job the results came from. You can get your team ID by clicking “Team Jobs” in your account and copying the ID in the URL after https://client.appen.com/jobs?scope=
- Retrieve your account API key.
- Use 3600 for expiration_ms; this value sets how long the new token is valid.
- Create the following link:
"https://api-beta.appen.com/v1/request_token?team_id=" + team_id + "&path=" + value_ref + "&expires_in=" + expiration_ms + "&key=" + api_key
- Send a request to this link to get a new token.
- Append the returned token to the redeem_token endpoint to form the new result link:
new_link = "https://api-beta.appen.com/v1/redeem_token?" + new_token
Python:
import requests

def regenerate_url(expired_result, team_id, api_key, expiration_ms="3600"):
    # Path of the stored annotation, taken from the expired result JSON
    value_ref = expired_result['valueRef']
    # Build the request_token link for this team and path
    end_value = team_id + "&path=" + value_ref + "&expires_in=" + expiration_ms
    anno = "https://api-beta.appen.com/v1/request_token?team_id=" + end_value
    anno = anno + "&key=" + api_key
    response = requests.get(anno)
    response.raise_for_status()
    # The response body has the form "token=...&version=..."
    new_token = response.text.strip()
    new_link = "https://api-beta.appen.com/v1/redeem_token?" + new_token
    return new_link
Alternative with curl:
curl -H "Authorization: Token token=MYTOKEN" "https://api-beta.appen.com/v1/request_token?team_id=MYTEAMID&path={valueRef}&expires_in=3600"
Output:
token=JIA8XvqULSmgqR+pmABQ7hnWUO31u2FdLd5hXuT/BysnVFQfZAXOAq1vUMjUoQXzUFwtelMtMnH/SYMLgC0pR/yyf5Ti7ysBrH2sMpkZCt1U7ZZ2sJllrykjwA4LiQcH0zJq+Cki0I0sZpyneje5E/HhjYLdbN2P41PW0N/UxKW5XH9kKdU51bhlbfVhV3rCiS09JmKgMfMShOsxrYos3a3hk/U+dUEU7B+iejmAMHOPudZ6yw==--XWCIICf3jSKcTu47--tpT8DMuAagAnkn2/cyRzgQ==&version=590
You can then pass that output to the redeem_token endpoint to get the file:
curl -H "Authorization: Token token=MYTOKEN" "https://api-beta.appen.com/v1/redeem_token?token=JIA8XvqULSmgqR+pmABQ7hnWUO31u2FdLd5hXuT/BysnVFQfZAXOAq1vUMjUoQXzUFwtelMtMnH/SYMLgC0pR/yyf5Ti7ysBrH2sMpkZCt1U7ZZ2sJllrykjwA4LiQcH0zJq+Cki0I0sZpyneje5E/HhjYLdbN2P41PW0N/UxKW5XH9kKdU51bhlbfVhV3rCiS09JmKgMfMShOsxrYos3a3hk/U+dUEU7B+iejmAMHOPudZ6yw==--XWCIICf3jSKcTu47--tpT8DMuAagAnkn2/cyRzgQ==&version=590"
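For scripts that prefer the Authorization header used in the curl commands over the key query parameter, the same two requests could be prepared in Python along these lines. This is a sketch, not a reference client: the helper names token_request and redeem_url are hypothetical, and auth_token stands for whatever value the curl example calls MYTOKEN.

```python
from urllib.parse import urlencode

API_BASE = "https://api-beta.appen.com/v1"

def token_request(team_id, value_ref, auth_token, expires_in=3600):
    """Build the request_token URL and headers used in the curl example."""
    # safe="/" keeps the slashes in the valueRef path readable.
    query = urlencode(
        {"team_id": team_id, "path": value_ref, "expires_in": expires_in},
        safe="/",
    )
    headers = {"Authorization": "Token token=" + auth_token}
    return API_BASE + "/request_token?" + query, headers

def redeem_url(token_response):
    """Turn the request_token output into a redeem_token URL."""
    # token_response has the form "token=...&version=..."
    return API_BASE + "/redeem_token?" + token_response.strip()

url, headers = token_request("MYTEAMID", "jobs/1/annotations/a.json", "MYTOKEN")
print(url)
print(redeem_url("token=abc&version=590"))
```

Both URLs can then be fetched with any HTTP client that lets you set request headers, for example requests.get(url, headers=headers).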
Benefits of using CDS (Customer Data Service):
- Better Integration: CDS URLs are designed to work with the output data of all annotation tools, which makes it easier for tools to exchange data with each other.
- Secured: CDS URLs are built with an expiration time, which limits how long a shared link can be used.
- Flexible: CDS URLs are a flexible way to store large annotation data securely. A CDS URL can reference any sort of data, not only JSON.
- Access control: Data in CDS has a 1:1 relationship with a Team ID. Only jobs under a permissible Team ID are authorized to access data in CDS using a short-lived link.
- No Limit: There is no restriction on the size of annotation data.