Dedicated Secure Data Access - AWS Integration

Important note: This article contains information regarding our Dedicated product. If you are looking for documentation regarding the cloud data annotation platform, please refer to this Success Center article.

Overview

With Dedicated’s Secure Data Access, your team connects your installation directly to your AWS S3 buckets, and data is uploaded into the platform as S3 file paths rather than URLs.

Your team serves the source data via secure URLs hosted in private buckets inside your cloud storage. The only data passed to Dedicated is the set of URLs for your private bucket, each of which is assigned a unit ID. The corresponding annotations can be downloaded from Dedicated and then associated with the source data via the unit ID.

  • Secure content is rendered through signed URLs 
  • Signed URLs expire immediately after the content is rendered 
  • Your content is never stored or saved within the Dedicated product 
  • Content is rendered only to authenticated contributors and requestors with access to specific tasks.  
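
As an illustration of the unit ID association described above, here is a minimal Python sketch that joins downloaded annotations back to the source data. The column names (unit_id, s3_path, label) and the sample rows are hypothetical; the columns in your actual export may differ.

```python
import csv
import io


def join_by_unit_id(source_rows, annotation_rows, key="unit_id"):
    """Attach each annotation row to its source row using the shared unit ID."""
    source_by_id = {row[key]: row for row in source_rows}
    joined = []
    for annotation in annotation_rows:
        source = source_by_id.get(annotation[key])
        if source is None:
            continue  # skip annotations that have no matching source row
        joined.append({**source, **annotation})
    return joined


# Hypothetical sample data standing in for your source list and annotation export.
source_csv = (
    "unit_id,s3_path\n"
    "101,s3://my-bucket/images/cat.jpg\n"
    "102,s3://my-bucket/images/dog.jpg\n"
)
annotations_csv = "unit_id,label\n101,cat\n102,dog\n"

source_rows = list(csv.DictReader(io.StringIO(source_csv)))
annotation_rows = list(csv.DictReader(io.StringIO(annotations_csv)))
joined = join_by_unit_id(source_rows, annotation_rows)
```

Because only the S3 paths and unit IDs are shared, a join of this kind would run on your side, where the source data lives.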

Note: Secure Data Access is set up at the Organization level and applies to all jobs created within the organization. Full setup of the feature requires the backend work detailed below.

Enabling Secure Data Access in an Organization

  • Locate the organization for which you want to enable Secure Data Access.

  • As an Org Administrator, select Enable Secure Data Access.

  • Secure Data Access is now enabled for the Organization!

S3 Bucket Preparation

  • Within S3, create a new bucket or locate an existing bucket.
    • Ensure the bucket used for the Secure Data Access integration is configured with permissions that allow your Dedicated application to access it.
    • Important note: For image annotation, pixel-level segmentation, and text annotation use cases, the S3 bucket must have CORS configured.

Figure 1. Create new or locate existing S3 bucket
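
The CORS requirement above can be sketched in Python as follows. This article does not specify the exact CORS rules, so the rule contents here are assumptions: read-only methods (GET, HEAD) and a hypothetical origin, https://dedicated.example.com, standing in for the address of your Dedicated installation.

```python
def build_cors_configuration(allowed_origins):
    """Return a CORS configuration dict that allows read-only browser access."""
    if not allowed_origins:
        raise ValueError("at least one allowed origin is required")
    return {
        "CORSRules": [
            {
                "AllowedMethods": ["GET", "HEAD"],  # assumed: read-only access
                "AllowedOrigins": list(allowed_origins),
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    }


# Hypothetical origin; replace with the URL of your Dedicated installation.
cors = build_cors_configuration(["https://dedicated.example.com"])

# With boto3 installed and AWS credentials configured, the dict could be applied with:
#   import boto3
#   boto3.client("s3").put_bucket_cors(Bucket="s3BucketName", CORSConfiguration=cors)
```

The same rules can also be entered in the bucket's CORS settings in the S3 console.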

Create IAM Policy

  • Under Services, navigate to the IAM dashboard, select "Policies" on the left, and then select "Create policy".

Figure 2. Find IAM Dashboard 

Input JSON

  • In the JSON editor, paste the JSON below and replace s3BucketName with the name of each relevant bucket:
{ 
    "Version": "2012-10-17", 
    "Statement": [ 
        { 
            "Sid": "AllowReadOnlyOperations", 
            "Effect": "Allow", 
            "Action": [ 
                "s3:GetObject", 
                "s3:ListBucket" 
            ], 
            "Resource": [ 
                "arn:aws:s3:::s3BucketName", 
                "arn:aws:s3:::s3BucketName/*" 
            ] 
        } 
    ] 
} 

Figure 3. JSON Policy

  • Give the policy a name that you can reference later when creating the IAM role.
  • Click "Create policy" when complete.
    • Note: Additional policies can be created and associated with the IAM role created below to add new buckets to the Secure Data Access integration. To disconnect buckets, remove the relevant policies from the IAM role.
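
If you maintain policies for several buckets, the policy JSON can also be generated rather than edited by hand. The following is a minimal Python sketch that reproduces the policy above for a list of bucket names; the function name and the bucket names bucket-a and bucket-b are hypothetical.

```python
import json


def build_read_only_policy(bucket_names):
    """Build the read-only policy shown above for one or more buckets."""
    resources = []
    for name in bucket_names:
        resources.append(f"arn:aws:s3:::{name}")    # the bucket itself (s3:ListBucket)
        resources.append(f"arn:aws:s3:::{name}/*")  # objects in the bucket (s3:GetObject)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnlyOperations",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": resources,
            }
        ],
    }


# Hypothetical bucket names; replace with your own.
policy = build_read_only_policy(["bucket-a", "bucket-b"])
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM JSON editor in place of the hand-edited version.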

Create IAM Role

  • Back on the main page of IAM, select "Roles" on the left and "Create role".
  • Select AWS service as the type of trusted entity, S3 as the service that will use this role, and S3 as the use case.
  • Move on to "Permissions" when complete.

Figure 4. Create role

Link IAM Role to IAM Policy

  • Under "Attach permissions policies", find and select the IAM policy created in the previous step.

Figure 5. Attach permissions policies

  • Name the IAM role.

Figure 6. Name IAM role

Configuring S3 Bucket Region in Appen

  • By default, the S3 region in Dedicated is set to us-east-1. If your desired S3 bucket is already located within the us-east-1 region, please move on to the next section.
  • If you need to update the default S3 bucket region in Dedicated, please follow the steps below.
    1. As a System Administrator, connect to the cds database.
    2. Once connected, run the following command with your AWS bucket region:
      • UPDATE storage_providers SET region = '<S3 bucket region>';
      • The region should be all lowercase in kebab case format (e.g. eu-west-1).
    3. Secure Data Access is now successfully configured to your desired region.
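
Before running the UPDATE statement, it can help to check that the region string is in the expected format. The sketch below is a hypothetical Python helper: the regular expression only checks the lowercase kebab-case shape (for example eu-west-1) and does not confirm that the region actually exists in AWS.

```python
import re

# Shape check only: two lowercase letters, one or more lowercase words, then a number.
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def build_region_update(region):
    """Return the UPDATE statement for a region, rejecting malformed input."""
    if not REGION_PATTERN.fullmatch(region):
        raise ValueError(f"region must be lowercase kebab case, got: {region!r}")
    # The pattern admits only lowercase letters, digits, and hyphens,
    # so the value is safe to place inside the quoted SQL literal.
    return f"UPDATE storage_providers SET region = '{region}';"


statement = build_region_update("eu-west-1")
print(statement)
```

A value such as EU-West-1 or eu_west_1 is rejected with a ValueError instead of being written to the database.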

Data & Security in Appen

  • Within Dedicated, Secure Data Access leverages an internal bucket and does not need to be configured on this page.
  • Please disregard this section in the Account Page -> Data & Security tab.

Upload Data with Secure Data Access Links

  • To use SDA hosted links, upload a CSV or URLs in the following format:
    • s3://s3BucketName/bucketFilePath/fileName.fileType 
    • Note: ~/fileName.fileType should match the object name in your AWS bucket. A file type extension is not needed if there is no file type extension in your object name.
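
The path format above can be checked before upload. Here is a minimal Python sketch that validates each path and writes a single-column CSV; the column name image_url, the helper names, and the sample path are hypothetical.

```python
import csv
import io
from urllib.parse import urlparse


def is_valid_s3_path(path):
    """Check that a path looks like s3://bucketName/objectKey."""
    parsed = urlparse(path)
    # Require the s3 scheme, a bucket name, and a non-empty object key.
    return parsed.scheme == "s3" and bool(parsed.netloc) and len(parsed.path) > 1


def build_upload_csv(paths, column_name="image_url"):
    """Return CSV text with one S3 path per row under a single header column."""
    invalid = [p for p in paths if not is_valid_s3_path(p)]
    if invalid:
        raise ValueError(f"not valid S3 paths: {invalid}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column_name])
    for path in paths:
        writer.writerow([path])
    return buffer.getvalue()


# Hypothetical path; the object key must match the object name in your bucket.
csv_text = build_upload_csv(["s3://my-bucket/images/cat.jpg"])
```

The check covers the shape of the path only; it does not verify that the object exists in the bucket.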

Finishing Touch in CML

  • As a final step, navigate to your job's Design Page and update your column references in Liquid to use the following format:
    • {{ columnName | secure: 'internal'}} 
    • Important note: When using videos with Secure Data Access make sure to include the following tag in the CML section of your job: preload="auto". 
  • For confirmation, you should be able to view your hosted data within the Preview Page but not outside of Dedicated.

Additional Instructions:

  • You can update or delete an existing storage integration.
    • Modifying AWS Resource Name or AWS Region Name will break the existing integration.
    • All additional Secure Data Access integrations must use buckets within the same region. Multi-region integrations are not currently supported.
  • AWS S3 is currently the only supported storage provider integration.
