Data Quality Checks and Processes

Validating uploaded Data

For security reasons and to prevent tampering, data files ingested through S3 buckets are moved to a different location immediately after upload. Once a data file has been uploaded successfully to the ingest bucket using the above step, it is moved to the raw submissions bucket, also called the “Data Lake.”

Background processes move the data from the ingest bucket to the raw submissions bucket, into a folder labelled with the date of upload. As a data provider, you have a folder in the data lake that contains all of the data you upload.

Local Object >> Ingest Bucket >> Raw Submissions Bucket

To verify an upload, run the AWS CLI command below against the raw submissions bucket to list the objects stored there. The raw submissions bucket name is provided in the table below the command. The “project name” and “data provider name” were provided in the welcome email.

AWS CLI Command:

>aws s3 ls s3://<raw submissions bucket name>/<project name>/<data provider name>/<data type name>/ --profile sdc-token
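
If you check uploads often, the command above can be wrapped in a small shell helper. The following is a minimal sketch, not an official script: the function names `raw_submissions_uri` and `list_uploads` are hypothetical, and every argument is a placeholder for the values described above. It also assumes that the date-labelled folders sit beneath the data type prefix, which is why it adds the `--recursive` option of `aws s3 ls`; drop that option if your folder layout differs.

```shell
#!/usr/bin/env bash
# Sketch only: all arguments are placeholders; substitute the bucket name from
# the table and the project / data provider names from your welcome email.

# Build the S3 prefix for one data type in the raw submissions bucket.
raw_submissions_uri() {
  local bucket="$1" project="$2" provider="$3" data_type="$4"
  printf 's3://%s/%s/%s/%s/\n' "$bucket" "$project" "$provider" "$data_type"
}

# List the objects under that prefix using the sdc-token profile.
# --recursive descends into subfolders (assumed here to be the date-labelled ones).
list_uploads() {
  aws s3 ls "$(raw_submissions_uri "$@")" --recursive --profile sdc-token
}
```

A call would look like `list_uploads "<raw submissions bucket name>" "<project name>" "<data provider name>" "<data type name>"`. If the listing does not show a file you just uploaded, the background move from the ingest bucket may not have finished yet.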
