Frequently Asked Questions

Several datasets are available within the SDC platform upon request. Once you are logged in, click on ‘Datasets’ in the top menu.

All the available datasets are listed under ‘SDC Datasets’.

To request access to a dataset, click on the ‘Request’ button. A form will pop up. Fill out the form and click on the ‘Send Request’ button.

The request is sent to the SDC support team, and access to the requested dataset is
granted upon approval.

Click on the name of a dataset to see the README for that dataset displayed below it.

Click on ‘Workstations’, then click on the ‘Launch’ button of the workstation you want to access. Note: a workstation must be started before you can log in. To start a workstation, click on ‘Start’.

You will be prompted for a username and password to log in to your workstation.

You can store your data in your team/individual bucket. Please refer to https://securedatacommons.atlassian.net/wiki/spaces/DESK/pages/2224128024/RT+Guide+Chapter+2+Initial+Setup+and+Validation#Upload-User-Data-to-S3-Bucket-through-Portal

Please refer to ‘Upload User Data to S3 Bucket Through Portal’ to bring your own
datasets/algorithms to your workstation:

https://securedatacommons.atlassian.net/wiki/spaces/DESK/pages/2224128024/RT+Guide+Chapter+2+Initial+Setup+and+Validation#Upload-User-Data-to-S3-Bucket-through-Portal

Follow the steps below to publish your datasets/algorithms and share them with other SDC users.

  1. Navigate to the Datasets page.

  2. Click on the Publish button for the dataset/algorithm you wish to publish.

  3. In the pop-up window, the Type drop-down menu offers two options: Dataset or Algorithm.

  4. If Dataset is selected from the Type drop-down menu:

    a. Name - The name you wish to give your dataset. Users will see your dataset
    under this name in the SDC Datasets section.
    b. Description - Provide a short description so users can get an idea of what your
    dataset contains.
    c. File/folder name - Name of the file or folder where your dataset resides in your S3
    bucket. The SDC support team needs this information to publish the dataset and
    make it available to other users.
    d. Readme / Data dictionary file name - This file should provide detailed instructions
    about your dataset, how it was created, and any relevant information that helps users
    understand and use the dataset. Save this file in your home folder relative to the
    dataset file/folder name.
    e. Geographic scope - Indicate the geographic scope of your dataset, for example
    whether it covers a specific state, region, or country.
    f. Start/End Date for data availability - Provide the start and end dates of the data
    contained in your dataset. For example, your dataset may contain data from March
    2017 to August 2017.

  5. If Algorithm is selected from the Type drop-down menu:
    a. Name - Enter the name for your algorithm. Users will see your algorithm under this
    name in the SDC Datasets section.
    b. Description - Provide a short description of your algorithm.
    c. File/Folder name - Name of the file or folder where your algorithm resides in your S3
    bucket. The SDC support team needs this information to publish the algorithm
    and make it available to other users.
    d. Readme / Data dictionary file name - This file should provide detailed instructions
    about your algorithm, how it was created, and any relevant information that helps users
    understand and use the algorithm. Save this file in your home folder relative to the
    algorithm file/folder name.
    e. Programming Tools/language - Provide details of the programming tools and/or
    languages that were used to create this algorithm, so users can use the same tools to
    run your program.
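
Before opening the Publish form, it can help to gather the required details in one place. The sketch below is only an illustrative checklist in Python that mirrors the fields described in steps 4 and 5. It is not part of the SDC portal and does not call any SDC API. The class name `PublishRequest`, the method `problems`, the field names, and the example values are all hypothetical, and the assumption that every field must be filled in is ours rather than a documented portal rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PublishRequest:
    """Hypothetical checklist mirroring the Publish form (not an SDC API)."""
    kind: str                      # the form's Type: "Dataset" or "Algorithm"
    name: str
    description: str
    file_or_folder_name: str       # location inside your S3 bucket
    readme_file_name: str
    geographic_scope: Optional[str] = None    # Dataset only
    start_date: Optional[date] = None         # Dataset only
    end_date: Optional[date] = None           # Dataset only
    programming_tools: Optional[str] = None   # Algorithm only

    def problems(self) -> List[str]:
        """Return a list of fields that still need attention (assumed rules)."""
        found = []
        if self.kind not in ("Dataset", "Algorithm"):
            found.append("kind must be 'Dataset' or 'Algorithm'")
        for label in ("name", "description",
                      "file_or_folder_name", "readme_file_name"):
            if not getattr(self, label).strip():
                found.append(f"{label} is empty")
        if self.kind == "Dataset":
            if not self.geographic_scope:
                found.append("geographic_scope is empty")
            if self.start_date is None or self.end_date is None:
                found.append("start_date and end_date are required")
            elif self.start_date > self.end_date:
                found.append("start_date is after end_date")
        if self.kind == "Algorithm" and not self.programming_tools:
            found.append("programming_tools is empty")
        return found


# Example values are made up for illustration.
request = PublishRequest(
    kind="Dataset",
    name="Example trips",
    description="Short description of the example dataset.",
    file_or_folder_name="example_trips/",
    readme_file_name="README.md",
    geographic_scope="Example state",
    start_date=date(2017, 3, 1),
    end_date=date(2017, 8, 31),
)
print(request.problems())
```

An empty list means every field in the checklist has a value. The values themselves still have to be entered in the pop-up form by hand.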

Sample queries are provided for each of the datasets on a GitLab page (accessed from your SDC workstation) we have set up for code sharing:
https://gitlab.prod.sdc.dot.gov/Commons/