Introduction and Document Overview
The Secure Data Commons (SDC) is a United States Department of Transportation (U.S. DOT) sponsored, cloud-based analytical sandbox designed to widen access to sensitive transportation datasets, with the goal of advancing the state of the art in transportation research and in state and local traffic management.
The SDC stores sensitive transportation data made available by participating data providers and grants approved researchers access to these datasets. The SDC also provides access to open-source tools and allows researchers to collaborate and share code with other system users.
The SDC is a research environment that allows users to conduct analyses and to develop and test new tools and software products. It is not intended to be an alternative to any local jurisdiction’s traffic management center or local data repository. The existing SDC provides users with the following data, tools, and features:
Data: The SDC currently ingests several datasets, and additional datasets will be added to the environment over time. Users can also bring their own data into the environment to use alongside the Waze data.
Tools: The environment provides access to open-source tools including Python, RStudio, Microsoft R, SQL Workbench, Power BI, and Jupyter Notebook. These tools are available on a virtual machine in the SDC, enabling data analytics in the cloud.
Functionality: Users can access and analyze data within the SDC, save their work to a virtual machine, and publish processes and results to share with others.
Roles
Data Providers: These are entities that provide the data hosted on the SDC. The data provider establishes the data protection needs and acceptable use terms for the data analysts.
Researchers: These are users who conduct analyses of the datasets hosted on the SDC. Note that researchers can bring their own data and tools into the SDC system.
This document provides guidance for data providers.
A similar guide has been prepared for researchers and can be accessed here: RT Guide: Researchers' User Guide
Data Agreement
The data agreement is an agreement between the SDC and the data provider. All new data providers are required to fill out the form below:
DP Form: Data Provider Agreement
This agreement covers the following points:
Dataset Information
Contextual Description of Dataset
Data Source
Size and Frequency of the Dataset
Data Ingest Requirements
Data with Personally Identifiable Information (PII)
Confidential Business Information
Other Sensitive Data/Restrictions
Data Use Agreement
Link to Data Dictionary
Quality Assurance Documentation
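A data provider preparing the form may find it useful to track which of the points above still need content. The following is a minimal sketch of such a checklist, not part of the SDC or of the official form: the class name, field names, and helper function are hypothetical and simply mirror the list of points above.

```python
from dataclasses import dataclass, fields


@dataclass
class DataProviderAgreementDraft:
    """Hypothetical working draft; one text field per point in the agreement."""
    dataset_information: str = ""
    contextual_description: str = ""
    data_source: str = ""
    size_and_frequency: str = ""
    data_ingest_requirements: str = ""
    personally_identifiable_information: str = ""
    confidential_business_information: str = ""
    other_sensitive_data_restrictions: str = ""
    data_use_agreement: str = ""
    data_dictionary_link: str = ""
    quality_assurance_documentation: str = ""


def missing_sections(draft: DataProviderAgreementDraft) -> list[str]:
    """Return the names of the points that are still empty or whitespace-only."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name).strip()]


# Example: a draft with only the data source filled in (placeholder text).
draft = DataProviderAgreementDraft(data_source="Example provider feed")
for name in missing_sections(draft):
    print("Still to complete:", name)
```

Keeping one field per point makes it easy to confirm that nothing is left blank before the form is submitted.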