RT Guide: Chapter 4, Exporting Datasets & Tables from the SDC
Exporting Datasets from the SDC
Researchers can export data from the SDC system, subject to the compliance and data
usage policies set forth by the Data Provider.
There are two types of Researchers:
General Researcher: This type of Researcher must provide justification to the Data Provider for each data product that they want to export out of the SDC system. The intent is to ensure that the Data Provider has oversight of the exported data. This type of Researcher can also request trusted status from the Data Provider while filling out the approval form.
Trusted Researcher: This type of Researcher already has trusted status, which is granted by the Data Provider. The intent is to reduce the effort of exporting data products of analyses out of the SDC system. A trusted user has a pre-existing and approved relationship with the Data Provider. Trusted status is typically granted in extremely limited cases, at the discretion of the Data Provider.
Once Researchers have finished creating derived datasets, whether by working on the SDC datasets
or by combining them with other datasets that they import into the system, they can export the derived datasets or share them with other team members.
Navigate to Datasets Tab
After logging in to the SDC portal, navigate to the My resources > My Data tab, as shown in the figure below. The highlighted button is used to export data out of the SDC.
Request to Export Data
The Researcher needs to follow these steps to export the data of their analysis from the SDC system to support their research:
Each Researcher is part of a team bucket, which is displayed in the Datasets section.
When ready to export, Researchers can select the file (or files) that they want to export out of the SDC system and place them in a separate staging folder (i.e., export_requests) in their team bucket. Researchers can request export of a file in this folder by clicking the export symbol for that file.
Please note that Researchers who want to print a hard copy of a document will need to export it from the SDC workstation to their local machine.
Once the export button is selected, a “Request to Export Data” dialog box will be displayed. The Researcher will then need to provide details such as the Project, Data Provider, and Data Type.
Select the “I have read and accept the Acceptable Use Policy” radio button.
Click the “Submit Request” button once finished.
Acceptable Use Policy
If not previously awarded 'Trusted User Status'* for the selected dataset, users must accept the Acceptable Use Policy for the export request to go through to the Data Provider. The form will not be submitted if the user declines.
(*See the Request Trusted User Status section below for more information.)
Upon successful submission, the export request will be sent to the appropriate Data Providers, who are responsible for accepting or rejecting export requests.
Once the Data Providers approve the request, Researchers will be able to download the dataset out of the SDC through the portal.
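The staging step above can be pictured with a short Python sketch. It is only an illustration: the function names are hypothetical, the folder name is taken from the export_requests example above, and the SDC portal itself handles the actual file placement and export request.

```python
from pathlib import PurePosixPath

# Folder name taken from the guide's example; adjust if your team bucket differs.
STAGING_FOLDER = "export_requests"


def staging_path(source_path: str, staging_folder: str = STAGING_FOLDER) -> str:
    """Return the path a file would have once placed in the staging folder."""
    name = PurePosixPath(source_path).name
    if not name:
        raise ValueError(f"no file name in {source_path!r}")
    return str(PurePosixPath(staging_folder) / name)


def is_staged(path: str, staging_folder: str = STAGING_FOLDER) -> bool:
    """True if the path sits directly inside the staging folder."""
    return PurePosixPath(path).parent == PurePosixPath(staging_folder)
```

For example, a file at analysis/output/summary.csv would be staged as export_requests/summary.csv, and only paths for which is_staged returns True would be candidates for an export request under this assumption.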
Download Your Approved Dataset
If your request has been approved by the Data Provider, you will see in the ‘My Data’ table that the ‘Request Export Status’ has been updated from ‘Submitted’ to ‘Approved’.
At this point, you can download your dataset by selecting the corresponding checkbox in the left-hand column of the ‘My Data’ table and then clicking the “Download Selected” button.
Note: The text on the “Download Selected” button will update to reflect the number of approved dataset files you select for download. For example:
Request Trusted User Status
If the user is not a trusted user, they have the option to request trusted status from the Data Owner/Data Steward. This will allow the Researcher to export the data immediately, as opposed to waiting for review and approval from the Data Owner/Steward. Trusted status is typically granted in extremely limited cases, at the discretion of the Data Owner/Steward.
1. The form to request Trusted User Status can be opened from the Request Center. Click on the ‘Request Trusted User Status’ button to open the dialog box.
2. The Researcher will then need to select the details of the Project, Data Provider, and Data Type for which they would like to request Trusted Status, and enter the reason for the request in the ‘Justification’ field.
3. The Researcher must also indicate that they have read and accept the Acceptable Use Policy.
4. Once all the required information has been selected and entered, the “Submit Request” button will activate so the request can be submitted to the Data Owner/Data Steward.
Exporting Tables to the Edge Database
Researchers can also export summarized, aggregated and/or analysis results data, which do not contain sensitive information, to the public Edge database. This enables connections from the Edge database to local applications and/or other public data sources which are located outside of your SDC workstation.
Before exporting a new table, Researchers are required to submit a request for review and approval by the Data Steward. Data Steward review and approval is also required for any change to a previously approved table’s schema. In this case, a new Table Export request should be submitted to identify the change.
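Whether a schema change warrants a new request can be checked before submitting. The following Python sketch compares the column definitions of a previously approved table with the current ones. The function names and the {column: type} representation are assumptions made for illustration and are not part of the SDC tooling.

```python
def schema_changes(approved: dict, current: dict) -> dict:
    """Compare two {column_name: data_type} mappings and list the differences."""
    shared = set(approved) & set(current)
    return {
        "added": sorted(set(current) - set(approved)),
        "removed": sorted(set(approved) - set(current)),
        "retyped": sorted(c for c in shared if approved[c] != current[c]),
    }


def needs_new_request(approved: dict, current: dict) -> bool:
    """True if the schema differs in any way from the approved one."""
    return any(schema_changes(approved, current).values())


# Hypothetical column definitions, for illustration only.
approved_schema = {"jam_id": "bigint", "city": "text", "delay_seconds": "integer"}
current_schema = {"jam_id": "bigint", "city": "text", "delay_seconds": "numeric", "state": "text"}

changes = schema_changes(approved_schema, current_schema)
```

In this example, changes reports one added column (state) and one retyped column (delay_seconds), so needs_new_request returns True and a new Table Export request would be appropriate.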
Submit a Request to Export Table Data to the Edge Database
To submit a request to export your table data:
Navigate to the ‘Request Center’. This is located just below the ‘Request Trusted User Status’ button.
Click the ‘Request to Export Table to Edge Database’ button to open a request form.
Fill in the required fields on the form.
The name of your database will appear pre-populated in the form.
Provide the name of the table to be exported to the public Edge database.
E.g. wazedata_test01
Select from the cascading drop-down menus to identify the primary Project/Dataset, Data Provider, and Sub-Dataset/DataType used to create your table.
E.g. Waze (Jams, Irregularity or Alert)
List any additional data sources that are used to create your table dataset.
Provide justification for your request.
Click the check box to indicate your review of and agreement with the ‘Acceptable Use Policy’.
Once all of the required information has been provided, click the 'Submit' button. This will send your request to the Data Owner/ Data Steward for review.
When the Data Steward completes their review process, you will receive an email notification confirming whether your request was approved or denied.
Connect to Databases Using DBeaver
The Edge Database feature is architected with AWS Aurora (PostgreSQL compatible) database instances.
Endpoint connection strings for both the internal and external (Edge) databases will be provided to Researchers by the Enablement Team.
Complete the following prerequisite steps to establish a JDBC connection to either Aurora database endpoint using DBeaver.
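For reference, a PostgreSQL JDBC connection string has the form jdbc:postgresql://host:port/database. DBeaver assembles it from the values entered in the connection dialog, so the Python sketch below is purely illustrative. The function name is hypothetical, and the host and database names are placeholders, not real SDC endpoints.

```python
from urllib.parse import quote


def jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL from a bare hostname, a port, and a database name."""
    if not host or "/" in host or ":" in host:
        raise ValueError("host must be a bare hostname without slashes or colons")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    # Percent-encode the database name so spaces or symbols do not break the URL.
    return f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"


# Placeholder values; use the endpoint details supplied by the Enablement Team.
example_url = jdbc_url("edge.example.org", 5432, "mydb")
```

Here example_url evaluates to jdbc:postgresql://edge.example.org:5432/mydb.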
Configure PostgreSQL Driver
Open DBeaver.
In the DBeaver menu, open Database → Driver Manager.
In the Driver Manager dialog box, select PostgreSQL, then click Edit…
In the Edit Driver ‘PostgreSQL’ dialog box, click the Libraries tab.
Select and delete the following:
org.postgresql:postgresql:RELEASE
net.postgis:postgis-jdbc:RELEASE
net.postgis:postgis-geometry:RELEASE
Click Add Folder.
In the Open driver directory dialog box, navigate to:
C:\Users\Public\JDBC Drivers\PostgreSQL
and click Select Folder.
Click Find Class.
From the Driver Class dropdown, select
org.postgresql.Driver
Click OK.
Click Close.
Create a New Database Connection to the Edge Database
1. In the DBeaver toolbar, click the New Database Connection button.
2. In the Connect to a database/Select your database dialog box, search for and select PostgreSQL.
3. Click Next.
4. In the Connect to a database/Connection Settings dialog box, on the Main tab, enter the Host (endpoint), Port, Database, Username and Password that were provided to you by the Enablement Team.
5. Click Test Connection.
You should see a new Connection test dialog box giving information about the server, similar to the following:
If you get an error, re-check the information you entered in step 4.
If you continue to get an error, please contact the Enablement Team via the Service Desk.
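Before contacting the Service Desk, it can help to check whether the endpoint is reachable at all from your machine. The sketch below is one possible way to do that with the Python standard library; the function name is hypothetical. It only tests that a TCP connection can be opened and says nothing about credentials or the database name.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve.
        return False


if __name__ == "__main__":
    # Placeholder endpoint; substitute the host and port given to you.
    print(can_reach("edge.example.org", 5432))
```

If this returns False while the host and port are correct, the problem is more likely network access than the values entered in DBeaver, which is useful detail to include in a Service Desk ticket.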
Connecting to Tableau for Visualizations
This guide assumes you have Tableau Desktop installed on a DOT system with access to the DOT Common Operating Environment. Note: this will not work with Tableau Reader.
After launching Tableau Desktop, choose to connect to a server, select More, and then select ‘PostgreSQL’ (the same options appear if you select ‘Connect to Data’ from the workbook view):
After selecting PostgreSQL, you will be required to fill in the connection details. As with the DBeaver connection above, you will need to provide the server, port, database name, username, and password. Note: the server is entered without any slashes ( / ) or colons ( : ).
server: edgedb-externaldb.cluster-cuwig46oq690.us-east-1.rds.amazonaws.com
port: 5432
database, username, and password: provided to you by the SDC support team
After clicking ‘Sign In’, Tableau will attempt to connect to the database, and you will be presented with the option to select tables for use in your visualizations.
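Because Tableau expects the server as a bare host name, an endpoint copied from a JDBC URL or a host:port pair needs trimming first. The following Python sketch shows one way to do that. The function name is hypothetical and the example endpoints are placeholders.

```python
from urllib.parse import urlsplit


def bare_server(endpoint: str) -> str:
    """Reduce an endpoint string to a host name with no scheme, port, or path."""
    text = endpoint.strip()
    if text.startswith("jdbc:"):
        text = text[len("jdbc:"):]
    if "//" not in text:
        # Prefix '//' so urlsplit treats the text as a network location.
        text = "//" + text
    host = urlsplit(text).hostname  # note: urlsplit lowercases the host name
    if not host:
        raise ValueError(f"no host name found in {endpoint!r}")
    return host


# Placeholder inputs, for illustration only.
from_jdbc = bare_server("jdbc:postgresql://edge.example.org:5432/mydb")
from_pair = bare_server("edge.example.org:5432")
```

Both calls return edge.example.org, which is the form to enter in the server field. The port goes in its own field.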