...

Launch SQL Workbench by double-clicking the SQL Workbench shortcut on the desktop.

...

Create a Redshift connection profile to connect to Waze data:

  1. Create a new connection profile by selecting the icon in the top left corner of the
    “Select Connection Profile” window.

  2. Select “Amazon Redshift Driver” from the Driver drop-down.

  3. Update the URL field with the Redshift URL provided in the email from the support
    desk detailing your Redshift login credentials (see the example URL format after this
    list).

  4. Provide your username and password received in the welcome email.

  5. Click the Test button at the bottom to test the connection. A pop-up dialog will
    appear indicating whether the connection succeeded or failed.
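
For reference, a Redshift JDBC URL typically follows the format shown below. The host comes
from the credentials email; the port (5439) and database name (dot_sdc_redshift_db) match the
Python example later in this document, but confirm the exact values against your email.


jdbc:redshift://[host address]:5439/dot_sdc_redshift_db
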

...

Connecting to Waze Data in Redshift Using Python

NOTE: When you are granted access to Waze data, the SDC support team creates a new
Redshift user for you, assigns it a Redshift password, and emails you information about the
Redshift host you will connect to. This email provides the redshiftHost, userName, and
userPassword values shown below. Your Redshift credentials are used only for connecting to
Redshift and NOT for accessing the portal, which uses your separate SDC credentials.

Important: The default version of Python installed on the SDC Windows Workstations is
v2.7.4. Two Python modules, psycopg2 and numpy, must be installed before you can connect to
Redshift with Python using the example code below. To install these modules, open a Windows
Command Prompt and enter the following two commands:


C:\Users\username> pip install psycopg2
C:\Users\username> pip install numpy


The above "pip install …" command(s) only need to be run ONCE on the SDC Windows
Workstation. Once the Python modules are installed, they remain available, even across reboots
of the workstation.
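
If you want to confirm that both modules are available, an optional check is to run the
following command; if it returns without an error, the modules are installed:


C:\Users\username> python -c "import psycopg2, numpy"
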
To test Python connectivity to Redshift, open the IDLE Python editor and execute the following:

Code Block
from __future__ import print_function
import psycopg2
import numpy

# Connection settings from the Redshift credentials email
dbName = 'dot_sdc_redshift_db'
redshiftHost = '[host address]'
redshiftPort = 5439
userName = '[username]'
userPassword = '[password]'

# Example queries against the Waze alerts table
# query = 'select * from dw_waze.alert limit 10;'
query = ("select * from dw_waze.alert "
         "where alert_type='ACCIDENT' and city = 'Severance, CO'")

# Open a connection and run the query
conn = psycopg2.connect(
    dbname=dbName,
    host=redshiftHost,
    port=redshiftPort,
    user=userName,
    password=userPassword)
cursor = conn.cursor()
cursor.execute(query)
result = cursor.fetchall()
result = numpy.array(result)

# Print selected columns from each returned row
# print(result)
for r in result:
    print(r[1], r[8], r[18], r[19], r[22], sep='\t')
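
If you need to filter on values supplied at run time (for example, a different alert_type or
city), psycopg2 also supports parameterized queries, which avoids hand-building quoted SQL
strings. The snippet below is a minimal sketch of that pattern; it reuses the conn and cursor
from the example above, and the filter values are just placeholders.

Code Block
# A minimal sketch of a parameterized query with psycopg2.
# The filter values below are illustrative placeholders.
paramQuery = ("select * from dw_waze.alert "
              "where alert_type = %s and city = %s")
cursor.execute(paramQuery, ('ACCIDENT', 'Severance, CO'))
for r in cursor.fetchall():
    print(r[1], r[8], sep='\t')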

For further information and examples, refer to the internal SDC GitLab collaboration site.

Connecting to the Hadoop Hive Metastore

Launch SQL Workbench by double-clicking on the SQL Workbench shortcut on the desktop:

  1. Create a new connection profile by selecting the icon in the top left corner of the
    “Select Connection Profile” window.

  2. Select “Hive JDBC” from the Driver drop-down.

  3. Update the URL field with the Hive URL (see the example URL format after this list).

  4. Provide your username and password received in the welcome email. NOTE: You are
    not required to enter the “@internal.sdc.dot.gov” portion of your username to log on.

  5. Click the Test button at the bottom to validate your connection. A pop-up dialog will
    appear indicating whether the connection succeeded or failed. If the connection
    continues to fail, contact the SDC support desk for assistance.
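
For reference, a HiveServer2 JDBC URL generally follows the format shown below. The actual
host, port, and database values are not listed in this document, so substitute the Hive URL
information you have been given for the placeholders.


jdbc:hive2://[host address]:[port]/[database]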

Update Data Formatting Settings in SQL Workbench

...