Using Python

SDC Data Analysts use the Python programming language to analyze SDC data, so they need to be able to pull data from the SDC Data Lake into Python for analysis.

...

Working with Version control

For more information on version control, see RT Guide: Chapter 5, Using GitLab.

Accessing SDC data from your Workstation: using Python to access data

SDC researchers can access the data in two ways:

  • AWS CLI, which is installed on every SDC machine and can be used to access data.

  • Cyberduck, which users can use to download data to their SDC machine. For a complete description of how to use Cyberduck, see RT Guide: Cyberduck User Guide.

For additional questions, please email sdc-support@dot.gov.
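Once you have access, the data still has to be loaded into Python. The following is a minimal sketch, not an official SDC procedure. It assumes the data sits in an Amazon S3 bucket that your workstation's AWS credentials can read (this page does not state that explicitly), and that the `boto3` package is installed in your environment. The function name `read_csv_rows_from_s3`, the bucket name, and the object key are hypothetical placeholders.

```python
import csv
import io


def read_csv_rows_from_s3(s3_client, bucket, key):
    """Fetch one CSV object and return its rows as a list of dicts.

    s3_client is expected to behave like boto3.client("s3"); the bucket
    and key are supplied by the caller (nothing is hard-coded here).
    """
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Example usage (assumption: boto3 is installed and credentials are configured;
# the bucket and key below are placeholders, not real SDC locations):
#
#   import boto3
#   s3 = boto3.client("s3")
#   rows = read_csv_rows_from_s3(s3, "example-bucket", "example/path/data.csv")
#   print(rows[:5])
```

Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at real data.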

Using Anaconda

Anaconda Navigator is a desktop graphical user interface (GUI) included in Anaconda® Distribution that allows you to launch applications and manage conda packages, environments, and channels without using command line interface (CLI) commands. Navigator can search for packages on http://Anaconda.org or in a local Anaconda Repository.

How to start an Anaconda session.

  • After logging in to your SDC Workstation, open the Anaconda Navigator app.

  • By default, all applications available to launch or install within Navigator are displayed on the Home page.

...

Working with Notebooks.

The Jupyter Notebook application allows you to create and edit documents that display the input and output of a Python or R language script.

...

  • You can now run your code and commands in the notebook using Python 3.
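As a first cell, it can help to confirm which Python interpreter the notebook is actually using. The sketch below relies only on the standard library; the helper name `describe_kernel` is made up for illustration.

```python
import platform
import sys


def describe_kernel():
    """Return basic facts about the Python interpreter running this notebook."""
    return {
        "version": platform.python_version(),
        "major": sys.version_info.major,
        "executable": sys.executable,
    }


kernel = describe_kernel()
print(f"Python {kernel['version']} at {kernel['executable']}")
```

If the major version is not 3, the notebook was probably started with a different kernel than intended.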

...

Working with Virtual environments

With Anaconda Navigator, you can create, export, list, remove, and update environments that have different versions of Python and/or other packages installed. Switching or moving between environments is called activating the environment. Only one environment is active at any point in time.

...

  • You have successfully created your own environment. You can install packages and libraries specific to that environment, which keeps each project's scripts and dependencies separate from those of other projects.

  • For more documentation on managing environments, see https://docs.anaconda.com/free/navigator/tutorials/manage-environments/
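Because only one environment is active at a time, it is sometimes useful to check from inside Python which one that is. This is a small sketch under the assumption that the environment was activated by conda, which sets the `CONDA_DEFAULT_ENV` variable. If it was not, the name comes back as `None`. The helper name `active_environment` is hypothetical.

```python
import os
import sys


def active_environment():
    """Report the active conda environment name (if any) and interpreter prefix."""
    return {
        # Set by conda on activation; None when no conda environment is active.
        "name": os.environ.get("CONDA_DEFAULT_ENV"),
        # Directory of the Python installation currently in use.
        "prefix": sys.prefix,
    }


env = active_environment()
print(f"Environment: {env['name']} ({env['prefix']})")
```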

Benefits of Anaconda for the SDC Data Analysts

  • Anaconda comes with many pre-installed packages commonly used in machine learning and data science, which saves time and effort because you do not need to install each package separately.

  • With Anaconda you can create separate environments for different projects.

  • You can create notebooks within a virtual environment.
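To see which packages are available in the current environment without importing them, something like the following sketch can be used. The package names listed are only illustrative examples of common data science libraries, not a statement of what is installed on SDC workstations, and `check_packages` is a hypothetical helper name.

```python
import importlib.util


def check_packages(names):
    """Map each top-level module name to True if it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Illustrative list; adjust to the packages your project needs.
wanted = ["numpy", "pandas", "matplotlib", "sklearn"]
for name, present in check_packages(wanted).items():
    print(f"{name}: {'available' if present else 'missing'}")
```

Running the same check in two different environments is a quick way to see how their installed packages differ.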

How to create and run your first Python program (Hello World, for example)

https://docs.anaconda.com/free/anaconda/getting-started/hello-world/
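For orientation, a first program can be as small as the sketch below; the linked guide remains the reference for the actual steps in Anaconda. The function name `greeting` is chosen here only for illustration.

```python
def greeting(name="World"):
    """Build the greeting text so it can be reused or checked."""
    return f"Hello {name}"


print(greeting())
```

Save it as a `.py` file or paste it into a notebook cell and run it.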

References:

...