Data Science

Although the topics below may seem like a lot, you do not need to be an expert in any of them to do research. A little familiarity with each topic will get you most (if not all) of the way there, so I highlight only the most important points.

  • Conda
    • essential for installing/managing software
  • Command-line interface
    • essential for interacting with remote servers, e.g. Princeton’s High Performance Computing (HPC) clusters
  • Python
    • an easy-to-read, general-purpose programming language that can help you at every stage of your research, including data gathering, statistical modeling, and visualization!
    • one of the most popular languages in data science!

The following topics are not as essential as the ones above, but I strongly recommend learning them:

  • Visual Studio Code
    • code editor that makes life easier
    • intuitive, requires minimal setup, and has many useful extensions
  • Git and GitHub
    • the standard tools for version control (saving snapshots of your code) and collaboration