Miscellaneous

In this section, we present additional resources and a short list of relevant literature.

Scientific Programming

Anaconda Distribution

The open-source Anaconda Individual Edition is the easiest way to perform Python/R data science and machine learning on Linux, Windows, and Mac OS X. With over 19 million users worldwide, it is the industry standard for developing, testing, and training on a single machine.

JupyterLab

JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data. It is flexible: you can configure and arrange the user interface to support a wide range of workflows in data science, scientific computing, and machine learning. It is also extensible and modular: you can write plugins that add new components and integrate with existing ones.

Resources

Version Control

The Introduction to Git and GitHub course on Coursera offers a well-rounded introduction to version control using Git and GitHub. It teaches the fundamentals as well as more advanced features.

The GitKraken video tutorial series provides you with the knowledge needed to start using Git. It covers the absolute basics as well as advanced procedures involving Git. The course consists of 15 concise videos split into three difficulty levels, all of which can be watched in under an hour.

SciPy

SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering. In particular, these are some of the core packages: numpy, scipy, matplotlib, and pandas.
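
As a rough illustration of how two of these packages fit together, the following sketch uses numpy for array arithmetic and scipy for numerical integration and optimization. The variable names and the toy problems are chosen for illustration only.

```python
import numpy as np
from scipy import integrate, optimize

# numpy: vectorized arithmetic on an evenly spaced grid.
x = np.linspace(0.0, 1.0, 5)
squares = x ** 2

# scipy.integrate: integrate sin(t) over [0, pi]; the exact value is 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# scipy.optimize: minimize (t - 3)^2, whose minimum lies at t = 3.
result = optimize.minimize_scalar(lambda t: (t - 3.0) ** 2)

print(squares)
print(area, result.x)
```

matplotlib and pandas build on the same numpy arrays, so results like these can be plotted or tabulated without conversion.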

SciPy Lecture Notes

The SciPy Lecture Notes provide an excellent starting point for everyone interested in scientific programming in Python. They cover the main scientific packages of the Python ecosystem, namely numpy, scipy, and matplotlib. Each chapter corresponds to a one- to two-hour course, with the level of expertise increasing from beginner to expert.

Resources

Software Carpentry

Software Carpentry teaches researchers the computing skills they need to get more done in less time and with less pain. It offers lessons on topics such as Python and R programming and the Unix shell.

statsmodels

statsmodels is a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring statistical data. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct.

Check out the online documentation.

Textbooks

Ramalho, L. (2015). Fluent Python: Clear, concise, and effective programming. O’Reilly Media, Inc., Sebastopol, CA.

Rossant, C. (2018). IPython interactive computing and visualization cookbook. Packt Publishing, Birmingham, England.

VanderPlas, J. (2016). Python data science handbook. O’Reilly Media, Inc., Sebastopol, CA.

Software Engineering

Testing

Automated testing ensures the quality of our research software.
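
As a small sketch of what such a test can look like, the example below checks a hypothetical helper function, `growth_rate`, against a known value and against an invalid input. The function and test names are illustrative and not part of any particular project.

```python
def growth_rate(previous: float, current: float) -> float:
    """Return the relative change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous


def test_growth_rate_known_value():
    # Going from 100 to 110 is a 10 percent increase.
    assert abs(growth_rate(100.0, 110.0) - 0.10) < 1e-12


def test_growth_rate_rejects_zero_baseline():
    # The function should fail loudly instead of dividing by zero.
    try:
        growth_rate(0.0, 1.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


# A test runner such as pytest would discover and call these functions
# automatically; calling them by hand keeps the sketch dependency-free.
test_growth_rate_known_value()
test_growth_rate_rejects_zero_baseline()
```

Testing both a typical case and a failure case documents the intended behavior and guards against regressions when the code changes.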

Documentation

Proper documentation ensures that the research software is used as intended and helps in the recruitment of new contributors and users.
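
One common way to document intended usage in Python is a docstring with a runnable example. The sketch below assumes a hypothetical function, `standardize`, and uses the NumPy docstring layout as one possible convention; the standard library's doctest module then checks that the documented example still matches the code.

```python
import doctest


def standardize(values):
    """Rescale a sequence of numbers to mean zero and unit variance.

    Parameters
    ----------
    values : sequence of float
        At least two numbers that are not all equal.

    Returns
    -------
    list of float
        The standardized values, in the original order.

    Examples
    --------
    >>> standardize([1.0, 2.0, 3.0])
    [-1.0, 0.0, 1.0]
    """
    n = len(values)
    mean = sum(values) / n
    # Sample variance (divides by n - 1).
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = variance ** 0.5
    return [(v - mean) / std for v in values]


# Re-run the example from the docstring; mismatches are printed,
# and nothing is printed when the example still holds.
doctest.run_docstring_examples(standardize, globals(), name="standardize")
```

Keeping examples executable in this way helps the documentation stay in step with the code.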

Continuous Integration

Continuous Integration is a development practice where developers integrate code into a shared repository frequently, preferably several times a day. Each integration can then be verified by an automated build and automated tests.
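
The automated verification step can be pictured as a script that runs the test suite and reports success or failure through an exit code. The following is a minimal sketch using the standard library's unittest module; the `TestArithmetic` class and the `run_checks` function are hypothetical stand-ins for a project's real suite and CI entry point.

```python
import unittest


class TestArithmetic(unittest.TestCase):
    """Hypothetical stand-in for a project's real test suite."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_checks() -> int:
    """Run the test suite and return a shell-style exit code (0 = success)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArithmetic)
    outcome = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if outcome.wasSuccessful() else 1


exit_code = run_checks()
# A CI job would typically finish with sys.exit(exit_code), so that a
# failing test marks the integration as broken.
print("exit code:", exit_code)
```

A CI service would run a script like this on every push to the shared repository, so problems surface soon after they are introduced.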