ML/AI Self Study Links

If you are aware of good online resources that you think we should include, please email us.

The following websites are useful for people interested in machine learning with Python.

Minnesota Supercomputing Institute

The Minnesota Supercomputing Institute has a YouTube channel with a number of tutorials on it. In addition, they regularly offer tutorials and other events. Check out their webpage for current offerings.

Tutorials available on their YouTube channel include:


Below we provide a selection of resources for Python, a general purpose programming language that has become popular for use in data science. One reason for that popularity is the existence of a number of packages for Python that implement various mathematical, statistical, and machine learning-related functions. Another attractive feature of Python is the widespread use of Jupyter notebooks that allow for interactive programming in Python (as well as other languages). Jupyter notebooks are also an easy way to share code and create tutorials for various programming tasks. Of course, Python programs can also be created using standard editors used for programming languages. There will be more on this below.

General Python background

Getting Started with Python

The first task in getting started with Python is to get access. You can do this by installing it on a machine you have access to, such as your personal computer, or you can get access through a computing facility in your organization. 

Access Python without installing it on your personal machine

Option 1: Google Colab
Colab puts you directly into a Python notebook, which links to an introductory video and to a new Jupyter notebook. Other links in the notebook's sidebar lead to examples and further material. If you just want to see the webpage, click the Cancel button at the bottom of the notebook. Colab allows you to start coding in Python right away. The environment provides access to Python packages and GPUs, but it is not intended for running compute-intensive programs.

Option 2: MSI Jupyter Notebook Server
Those with an account at the Minnesota Supercomputing Institute can go directly to the Jupyter Notebook server. You must have an MSI account to use MSI resources. You must also connect either from a machine on a UMN network or via VPN. See the MSI website for additional documentation. (This link will also give you information about interactively running RStudio, MATLAB, and Mathematica.) You can perform moderate-size tasks using this option, but for programs that use more resources, you should create and submit jobs via the batch system at MSI.

Option 3: Code in the Cloud from Anaconda
Anaconda also provides an environment that you can use to install and use Python and its packages. If you just want to have access to a Jupyter notebook and play around with learning and light use of Python, this could be another possibility. You will have to create an account with Anaconda.

Install Python on your personal machine

We recommend installing Anaconda, as it installs Python along with other commonly used packages for scientific computing and data science. It also helps keep all your packages compatible with one another. As with R, MATLAB, and other languages, much of Python's functionality comes from its packages.

The Anaconda website also provides a webpage for Getting started with Anaconda.

Anaconda takes a fair amount of space on the disk and time to install, so if you have an older computer with limited disk space or a limited internet connection, you should consider lighter-weight options such as Miniconda.

Alternatively, you can download Python directly from its official website. In that case, you will probably want to use a package manager such as pip, which comes with Python when you install it this way. This gives you more control but requires more understanding of Python, its packages, and how they are managed.
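Whichever installation route you choose, a quick check that Python and a few common packages are available can save time later. The following is a minimal sketch, not an official procedure: the helper name check_environment and the default package list are illustrative assumptions, and you should substitute the packages you actually need.

```python
import sys
from importlib.util import find_spec


def check_environment(packages=("numpy", "scipy", "pandas", "matplotlib", "sklearn")):
    """Return a dict mapping each import name to True if it can be found.

    The default package list is only an example; edit it to match your needs.
    """
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Print the interpreter version, then the status of each package.
    print("Python version:", sys.version.split()[0])
    for name, available in check_environment().items():
        print(f"{name:12s} {'found' if available else 'missing'}")
```

Packages reported as missing can then be installed with your package manager (for example, conda or pip).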

Learning to program in Python

For those unfamiliar with Python 3 and Jupyter notebooks, this is a relatively easy environment to learn. Learning Python 3 will help you advance your knowledge of data analytics, as most big data platforms and data mining/machine learning projects require a working knowledge of Python. The following resources can help you learn Python and get started in using Python for machine learning.

Running Jupyter Notebooks

If you have installed Anaconda, you can run Anaconda Navigator and then click on the Jupyter notebook icon to start the Jupyter notebook. (Once you become familiar with Jupyter notebooks, you may want to run JupyterLab, a more advanced notebook interface.)

You can also log onto the MSI Jupyter notebook server. 

General Machine Learning

General Science & Engineering Machine Learning Resources

  • Papers with Code is an excellent resource for accessing the latest research papers in machine learning and AI, with accompanying code implementations. It’s ideal for researchers who want to stay up-to-date with state-of-the-art methods and easily access implementations for experimentation.
  • Kaggle is one of the most popular platforms for machine learning competitions, where researchers, students, and practitioners can participate in real-world problems ranging from image classification to natural language processing and tabular data. It’s ideal for learners who want hands-on practice by solving actual problems and competing against other data scientists.
  • SciML Open Source Software for Scientific Machine Learning: offers open-source, Julia-based code for learning physical systems using machine learning.
  • The Crunch Group @ Brown (YouTube channel): hosts weekly seminars on recent advancements in physics-informed machine learning.

Gaussian Processes

Gaussian Processes are a non-parametric supervised machine learning method for solving regression and probabilistic classification problems. The advantage of using a Gaussian Process (GP) is that it not only gives the prediction but also the uncertainty associated with a prediction as the variance of a Gaussian distribution. The experimental noise distribution can also be incorporated into the GP model. 
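As an illustration of the prediction-plus-uncertainty idea described above, here is a minimal sketch using scikit-learn's GaussianProcessRegressor. The synthetic sine data, the noise level, and the kernel choice (an RBF kernel plus a WhiteKernel term to model experimental noise) are assumptions made for this example, not recommendations for any particular problem.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic training data (assumed for illustration): a noisy sine curve on [0, 10].
rng = np.random.default_rng(0)
X_train = np.linspace(0.0, 10.0, 25).reshape(-1, 1)
y_train = np.sin(X_train).ravel() + rng.normal(0.0, 0.1, size=X_train.shape[0])

# RBF models the smooth signal; WhiteKernel models the measurement noise.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0)
gp.fit(X_train, y_train)

# Predict inside the data range and beyond it (x = 12 is an extrapolation).
X_test = np.linspace(0.0, 12.0, 7).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)

for x, m, s in zip(X_test.ravel(), mean, std):
    print(f"x = {x:5.2f}  prediction = {m:6.3f}  std = {s:.3f}")
```

The returned standard deviation is typically larger where the model extrapolates beyond the training data, which is the behavior that makes GPs attractive when uncertainty matters.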

Deep Learning with Python

  • Deep Learning, by Goodfellow, Bengio, and Courville. An online book that gives a good, although demanding, introduction to deep learning.
  • Dive into Deep Learning, by Zhang, Lipton, Li, and Smola. This interactive online book includes concepts, exercises, and code.
  • Practical Deep Learning for Coders. This interactive online book also includes concepts, exercises, and code. It now has a part 2, From Deep Learning Foundations to Stable Diffusion.
  • Deep Learning. This website features instructors Yann LeCun and Alfredo Canziani teaching a course on deep learning in Spring 2020 at the NYU Center for Data Science. It includes YouTube videos, slides, and Jupyter notebooks. The course concerns the latest techniques in deep learning and representation learning.
  • Welcome to the UvA Deep Learning Tutorials! The University of Amsterdam has a set of Python notebook tutorials for deep learning. These tutorial notebooks are fairly self-contained but are accompanied by videos available on YouTube.

Deep Neural Networks for Image Processing

Convolutional Neural Networks (CNNs) are a type of neural network used to model spatial (and sometimes temporal) data, most notably images. There are numerous types of CNN, as well as other deep learning architectures for image processing, e.g., Vision Transformers (ViT). 

  • “But What is a Convolution?” is a YouTube video by 3Blue1Brown introducing “Discrete convolutions, from probability to image processing and FFTs” with visualization.
  • From Convolutions to Neural Networks is a webpage that explains convolutions and how sets of convolutions combine to create a convolutional neural network.
  • Deep Residual Learning for Image Recognition. This is the original ResNet paper; ResNet (Residual Network) is a type of CNN. The paper is a good, thorough introduction to the subject.
  • OpenMMLab is an open-source computer vision algorithm system built on PyTorch. It supports a wide variety of models, including RetinaNet, Mask R-CNN, and others. It is well suited to researchers and practitioners who want to try out a range of object detection architectures.
  • PyTorch Image Models (timm) is a popular repository providing a wide variety of image classification models, including EfficientNet, ResNet, Vision Transformers (ViT), and many others. It’s an excellent resource for training image classifiers using state-of-the-art architectures.
  • CycleGAN is a PyTorch implementation of CycleGAN, a popular architecture for unpaired image-to-image translation tasks. It’s widely used in tasks such as style transfer and domain adaptation.
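To make the convolution operation behind CNNs concrete, the following NumPy sketch shows a discrete convolution in two settings: the probability distribution of the sum of two dice, and a small edge-detection filter applied to a toy image. The helper conv2d_valid, the 5x5 image, and the Sobel kernel are illustrative choices for this sketch. Note that deep learning frameworks usually implement the convolutional layer as a cross-correlation (no kernel flip) and learn the kernel values from data.

```python
import numpy as np

# Probability view: the distribution of the sum of two fair dice
# is the convolution of the two single-die distributions.
die = np.full(6, 1.0 / 6.0)
two_dice = np.convolve(die, die)  # 11 entries, for sums 2 through 12


def conv2d_valid(image, kernel):
    """Naive 2D convolution over the region where the kernel fully fits."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # a true convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


# Image view: a toy 5x5 image with a vertical edge (dark left, bright right).
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# Sobel kernel that responds to horizontal intensity changes.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

edges = conv2d_valid(image, sobel_x)
print("P(sum = 7):", two_dice[5])
print(edges)
```

The output is nonzero only where the kernel window straddles the edge and zero in the uniform region, which is the sense in which a convolution kernel acts as a local feature detector.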


Introduction to Kernel Density Estimation (KDE)

Data Science Techniques & Tools in Earth & Environmental Science

  • Earth Lab offers a variety of free books, workshops, and online courses specially designed for Earth Data Science. They cover a wide range of topics, from introductory programming to advanced machine learning techniques.


  • Bokeh: an introduction to Bokeh, a plotting library. It is less mature than matplotlib but has many useful features for interactive plots and supports fairly advanced visualizations.
  • Singular value decomposition: builds an intuition for the SVD, justifies why you might use the technique, and covers some of the formal details.
  • Importance sampling: a chapter of a book in progress by a statistics professor at Stanford that covers many aspects of the subject in good detail.
  • Browser neural network: an in-browser interactive neural network that introduces and visualizes the different aspects of a basic model. Good for getting a feel for how different design choices affect the output.
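For the singular value decomposition resource listed above, a short NumPy sketch may help build intuition for why the technique is useful: a matrix that is approximately low rank can be reconstructed accurately from only its largest singular values. The matrix sizes, the noise level, and the rank k = 2 are assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an 8x6 matrix that is rank 2 plus a small amount of noise.
A = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
A = A + 0.01 * rng.normal(size=(8, 6))

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values to get a rank-k approximation.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rel_error = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print("singular values:", np.round(s, 3))
print("relative error of rank-2 approximation:", rel_error)
```

The first two singular values dominate the rest, so the rank-2 reconstruction recovers almost all of the matrix; the same idea underlies uses of the SVD for compression and dimensionality reduction.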