Python Setup
For detailed instructions, see the Python Setup Instructions page.
Starter code is available in our public GitHub repository: https://github.com/tufts-ml-courses/cs136-24s-assignments
Python Resources
To gain some fundamental Python skills (assuming you already know another programming language), we recommend:
Jupyter Notebooks
For many in-class breakout sessions, you'll work through a provided notebook, distributed as a `.ipynb` file from our course starter code repository on GitHub.
Download this file and run it on your machine.
To launch a specific notebook file named MyNotebook.ipynb, here's what you'll do in your Terminal (Linux/Mac) or Command Prompt (Windows):
# Before we can start, be sure your current directory contains `MyNotebook.ipynb`
# First, activate our course conda environment
$ conda activate spr_2024s_env
# Second, launch the notebook server and direct it to open `MyNotebook.ipynb`
$ jupyter notebook MyNotebook.ipynb
# This should automatically open a browser and take you to an interactive notebook session. If it does not, open the `localhost:8888` link printed in the terminal.
For more help on launching a notebook, see the Jupyter Notebook documentation.
Jupyter Resources
If you don't know much about Jupyter, the resources below might be helpful:
How to download a Jupyter notebook and open it in your browser
'Play with Data in Jupyter' lessons by Lorena Barba
Self-Study Resources
Here are some useful resources to help you catch up if you are missing some of the prerequisite knowledge. Please contribute new resources by starting a topic on the class discussion forum.
Probability
- Key concepts:
  - Sum rule and product rule of probability
  - Bayes' theorem and associated algebra
  - Continuous and discrete random variables
- Litmus test:
  - Do the lecture notes from day01.pdf seem familiar? Do you have enough previous experience and mathematical sophistication to follow them easily?
- Possible resources:
  - Probability review notes from Prof. David Blei: http://www.cs.columbia.edu/~blei/fogm/2016F/doc/probability_review.pdf
  - Stanford CS229 notes on Gaussian distributions: http://cs229.stanford.edu/section/gaussians.pdf
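As a quick self-check on the key concepts above, here is a minimal sketch of the sum rule, product rule, and Bayes' theorem applied to a small discrete joint distribution. The table of probabilities is made up purely for illustration, and the variable names are not from the course materials.

```python
import numpy as np

# Hypothetical joint distribution p(X, Y) over two binary variables.
# Rows index X in {0, 1}; columns index Y in {0, 1}. Values are made up.
joint = np.array([[0.30, 0.10],
                  [0.20, 0.40]])

# Sum rule: p(X) = sum over Y of p(X, Y), and likewise for p(Y).
p_x = joint.sum(axis=1)
p_y = joint.sum(axis=0)

# Product rule: p(X, Y) = p(Y | X) p(X), so p(Y | X) = p(X, Y) / p(X).
p_y_given_x = joint / p_x[:, np.newaxis]

# Bayes' theorem: p(X | Y) = p(Y | X) p(X) / p(Y).
p_x_given_y = p_y_given_x * p_x[:, np.newaxis] / p_y[np.newaxis, :]

print("p(X):", p_x)
print("p(Y):", p_y)
print("p(X | Y), one column per value of Y:")
print(p_x_given_y)
```

Each column of `p_x_given_y` is a distribution over X, so it should sum to one. If every line here feels routine, the probability prerequisite is likely in good shape.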
First-order gradient-based optimization
- Key concepts:
  - Gradient descent
  - Learning rates
  - Difference between convex and non-convex functions for minimization
- Litmus test:
  - Could you fit a linear regression model via gradient descent? (see notebook below).
- Possible resources:
  - Convex Optimization overview for Stanford CS229: http://cs229.stanford.edu/section/cs229-cvxopt.pdf
  - Jupyter notebook on 'Linear Regression with NumPy' (fits linear model with gradient descent): https://www.cs.toronto.edu/~frossard/post/linear_regression/
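To make the litmus test concrete, here is a minimal sketch of fitting a linear regression model with batch gradient descent on mean squared error. It is not taken from the linked notebook: the function name `fit_linear_regression_gd`, the synthetic data, and the default learning rate and iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_regression_gd(X, y, learning_rate=0.1, n_iters=2000):
    """Minimize mean squared error with batch gradient descent.

    Hypothetical helper for illustration; not part of the course code.
    """
    n_examples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                         # shape (n_examples,)
        grad_w = (2.0 / n_examples) * (X.T @ residual)   # d(MSE)/dw
        grad_b = 2.0 * residual.mean()                   # d(MSE)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Synthetic data from a known linear model plus a little noise (made up).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
true_b = 0.5
y = X @ true_w + true_b + 0.01 * rng.normal(size=200)

w_hat, b_hat = fit_linear_regression_gd(X, y)
print("estimated weights:", w_hat)
print("estimated bias:", b_hat)
```

Because mean squared error is convex in `w` and `b`, a sufficiently small learning rate converges to the global minimum. Raising `learning_rate` well above the default on this data is a useful way to see the iterates diverge.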
Linear algebra
- Key concepts:
  - Matrix multiplication
  - Matrix inversion
  - Positive (semi-)definite matrices
  - Symmetric matrices
- Quick review fact sheets:
  - Fact sheet on positive definite matrices: https://sites.calvin.edu/scofield/courses/m355/handouts/definiteMatrices.pdf
  - Review sheet from Stanford's CS 229 Intro ML course: http://cs229.stanford.edu/summer2020/cs229-linalg.pdf
- Possible textbook-like resources:
  - Goodfellow et al.'s chapter on Linear Algebra: http://www.deeplearningbook.org/contents/linear_algebra.html
  - Immersive Linear Algebra: http://immersivemath.com/ila/
  - Essence of Linear Algebra videos: https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
  - 'Computational Linear Algebra for Coders' course by fast.ai: https://github.com/fastai/numerical-linear-algebra/blob/master/README.md
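The key concepts above can be exercised in a few lines of NumPy. The sketch below checks whether a small matrix is symmetric and positive definite (via its eigenvalues) and verifies its inverse by matrix multiplication. The matrix `A` is an arbitrary example chosen for illustration.

```python
import numpy as np

# Arbitrary 2x2 example matrix (made up for illustration).
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

# Symmetric: A equals its transpose.
is_symmetric = np.allclose(A, A.T)

# A symmetric matrix is positive definite iff all eigenvalues are > 0
# (positive semi-definite iff all are >= 0).
# eigvalsh assumes a symmetric input and returns eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(A)
is_positive_definite = bool(is_symmetric and np.all(eigenvalues > 0))

# Matrix inversion, checked by multiplying back to the identity.
A_inv = np.linalg.inv(A)
identity_check = A @ A_inv

print("symmetric:", is_symmetric)
print("eigenvalues:", eigenvalues)
print("positive definite:", is_positive_definite)
print("A @ A_inv:")
print(identity_check)
```

In floating point, eigenvalues that are mathematically zero can come out slightly negative, so a small tolerance is a reasonable choice when testing for positive semi-definiteness.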
Basic supervised machine learning methods
- Key concepts:
  - Linear regression
  - Logistic regression
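For logistic regression, one possible self-check is whether the following sketch reads as familiar. It fits a binary classifier by gradient descent on the average log loss. The helper names, synthetic data, and hyperparameters are assumptions made for illustration, not the course's reference implementation.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression_gd(X, y, learning_rate=0.5, n_iters=1000):
    """Minimize average log loss with batch gradient descent.

    Hypothetical helper for illustration; y must contain 0.0 / 1.0 labels.
    """
    n_examples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)               # predicted p(y = 1 | x)
        grad_w = X.T @ (p - y) / n_examples  # gradient of average log loss
        grad_b = np.mean(p - y)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Synthetic data labeled by a made-up linear rule: y = 1 when x0 + x1 > 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w_hat, b_hat = fit_logistic_regression_gd(X, y)
predictions = sigmoid(X @ w_hat + b_hat) > 0.5
accuracy = np.mean(predictions == (y == 1))
print("training accuracy:", accuracy)
```

Compared with linear regression, only the prediction function (a sigmoid applied to the linear score) and the loss change; the gradient still has the form of features times prediction error.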
Related Courses
Statistical Pattern Recognition (CS 136) at Tufts
Previous offerings:
- 2023 spring, taught by Prof. Mike Hughes
- 2022 spring, taught by Isaac Lage
- 2020 spring, taught by Prof. Mike Hughes
- 2019 spring, taught by Rishit Sheth, Ph.D.
- 2017 fall and earlier, with Prof. Roni Khardon (sorry, website no longer available)
Related courses at other universities
- Probabilistic Learning: Theory and Algorithms at UC-Irvine, taught by Prof. Padhraic Smyth
  - Lots of useful lecture notes!
- Statistical Machine Learning at UMass-Amherst, taught by Prof. Justin Domke
- Foundations of Graphical Models at Columbia, taught by Prof. David Blei
Machine Learning at Tufts
For machine learning research activity at Tufts, see the ML Research Group Website:
For a recent listing of ML courses, see:
For current ML research opportunities for students, see: