Software
For this course, we require programming assignments to be implemented in Python 3.6+. Using a consistent language allows us to talk about implementation details in class and makes grading solutions more consistent and time-efficient.
Students are responsible for maintaining their own software environment on their personal computer. We highly suggest that you consider the free 'conda' environment manager from Anaconda, Inc.: https://conda.io/docs/user-guide/getting-started.html
For detailed instructions, see the Python Setup Instructions page.
All starter code is available in our public GitHub repository: https://github.com/tufts-ml-courses/comp136-21s-assignments
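To confirm that your environment meets the Python 3.6+ requirement, a quick check like the one below may help. This is a minimal sketch: the helper name `meets_requirement` and the constant `REQUIRED` are illustrative and are not part of the course starter code.

```python
import sys

REQUIRED = (3, 6)  # minimum (major, minor) version stated for this course


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version is at least the required one."""
    return tuple(version_info[:2]) >= required


# Plain string concatenation (no f-strings) so the script also runs on older Pythons.
current = sys.version.split()[0]
if meets_requirement():
    print("OK: running Python " + current)
else:
    print("Too old: running Python " + current + ", need 3.6 or newer")
```

Run this with the same interpreter you plan to use for assignments, for example after activating your conda environment.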
Jupyter Notebooks
For many in-class breakout sessions, you'll work through a provided notebook, distributed as a `.ipynb` file from our course starter code repository on GitHub.
You'll want to download this file and run it on your machine.
To launch a specific notebook file named `MyNotebook.ipynb`, here's what you'll do in your Terminal (Linux/Mac) or Command Prompt (Windows):
# Before we can start, be sure your current directory contains `MyNotebook.ipynb`
# First, activate our course conda environment
$ conda activate spr_2021s_env
# Second, launch the notebook server and direct it to open `MyNotebook.ipynb`
$ jupyter notebook MyNotebook.ipynb
# This should automatically open a browser and take you to an interactive notebook session. If it does not, open the `localhost:8888` link printed in your terminal.
For more help on launching a notebook, see the Jupyter Notebook documentation.
Jupyter Resources
If you don't know much about Jupyter, the resources below may be helpful:
How to download a Jupyter notebook and open it in your browser
'Play with Data in Jupyter' lessons by Lorena Barba
Python Resources
To gain some fundamental Python skills (assuming you already know another programming language), we recommend:
Related Courses
Statistical Pattern Recognition (COMP 136) at Tufts
Previous offerings:
- 2020 fall, taught by Prof. Mike Hughes
- 2019 spring, taught by Rishit Sheth, Ph.D.
- 2017 fall and earlier, with Prof. Roni Khardon (sorry, website no longer available)
Related courses at other universities
- Probabilistic Learning: Theory and Algorithms at UC-Irvine, taught by Prof. Padhraic Smyth
  - Lots of useful lecture notes!
- Statistical Machine Learning at UMass-Amherst, taught by Prof. Justin Domke
- Foundations of Graphical Models at Columbia, taught by Prof. David Blei
Machine Learning at Tufts
For machine learning research activity at Tufts, see the ML Research Group Website:
For a recent listing of ML courses, see:
For current ML research opportunities for students, see:
Self-Study Resources
Here are some useful resources to help you catch up if you are missing some of the prerequisite knowledge. Please contribute new resources by starting a topic on the class discussion forum.
Probability
- Key concepts:
  - Sum rule and product rule of probability
  - Bayes' theorem and associated algebra
  - Continuous and discrete random variables
- Litmus test:
  - Do the lecture notes from day01.pdf seem familiar? Do you have enough previous experience and mathematical sophistication to follow them easily?
- Possible resources:
  - Probability review notes from Prof. David Blei: http://www.cs.columbia.edu/~blei/fogm/2016F/doc/probability_review.pdf
  - Stanford CS229 notes on Gaussian distributions: http://cs229.stanford.edu/section/gaussians.pdf
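As a quick self-check on the sum rule, product rule, and Bayes' theorem listed above, here is a minimal sketch using only the Python standard library. The diagnostic-test scenario, the numbers, and all variable names are made up for illustration and do not come from the course materials.

```python
# Hypothetical numbers for a diagnostic test (illustrative only).
p_disease = 0.01            # prior: p(D=1)
p_pos_given_disease = 0.95  # likelihood: p(T=1 | D=1)
p_pos_given_healthy = 0.05  # false positive rate: p(T=1 | D=0)

# Product rule: p(T=1, D=d) = p(T=1 | D=d) * p(D=d)
p_pos_and_disease = p_pos_given_disease * p_disease
p_pos_and_healthy = p_pos_given_healthy * (1.0 - p_disease)

# Sum rule: p(T=1) = sum over d of p(T=1, D=d)
p_pos = p_pos_and_disease + p_pos_and_healthy

# Bayes' theorem: p(D=1 | T=1) = p(T=1, D=1) / p(T=1)
p_disease_given_pos = p_pos_and_disease / p_pos

print(f"p(T=1)       = {p_pos:.4f}")
print(f"p(D=1 | T=1) = {p_disease_given_pos:.4f}")
```

With these assumed numbers, the posterior probability of disease given a positive test is roughly 0.16, far below the 0.95 likelihood, because the prior is small. If each step here feels routine, you likely have the probability background you need.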
First-order gradient-based optimization
- Key concepts:
  - Gradient descent
  - Learning rates
  - Difference between convex and non-convex functions for minimization
- Litmus test:
  - Could you fit a linear regression model via gradient descent? (see notebook below).
- Possible resources:
  - Convex Optimization overview for Stanford CS229: http://cs229.stanford.edu/section/cs229-cvxopt.pdf
  - Jupyter notebook on 'Linear Regression with NumPy' (fits linear model with gradient descent): https://www.cs.toronto.edu/~frossard/post/linear_regression/
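To make the litmus test above concrete, here is a minimal sketch of fitting a one-feature linear regression by gradient descent on mean squared error. It uses only the standard library so it runs without extra packages. The function name `fit_linear_regression_gd`, the toy data, and the default learning rate and step count are illustrative assumptions, not taken from the linked notebook.

```python
def fit_linear_regression_gd(xs, ys, learning_rate=0.05, n_steps=2000):
    """Fit y ~ w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Residuals of the current predictions.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(residual^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_b = (2.0 / n) * sum(residuals)
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated exactly from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = fit_linear_regression_gd(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Because this objective is convex, gradient descent reaches the global minimum when the learning rate is small enough. Try raising `learning_rate` to about 0.2 on this data to see the iterates diverge instead.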
Linear algebra
- Key concepts:
  - matrix multiplication
  - matrix inversion
  - positive (semi) definite matrices
  - symmetric matrices
- Quick review fact sheets:
  - Fact-sheet on positive definite matrices: https://sites.calvin.edu/scofield/courses/m355/handouts/definiteMatrices.pdf
  - Review sheet from Stanford's CS 229 Intro ML course: http://cs229.stanford.edu/summer2020/cs229-linalg.pdf
- Possible textbook-like resources:
  - Goodfellow et al's chapter on Linear Algebra: http://www.deeplearningbook.org/contents/linear_algebra.html
  - Immersive Linear Algebra: http://immersivemath.com/ila/
  - Essence of Linear Algebra videos: https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
  - 'Computational Linear Algebra for Coders' course by fast.ai: https://github.com/fastai/numerical-linear-algebra/blob/master/README.md
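One way to connect the symmetric and positive definite concepts above: a symmetric matrix is positive definite exactly when it has a Cholesky factorization with strictly positive diagonal entries. The sketch below checks this in plain Python on small matrices. The function names and example matrices are illustrative assumptions; in practice you would likely call a numerical library instead.

```python
import math


def is_symmetric(A, tol=1e-12):
    """Return True if the square matrix A (list of lists) equals its transpose."""
    n = len(A)
    return all(abs(A[i][j] - A[j][i]) <= tol for i in range(n) for j in range(n))


def is_positive_definite(A):
    """Return True if symmetric A admits a Cholesky factor with positive diagonal."""
    if not is_symmetric(A):
        return False
    n = len(A)
    L = [[0.0] * n for _ in range(n)]  # lower-triangular factor, A = L L^T
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = A[i][i] - s
                if pivot <= 0.0:
                    # A non-positive pivot means no valid factor exists.
                    return False
                L[i][i] = math.sqrt(pivot)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return True


print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))  # eigenvalues 1 and 3
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # eigenvalues 3 and -1
print(is_positive_definite([[1.0, 1.0], [1.0, 1.0]]))  # eigenvalues 2 and 0
```

The third matrix is positive semidefinite but not positive definite, so the check rejects it. With floating-point inputs, a pivot very close to zero is ambiguous, so treat borderline results with care.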
Basic supervised machine learning methods
- Key concepts:
  - Linear regression
  - Logistic regression