Schedule


Jump to: [Unit 1: Discrete] - [Unit 2: Regression] - [Unit 3: MCMC] - [Unit 4: Mixtures] - [Unit 5: Time Series]

Complete any assigned readings and lecture videos before the start of class on that date.

The schedule may change slightly as the semester goes on; please check here regularly and refresh the page. Look for key announcements on Piazza.

Unit 1: Foundations for Discrete Data

Key ideas: probability fundamentals, maximum likelihood estimation, MAP estimation, Beta-Bernoulli and Dirichlet-multinomial distributions, conjugacy

Models: Dirichlet-multinomial models for text

Math Practice: HW1, which covers joint, conditional, and marginal distributions; ML/MAP point estimation; Bayesian posterior estimation; Beta PDF and Gamma functions; conjugacy

Coding Practice: CP1, which covers text modeling with unigram distributions

Date | Assigned | Do Before Class | Class Content | Optional
Mon 02/01 day01
out:
- HW0
- HW1
- CP1
Readings:
- Bishop PRML Ch. 1 Sec. 1.1, 1.2
--- Focus on 1.2.1, 1.2.2, and 1.2.3
Notes: day01.pdf
Videos:
- day01 part1 : Random Variables and Probability
- day01 part2 : Joint, Conditional, Marginal
- day01 part3 : Sum Rule, Product Rule, Bayes Rule
- day01 part4 : Independence
- day01 part5 : Expectations, Mean, and Variance
Course Overview & Probability Refresher
- Articles motivating the probabilistic ML approach:
Wed 02/03 day02  
Readings:
- Bishop PRML Ch. 1 Sec. 1.2.3
--- Focus on maximum likelihood vs Bayesian approach
- Bishop PRML Ch. 2 Sec. 2.1
--- Focus on ML estimator Eq. 2.5-2.8
Notes: day02.pdf
Videos:
- day02 part1 : Bernoulli Distribution
- day02 part2 : A Spectrum of Models for Many Coin Flips
- day02 part3 : ML Estimation for iid Bernoulli
- day02 part4 : Pros and Cons of Maximum Likelihood
- day02 part5 : Continuous Random Variables, PDFs and CDFs
Maximum Likelihood for Binary Data
- MathForML Ch. 6 Sec. 6.2-6.3
--- Background on probability theory
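Optional illustration of the day02 material (a minimal sketch, not course code; the flip data are made up): for iid Bernoulli observations, the ML estimate of the heads probability is just the sample mean.

```python
import numpy as np

# Made-up coin flips: 1 = heads, 0 = tails
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])

# For iid Bernoulli data the ML estimate is the fraction of heads:
# mu_ML = (number of heads) / (number of flips)
mu_ML = x.mean()
print(f"mu_ML = {mu_ML:.3f}")  # 0.750 for this data
```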
Mon 02/08 day03  
Readings:
- Bishop PRML Ch. 2 Sec. 2.1
--- Focus on the Beta and Bernoulli distributions
Notes: day03.pdf
Videos:
- day03 part1 : Gamma Functions
- day03 part2 : Beta Distributions
- day03 part3 : Beta-Bernoulli model and its posterior
- day03 part4 : MAP Estimation
- day03 part5 : Posterior Predictive Distributions
Beta-Bernoulli models for Binary Data
- Background Knowledge Notebook: day03-GammaFunction.ipynb
- Breakout Exercises Notebook: day03-BetaDistribution.ipynb
 
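Optional illustration of the day03 material (a minimal sketch with made-up data and hyperparameters, not course code): conjugate updating of a Beta prior from Bernoulli observations, with the MAP estimate and the posterior predictive.

```python
import numpy as np
from scipy.stats import beta

a, b = 2.0, 2.0                          # illustrative Beta(a, b) prior
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # made-up flips, 1 = heads
n_H = int(x.sum())
n_T = len(x) - n_H

# Conjugacy: Beta prior + Bernoulli likelihood -> Beta posterior
a_N, b_N = a + n_H, b + n_T

mu_MAP = (a_N - 1) / (a_N + b_N - 2)  # posterior mode (needs a_N, b_N > 1)
p_heads_next = a_N / (a_N + b_N)      # posterior predictive = posterior mean

print(beta(a_N, b_N).mean(), mu_MAP, p_heads_next)
```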
Wed 02/10 day04
due:
- HW0
Readings:
- Bishop PRML Ch. 2 Sec. 2.2
--- Focus on Dirichlet distribution
- Bishop PRML Appendix E Lagrange Multipliers
Videos:
- day04 part1 : From Binary to Categorical Distributions
- day04 part2 : ML Estimation for Categorical Parameters
- day04 part3 : Dirichlet distributions
- day04 part4 : Dirichlet-Categorical model and its posterior
- day04 part5 : MAP Estimation for Dirichlet-Categorical
- day04B (i) : Lagrange multipliers for equality constraints: Recipe and example
- day04B (ii) : Lagrange multipliers for equality constraints: Why it works
Dirichlet-Categorical models for Count Data
- For more on Dirichlet, see Frigyik, Kapila, and Gupta 2010
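Optional illustration of the day04 material (a minimal sketch with made-up counts and hyperparameters, not course code): Dirichlet-Categorical conjugate updating, with ML and MAP point estimates.

```python
import numpy as np

V = 4
alpha = np.full(V, 2.0)          # illustrative symmetric Dirichlet prior
counts = np.array([5, 0, 3, 2])  # made-up category counts

# Conjugacy: Dirichlet prior + categorical counts -> Dirichlet posterior
alpha_N = alpha + counts

# ML: normalized counts (the constrained optimum found via a
# Lagrange multiplier on the constraint sum(pi) = 1)
pi_ML = counts / counts.sum()

# MAP: mode of the Dirichlet posterior (valid when all alpha_N > 1)
pi_MAP = (alpha_N - 1) / (alpha_N.sum() - V)

print(pi_ML, pi_MAP)
```
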

Unit 2: Multivariate Gaussians and Regression

Key ideas: multivariate Gaussian distributions, model selection, Laplace approximation

Models: Bayesian linear regression, Bayesian logistic regression, generalized linear models

Algorithms: gradient descent, methods for model selection

Math Practice: HW2

Coding Practice: CP2

Date | Assigned | Do Before Class | Class Content | Optional
Tue 02/16 (Monday schedule on a Tuesday) day05  
Readings:
- Bishop PRML Ch. 1 Sec. 1.2.4
--- Focus on univariate Gaussians
--- ML estimators of mean and variance
- Bishop PRML Ch. 2 Sec. 2.3.1-2.3.2
--- Skim multivariate Gaussian properties
Notes: day05.pdf
Videos:
- day05 part1 : Univariate Gaussian distribution
- day05 part2 : ML Estimation for Gaussians
- day05 part3 : Biased vs Unbiased Estimators
- day05 part4 : Special Properties of Gaussians
- day05 part5 : Covariance and Covariance Matrices
Univariate Gaussians
 
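Optional illustration of the day05 material (a minimal sketch on synthetic data, not course code): ML estimates of a Gaussian's mean and variance, and the biased vs. unbiased variance estimators.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=20)  # toy Gaussian sample

mu_ML = x.mean()              # ML estimate of the mean (unbiased)
var_ML = x.var(ddof=0)        # ML variance: divides by N (biased low)
var_unbiased = x.var(ddof=1)  # divides by N - 1 (unbiased)

print(mu_ML, var_ML, var_unbiased)
```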
Wed 02/17 day06
due:
- HW1
out:
- HW2
Readings:
- Bishop PRML Ch. 2 Sec. 2.3.1-2.3.5
--- Focus on multivariate Gaussian and its properties
- Bishop PRML Ch. 2 Sec. 2.3.5-2.3.6
--- Skim for intuition
Notes: day06.pdf
Videos:
- day06 part1 : Multivariate Gaussian distribution
- day06 part2 : Covariance Properties; Why Contours are Elliptical
- day06 part3 : Marginals of Gaussians are Gaussian
- day06 part4 : Conditionals of Gaussians are Gaussian
- day06 part5 : Linear-Gaussian models are Joint Gaussian
Multivariate Gaussians
- Immersive Linear Algebra: Determinants
- Immersive Linear Algebra: Eigenvalues and eigenvectors
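Optional illustration of the day06 material (a minimal 2-D sketch with an illustrative covariance, not course code): conditionals of a joint Gaussian are Gaussian, with closed-form mean and variance.

```python
import numpy as np

# Illustrative 2-D joint Gaussian over (x1, x2)
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

x2 = 2.0  # condition on an observed value of x2
mu1, mu2 = mu
s11, s12, s22 = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]

# Conditionals of Gaussians are Gaussian (Bishop Sec. 2.3.1):
# p(x1 | x2) = N(mu1 + s12/s22 * (x2 - mu2),  s11 - s12^2/s22)
mu_cond = mu1 + (s12 / s22) * (x2 - mu2)
var_cond = s11 - s12**2 / s22

print(mu_cond, var_cond)
```
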
Mon 02/22 day07
out:
- CP2
Readings:
- Bishop PRML Ch. 3 Sec. 3.1
--- Focus on ML estimator for linear regression
- Bishop PRML Ch. 3 Sec. 3.3
--- Focus on posterior distribution
Notes: day07.pdf
Videos:
- day07 part1 : Probabilistic view of linear regression
- day07 part2 : ML estimation of weights and precision
- day07 part3 : Posterior over weights
- day07 part4 : MAP estimation of weights
Bayesian Linear Regression 1/2
- Bishop PRML Ch. 3 Sec. 3.2
--- Bias/Variance tradeoff
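Optional illustration of the day07 material (a minimal sketch on made-up data, with illustrative precisions alpha and beta; not course code): the closed-form Gaussian posterior over linear regression weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up 1-D data from a line plus noise, with features [1, x]
x = rng.uniform(-1, 1, size=25)
t = 0.5 * x - 0.3 + rng.normal(scale=0.2, size=25)
Phi = np.stack([np.ones_like(x), x], axis=1)

alpha, beta = 2.0, 25.0  # illustrative prior precision and noise precision

# Posterior over weights (Bishop Eqs. 3.53-3.54):
#   S_N^{-1} = alpha*I + beta*Phi^T Phi,   m_N = beta * S_N Phi^T t
S_N = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t

print(m_N)  # posterior mean = MAP estimate of the weights
```
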
Wed 02/24 day08
due:
- CP1
- Quiz1 (out Thu, due Fri)
Readings:
- Bishop PRML Ch. 3 Sec. 3.3
--- Focus on posterior predictive distribution
- Bishop PRML Ch. 3 Sec. 3.4 and 3.5
--- Focus on model selection and hyperparameter estimation
Notes: day08.pdf
Videos:
- day08 part1 : Posterior predictive
- day08 part2 : Evidence
- day08 part3 : Model selection
- day08 part4 : Hyperparameter estimation
Bayesian Linear Regression: Prediction and Model Selection
 
Mon 03/01 day09  
Readings:
- Bishop PRML Ch. 4 Sec. 4.3
--- Focus on linear models for binary and multi-class classification
Notes: day09.pdf
Videos:
- day09 part1 : Discriminative vs Generative models
- day09 part2 : Generalized linear models
- day09 part3 : Sigmoid function
- day09 part4 : Probabilistic Logistic Regression: ML and MAP strategies
- day09 part5 : 2nd-order gradient methods for Linear + Logistic Regression
Bayesian Generalized Linear Models for Classification and Beyond
- Bishop PRML Ch. 4 Sec. 4.2
--- Skim to understand generative classification
Wed 03/03 day10
due:
- HW2
Readings:
- Bishop PRML Ch. 4 Sec. 4.4
--- Try to understand the Laplace approximation
- Bishop PRML Ch. 4 Sec. 4.5
--- Bayesian Logistic Regression
Notes: day10.pdf
Videos:
- day10 part1 : Bayesian Logistic Regression overview
- day10 part2 : Laplace approximation in 1-dim and M-dims
- day10 part3 : Laplace approx. posterior for Logistic Regression
- day10 part4 : Predictive posteriors for Bayesian Logistic Regression
Posterior Estimation and Prediction for Bayesian Generalized Linear Models
- Highlights: day10 [slides]
 
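Optional illustration of the day10 material (a minimal sketch on synthetic data with an assumed prior precision alpha; not course code): the Laplace approximation for Bayesian logistic regression, a Gaussian centered at the MAP weights with covariance given by the inverse Hessian of the negative log posterior.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy binary classification data
X = rng.normal(size=(40, 2))
y = (X @ np.array([1.5, -1.0]) + 0.3 * rng.normal(size=40) > 0).astype(float)

alpha = 1.0  # assumed Gaussian prior precision on the weights

def neg_log_post(w):
    # negative log posterior: logistic log-likelihood + Gaussian prior
    logits = X @ w
    ll = np.sum(y * logits - np.logaddexp(0.0, logits))
    return -(ll - 0.5 * alpha * w @ w)

# 1) Find the MAP weights
w_map = minimize(neg_log_post, np.zeros(2)).x

# 2) Laplace approximation: Gaussian N(w_map, S), where S is the inverse
#    Hessian of the negative log posterior at w_map
p = 1.0 / (1.0 + np.exp(-(X @ w_map)))
H = alpha * np.eye(2) + X.T @ ((p * (1 - p))[:, None] * X)
S = np.linalg.inv(H)

print(w_map, S)
```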

Unit 3: Sampling and Markov Chain Monte Carlo

Key ideas: Markov chains, reversibility, ergodicity, detailed balance, probabilistic programming

Models: Bayesian logistic regression, general directed graphical models

Algorithms: Metropolis-Hastings algorithm, Gibbs sampling

Math Practice: HW3

Coding Practice: CP3

Date | Assigned | Do Before Class | Class Content | Optional
Mon 03/08 day11  
Readings:
- Bishop PRML Ch. 11 Sec. 11.1
--- Focus on basic methods using transformed uniform r.v. in 11.1.1
--- Skim rejection sampling in 11.1.2
--- Skim importance sampling in 11.1.4
- Bishop PRML Ch. 11 Sec. 11.2
--- Focus on the overview section and 11.2.1 Markov chains
Notes: day11.pdf
Videos:
- day11 part1 : Monte Carlo estimates of expectations
- day11 part2 : Directed graphical models & ancestral sampling
- day11 part3 : MCMC Intro, Stationary Distributions, Ergodicity
- day11 part4 : Sampling via Inverting the CDF
- day11 part5 : Transformations of Sampled Variables
Sampling Methods and Markov Chain Monte Carlo
- Highlights: day11 [slides]
 
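Optional illustration of the day11 material (a minimal sketch, not course code): sampling via inverting the CDF, then a Monte Carlo estimate of an expectation from those samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sampling via inverting the CDF: for Exponential(rate),
# F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -log(1 - u) / rate
rate = 2.0
u = rng.uniform(size=100_000)
x = -np.log(1.0 - u) / rate

# Monte Carlo estimate of an expectation: E[x] should be ~ 1/rate = 0.5
print(x.mean())
```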
Wed 03/10 day12
due:
- CP2
- Quiz2 (out Thu, due Fri)
Readings:
- Bishop PRML Ch. 11 Sec. 11.2.2
--- Focus on Metropolis-Hastings
Notes: day12.pdf
Videos:
- day12 part1 : Markov Transitions that Propose then Accept/Reject
- day12 part2 : Random walk proposals
- day12 part3 : Detailed Balance, Proof of Random Walk's Stationary Distribution
- day12 part4 : Metropolis and Metropolis-Hastings algorithms
Random Walk Proposals and Metropolis-Hastings Algorithm
- Highlights: day12 [slides]
 
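Optional illustration of the day12 material (a minimal 1-D sketch with an illustrative target density; not course code): random-walk Metropolis. Because the Gaussian proposal is symmetric, the Hastings correction cancels and we accept with probability min(1, p(z')/p(z)).

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(z):
    # Unnormalized log density of an illustrative target: N(3, 0.5^2)
    return -0.5 * ((z - 3.0) / 0.5) ** 2

z, step, samples = 0.0, 0.5, []
for _ in range(10_000):
    z_prop = z + step * rng.normal()  # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_target(z_prop) - log_target(z):
        z = z_prop                    # accept; otherwise keep current state
    samples.append(z)

print(np.mean(samples[1000:]))  # ~3.0 after burn-in
```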
Mon 03/15 day13 out: Midterm (take home)
Videos:
-
Midterm Review
 
Wed 03/17 day14  
Readings:
- Bishop PRML Ch. 11 Sec. 11.3 'Gibbs sampling'
Notes: day14.pdf
Videos:
- day14 part1 : Overview of Sampling Vector Random Variables
- day14 part2 : Pro/con comparison of Gibbs vs. Random Walk
- day14 part3 : Gibbs Sampling Algorithm
- day14 part4 : Proof sketch of Gibbs Sampling correctness
Gibbs Sampling
 
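Optional illustration of the day14 material (a minimal sketch for an illustrative bivariate Gaussian; not course code): Gibbs sampling alternates draws from each full conditional. With zero means, unit variances, and correlation rho, each conditional is N(rho * other, 1 - rho^2).

```python
import numpy as np

rng = np.random.default_rng(0)

rho = 0.8
x1, x2, samples = 0.0, 0.0, []
for _ in range(10_000):
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))  # sample x1 | x2
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))  # sample x2 | x1
    samples.append((x1, x2))

print(np.corrcoef(np.array(samples[1000:]).T)[0, 1])  # ~0.8
```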
Mon 03/22 day15 due: Midterm
Readings:
- Betancourt 2017: skim Sec. 1-2, read Sec. 3
--- Get intuition for MCMC from Fig 3 and Fig 7
--- Try to understand why Hamiltonian MCMC explores better (see Fig 11)
Notes: None for today.
Videos:
--- Watch the first 42 min (can ignore Q&A at end)
Hamiltonian Monte Carlo
Wed 03/24 day16
due:
- HW3
Readings:
--- Read through Case Study 1
Notes: None for today.
Videos:
--- Watch the first ~40 min (can ignore Q&A at end)
Probabilistic Programming
--- Skim, focus on examples

Unit 4: Clustering, Mixture Models, and Expectation-Maximization

Key ideas: coordinate ascent optimization, local optima, expectations

Models: Mixture models

Algorithms: k-means, expectation maximization

Math Practice: HW4

Coding Practice: CP4

Date | Assigned | Do Before Class | Class Content | Optional
Mon 03/29 day17  
Readings:
Notes: day17.pdf
Videos:
- day17 part1 : What is clustering?
- day17 part2 : K-means problem and cost function
- day17 part3 : K-means algorithm
- day17 part4 : Guarantees, convergence, and local optima
- day17 part5 : How to pick K?
K-Means Clustering
 
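Optional illustration of the day17 material (a minimal sketch on synthetic blobs; not course code): K-means alternates an assignment step (nearest center) with an update step (mean of assigned points), monotonically decreasing the cost until it reaches a local optimum.

```python
import numpy as np

def kmeans(X, K, n_iters=50, seed=0):
    """Minimal K-means sketch: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]  # init centers at data points
    for _ in range(n_iters):
        # Assignment step: nearest center for each point
        z = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2).argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points
        for k in range(K):
            if np.any(z == k):
                mu[k] = X[z == k].mean(axis=0)
    return mu, z

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in ([0, 0], [3, 3])])
mu, z = kmeans(X, K=2)
print(mu)  # should recover centers near (0, 0) and (3, 3)
```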
Wed 03/31 day18
due:
- CP3
- Quiz3 (out Thu, due Fri)
Readings:
- Bishop PRML Ch. 9 Sec. 9.2 - 9.2.1
--- GMMs in depth
- Bishop PRML Ch. 2 Sec. 2.3.9
--- Motivation for Gaussian mixtures
Notes: day18.pdf
Videos:
- day18 part1 : Why mixture models?
- day18 part2 : Gaussian mixture model (Two views with and without assignment variables)
- day18 part3 : Computing the Posterior over Assignments
- day18 part4 : Estimating Parameters via Maximum Likelihood (plus logsumexp trick)
- day18 part5 : Problems with ML Estimation for GMMs
Gaussian Mixture Models and ML Estimation
- Breakout: Get started on HW4 Prob 2 and 3
 
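Optional illustration of the day18 material (a minimal sketch with illustrative GMM parameters; not course code): computing the posterior over assignments (responsibilities) in log space, using the logsumexp trick for numerical stability.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

# Illustrative 2-component GMM parameters
pis = np.array([0.6, 0.4])
mus = [np.zeros(2), np.array([3.0, 3.0])]
Sigmas = [np.eye(2), np.eye(2)]

x = np.array([2.5, 2.0])  # a single data point

# log p(z = k | x) = log pi_k + log N(x | mu_k, Sigma_k) - logsumexp(...)
log_terms = np.array([np.log(pis[k])
                      + multivariate_normal(mus[k], Sigmas[k]).logpdf(x)
                      for k in range(2)])
log_resp = log_terms - logsumexp(log_terms)
print(np.exp(log_resp))  # responsibilities; sums to 1
```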
Mon 04/05 day19  
Readings:
- Bishop PRML Ch. 9 Sec. 9.2.2, 9.3.1, and 9.3.2
Notes: day19.pdf
Videos:
- day19 part1 : Penalized ML Optimization Problem for GMMs
- day19 part2 : Gradient Descent for GMMs
- day19 part3 : Derivation of Coordinate Descent for GMMs
- day19 part4 : The EM Coordinate Descent Algorithm for GMMs
ML Estimation with GMMs: Expectation Maximization and Gradient Descent
- Breakout: Get started on CP4 Prob 1
 
Wed 04/07 day20
due:
- HW4
Readings:
- Bishop PRML Ch. 9 Sec. 9.3 and 9.4
Notes: day20.pdf
Videos:
- day20 part1 : Recap of GMMs
- day20 part2 : GMMs with Latent Assignments Z
- day20 part3 : Expectations of Complete Likelihood
- day20 part4 : Lower Bound of Incomplete Likelihood
- day20 part5 : EM as Coordinate Ascent on Lower Bound Objective
Expectation Maximization for GMMs
- Breakout: Get started on CP4 Prob 2
- Background on entropy: Sec. 1.6 of Bishop PRML Ch. 1
- Background on KL divergence: Sec. 1.6.1 of Bishop PRML Ch. 1
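Optional illustration of the day19/day20 material (a minimal sketch on synthetic blobs, with no covariance regularization; not course code): the full EM loop for a GMM, alternating the E-step (responsibilities) with the M-step (weighted ML updates).

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal as mvn

def em_gmm(X, K, n_iters=50, seed=0):
    """Sketch of EM for a GMM (illustrative; no regularization)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pis = np.full(K, 1.0 / K)
    mus = X[rng.choice(N, K, replace=False)].copy()
    Sigmas = np.stack([np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E-step: responsibilities r[n, k] = p(z_n = k | x_n)
        log_r = np.stack([np.log(pis[k]) + mvn(mus[k], Sigmas[k]).logpdf(X)
                          for k in range(K)], axis=1)
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)
        # M-step: weighted ML updates of weights, means, covariances
        Nk = r.sum(axis=0)
        pis = Nk / N
        mus = (r.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (r[:, k, None] * diff).T @ diff / Nk[k]
    return pis, mus, Sigmas

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.4, size=(60, 2)) for c in ([0, 0], [3, 3])])
pis, mus, Sigmas = em_gmm(X, K=2)
print(pis, mus)
```
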

Unit 5: Hidden Markov Models for Time Series

Key ideas: dynamic programming, joint MAP vs. marginal MAP

Models: Hidden Markov models

Algorithms: forward-backward algorithm, Viterbi algorithm, variational inference, belief propagation

Math Practice: HW5

Coding Practice: CP5

Date | Assigned | Do Before Class | Class Content | Optional
Mon 04/12 day21  
Readings:
- Bishop PRML Ch. 13 Sec. 13.1 'Markov models'
- Bishop PRML Ch. 13 Sec. 13.2 'Hidden Markov models'
--- Only the intro before 13.2.1
Notes: day21.pdf
Videos:
- day21 part1 : Unit 5 Motivation: Dependencies in Sequential Data
- day21 part2 : The Markov assumption
- day21 part3 : Markov models for discrete sequences
- day21 part4 : Hidden Markov models
Markov Models and Hidden Markov Models
 
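Optional illustration of the day21 material (a minimal sketch with made-up transition probabilities; not course code): ancestral sampling from a first-order Markov chain, where each state depends only on the previous one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative first-order Markov chain over 3 discrete states
pi0 = np.array([0.5, 0.3, 0.2])  # initial distribution
A = np.array([[0.8, 0.1, 0.1],   # A[i, j] = p(z_t = j | z_{t-1} = i)
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])

# Ancestral sampling: draw z_1 from pi0, then each z_t from its row of A
T = 10
z = [rng.choice(3, p=pi0)]
for t in range(1, T):
    z.append(rng.choice(3, p=A[z[-1]]))
print(z)
```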
Wed 04/14 day22
due:
- CP4
- Quiz4 CANCELLED
Readings:
- Bishop PRML Ch. 13 Sec. 13.2.1 'Maximum likelihood for the HMM'
- Bishop PRML Ch. 13 Sec. 13.2.2 'Forward-backward algorithm'
- Bishop PRML Ch. 13 Sec. 13.2.4 'Scaling factors'
--- Think about numerical stability of calculations
Notes: day22.pdf
Videos:
- day22 part1 : EM for HMM parameter estimation
- day22 part2 : Expected log likelihood for HMMs
- day22 part3 : M-step for HMMs
- day22 part4 : E-step for HMMs: overview
- day22 part5 : E-step for HMMs: forward algorithm and backward algorithm
Expectation-Maximization for HMMs
- Breakout: Get started on HW5
 
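Optional illustration of the day22 material (a minimal sketch with made-up HMM parameters; not course code): the forward algorithm with per-step scaling factors, which keeps the messages in a numerically stable range while still recovering the log marginal likelihood.

```python
import numpy as np

def forward_scaled(pi0, A, lik):
    """Scaled forward algorithm sketch for an HMM.

    pi0: (K,) initial state distribution
    A:   (K, K) transitions, A[i, j] = p(z_t = j | z_{t-1} = i)
    lik: (T, K) emission likelihoods p(x_t | z_t = k)
    Returns the scaled forward messages and log p(x_1, ..., x_T).
    """
    T, K = lik.shape
    alphas = np.zeros((T, K))
    log_Z = 0.0
    for t in range(T):
        a = pi0 * lik[0] if t == 0 else (alphas[t - 1] @ A) * lik[t]
        c = a.sum()         # per-step scaling factor (Bishop Sec. 13.2.4)
        alphas[t] = a / c   # normalizing avoids numerical underflow
        log_Z += np.log(c)  # log-likelihood accumulates the log scales
    return alphas, log_Z

# Tiny made-up example: 2 states, 3 time steps
pi0 = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
lik = np.array([[0.5, 0.1], [0.4, 0.2], [0.1, 0.6]])
alphas, log_Z = forward_scaled(pi0, A, lik)
print(log_Z)
```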
Mon 04/19     NO CLASS (Patriots' Day)  
Wed 04/21 day23
due:
- HW5
Readings:
- Bishop PRML Ch. 13 Sec. 13.2.5 'The Viterbi Algorithm'
- Bishop PRML Ch. 13 Sec. 13.2.6 'Extensions to the HMM'
Notes: day23.pdf
Videos:
- day23 part1 : Motivation for inferring hidden states
- day23 part2 : Most likely hidden sequence problem
- day23 part3 : Intuition behind recursive solution
- day23 part4 : Viterbi algorithm
Viterbi for HMMs
- Breakout: Get started on CP5
 
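Optional illustration of the day23 material (a minimal sketch reusing made-up HMM parameters; not course code): the Viterbi algorithm, a dynamic program that tracks the best log-probability of any path ending in each state, then backtracks to recover the most likely hidden sequence (joint MAP, as opposed to per-step marginal MAP).

```python
import numpy as np

def viterbi(log_pi0, log_A, log_lik):
    """Viterbi sketch: most likely hidden state sequence for an HMM.

    log_pi0: (K,) log initial probabilities
    log_A:   (K, K) log transition probabilities
    log_lik: (T, K) log emission likelihoods
    """
    T, K = log_lik.shape
    delta = log_pi0 + log_lik[0]  # best log-prob of paths ending in each state
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A  # (prev state, next state)
        back[t] = scores.argmax(axis=0)  # best predecessor for each state
        delta = scores.max(axis=0) + log_lik[t]
    # Backtrack from the best final state
    z = np.zeros(T, dtype=int)
    z[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):
        z[t] = back[t + 1, z[t + 1]]
    return z

# Tiny made-up example: 2 states, 3 time steps
pi0 = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
lik = np.array([[0.5, 0.1], [0.4, 0.2], [0.1, 0.6]])
print(viterbi(np.log(pi0), np.log(A), np.log(lik)))
```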
Mon 04/26 day24  
Readings:
- Read Sec. 10.1 of Bishop PRML Ch. 10
- Skim Sec. 10.2 of Bishop PRML Ch. 10
Notes: day24.pdf
Videos:
- day24 part1 : Variational methods
- day24 part2 : Possible optimization strategies for probabilistic models: MM, EM, ME, EE
- day24 part3 : Case study for GMMs: EM and EE side-by-side
- day24 part4 : Choosing the family of approximate posteriors
Variational Methods
Wed 04/28 day25
due:
- CP5
Readings:
- 'Automatic Variational Inference in Stan': Kucukelbir et al. NeurIPS '15
Notes: None
Videos:
--- Watch to gain high-level appreciation of variational methods
Frontiers of probabilistic modeling
Mon 05/03 day26
due:
- Quiz5 (out Tue, due Wed)
Notes: None
Final Exam Review
 
FINALS WEEK
due:
- Exam due by Thu 05/13 end-of-day
FINAL EXAM