- Office hours: Mon 3:00-4:00p and Wed 4:00-5:00p in Halligan 210
- Contact: mhughes(AT)cs.tufts.edu for personal issues only (for almost all questions, use Piazza forums)

- Office hours: TODO

- Piazza Discussion Forum for COMP 136 Spring 2020
- Use discussion forums for any question of general interest!
- Ask clarification questions about any assignment (math homework and coding practicals)

## Course Overview

This course provides the theoretical and computational foundations for **probabilistic** machine learning. The focus is on probabilistic models, which are especially useful for any application where observed data could be noisy, sometimes missing, or not available in large quantities. We emphasize representing uncertainty with formal distributions and trying to average over these distributions when making decisions (as done in the Bayesian approach).

Models studied include: models for discrete data, models for classification and regression, mixture models, models for sequences, and general graphical models.

Algorithms studied include: gradient descent (first-order and second-order), expectation maximization, variational inference, and Markov chain Monte Carlo methods.

## Objectives

After completing this course, students will be able to:

- Demonstrate formal mathematical understanding of probabilistic models
- Given an applied data analysis task, select a relevant probabilistic model and train the model on a relevant dataset
- Analyze scalability considerations of common machine learning algorithms (including runtime and memory)

## Prerequisites

This course intends to provide students a solid foundation in statistical machine learning methods.

To achieve this objective, we expect students to be familiar with the following before taking the course:

- Probability theory
    - e.g. you could explain the difference between a probability density function and a cumulative distribution function

- First-order gradient-based optimization
    - e.g. you could code up a simple gradient descent procedure in Python to find the minimum of functions like f(x) = x^2

- Basic supervised machine learning methods
    - e.g. you could describe the difference between linear regression and logistic regression

- Basic matrix/vector algebra
    - e.g. you could write the closed-form solution of least squares linear regression using basic matrix operations (multiply, inverse)

- Coding in Python with modern open-source data science libraries
    - Basic array operations in numpy (computing inner products, inverting matrices, etc.)
    - Making basic plots or grids of plots in matplotlib
    - Training basic classifiers (like LogisticRegression) in scikit-learn
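As a rough self-check for the gradient descent and least squares items above, the sketch below shows the level of NumPy fluency assumed. The function names, step size, iteration count, and toy data are illustrative choices, not course-provided code.

```python
import numpy as np


def gradient_descent(grad, x0, step_size=0.1, n_iters=100):
    """Minimize a 1-D function by repeatedly stepping along its negative gradient."""
    x = float(x0)
    for _ in range(n_iters):
        x = x - step_size * grad(x)
    return x


def least_squares(X, y):
    """Solve the normal equations (X^T X) w = X^T y for the weight vector w."""
    # Mathematically w = (X^T X)^{-1} X^T y; np.linalg.solve gives the same
    # answer without forming the explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)


# f(x) = x^2 has gradient 2x and its minimum at x = 0.
x_min = gradient_descent(lambda x: 2.0 * x, x0=5.0)

# Toy data generated exactly from y = 1 + 2*x (first column of X is a bias feature).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = least_squares(X, y)
```

If you can read and modify code like this comfortably, the coding prerequisites are probably in place.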

Practically, at Tufts this means having successfully completed at least one of:

- COMP 135 (Introduction to Machine Learning)
- EE 104 (Probabilistic Systems Analysis)
- OR permission of the instructor

With instructor permission, diligent students who are lacking in a few of these areas will hopefully be able to catch up on core concepts via self-study and thus still complete the course effectively. Please see the community-sourced Resources Page for a list of potentially useful resources for self-study.

## Textbook

As a primary textbook, we will use "Pattern Recognition and Machine Learning" by Christopher M. Bishop.

A free PDF is available online from the author: <https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf>

Other suggested resources can be found on the [Resources page].

## What will we do in class?

There will be 5 topical units of the course, each lasting 3+ class meetings as listed on the schedule.

Each class meeting will be a combination of

- Lecture at the whiteboard, demonstrating key concepts as well as math analysis and derivation
- Small individual and group math exercises, to reinforce concepts and skills
- Occasionally, demonstrations of algorithm implementations in Python code

At the end of each topical unit, there will be:

- 1 quizlet, a short 15 min in-class "quiz" to verify your understanding of main concepts (mostly math, occasional programming question)

## What will we do outside of class?

Each of the 5 units will have the following assigned work outside of class:

- 1 written homework (HW), to build math skills (derivation and analysis)
- 1 coding practical (CP), to build implementation skills (auto-graded Python exercises + a short writeup with figures and analysis)

For a complete list of graded assigned work, see the Assignments page.

Additionally, there will be assigned reading from the textbook (and occasionally other high-quality sources) before each and every class meeting. Lecture will *reinforce* and *extend* the reading, but will rely on you doing the reading first. Some ideas from reading not presented in lecture may still appear in quizlets.

### Late work Policy

We want students to develop the skills of planning ahead and delivering work on time. We also want to be able to release solutions quickly and discuss recent work as soon as the next class meeting. With these goals in mind, and with the intention of making all homeworks and practicals due on Wednesday evenings, we have the following policy:

Each student will have 120 total late hours (5 late days) to use throughout the semester across all 5 homeworks and 5 coding practicals.

For each individual assignment (homework or coding practical), you can submit up to 96 hours (4 days) past the posted deadline and still receive full credit. Thus, for an assignment due on Wed 11:59pm, you could submit as late as the following Sunday at 11:59pm, and we could still safely discuss the assignment in class the following Monday.

The timestamp recorded on Gradescope will be official. Late time is rounded up to the nearest hour. For example, if the assignment is due at 3pm and you turn it in at 3:30pm, you have used one whole hour.

Beyond your allowance of 120 late hours, zero credit will be awarded except in cases of unforeseen exceptional circumstances (e.g. family emergency, medical emergency).
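The late-hour arithmetic can be sketched as follows. This is one reading of the policy above, not an official grading script; the function and constant names are hypothetical, and Gradescope's recorded timestamp remains the official record.

```python
import math
from datetime import datetime

TOTAL_LATE_HOURS = 120        # semester-wide allowance across all HWs and CPs
MAX_LATE_HOURS_PER_ITEM = 96  # cap on how late any single assignment can be


def late_hours_used(deadline, submitted):
    """Late time in whole hours, with any partial hour rounded up."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / 3600)


def receives_credit(hours_late, hours_already_used):
    """True if a submission fits both the per-assignment cap and the remaining allowance."""
    within_cap = hours_late <= MAX_LATE_HOURS_PER_ITEM
    within_budget = hours_already_used + hours_late <= TOTAL_LATE_HOURS
    return within_cap and within_budget


# The example from the policy: due at 3:00pm, submitted at 3:30pm (the date is arbitrary).
deadline = datetime(2020, 2, 5, 15, 0)
submitted = datetime(2020, 2, 5, 15, 30)
hours = late_hours_used(deadline, submitted)  # one whole late hour
```

Under this reading, a student who has already used 96 late hours has 24 remaining, so a later assignment submitted 48 hours late would exceed the allowance.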

Students with exceptional circumstances that are documented and approved by their academic dean may meet with the professor to make other arrangements for scheduled homeworks and exams.

## Exams

There will be two formal exams, a midterm and a final exam, each lasting 60+ minutes. See the Schedule for the specific dates, times, and durations.

The exams will test the key concepts covered up to that point in the course, focusing mostly on mathematical analysis skills, with perhaps a computational question or two.

The quizlets should be considered good preparation for the kinds of questions that might appear on an exam. Each quizlet will be officially scheduled on the course website and announced in class at least one week before it occurs.

## Grading

Final grades will be computed based on a numerical score via the following weighted average:

- 10% homeworks (averaged across all HWs)
- 15% coding practicals (averaged across all CPs)
- 30% quizlets (averaged across all units)
- 20% midterm exam score
- 25% final exam score

We will strongly consider dropping the lowest grades.
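As a minimal sketch of the weighted average, assuming each component has already been averaged and scaled to the range 0 to 1: the dictionary keys and the example student are hypothetical, and the sketch ignores any dropped grades, which the policy above leaves open.

```python
# Weights from the grading breakdown above; they sum to 1.0.
WEIGHTS = {
    "homeworks": 0.10,
    "coding_practicals": 0.15,
    "quizlets": 0.30,
    "midterm": 0.20,
    "final": 0.25,
}


def final_score(component_scores):
    """Weighted average of per-component scores, each on a 0-to-1 scale."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical student, for illustration only.
example = {
    "homeworks": 0.95,
    "coding_practicals": 0.90,
    "quizlets": 0.80,
    "midterm": 0.85,
    "final": 0.88,
}
score = final_score(example)  # 0.095 + 0.135 + 0.240 + 0.170 + 0.220, about 0.86
```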

When assigning grades given a final numerical score (from 0 to 1), the following scale will be used:

- 0.93-1.00 : A
- 0.90-0.93 : A-
- 0.87-0.90 : B+
- 0.83-0.87 : B
- 0.80-0.83 : B-
- 0.77-0.80 : C+
- 0.73-0.77 : C
- 0.70-0.73 : C-

This means you must earn at least a 0.83 (not 0.825 or 0.8295) to earn a B. Any rounding up will be at the instructor's discretion, as will the highest possible grade of "A+".
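The scale can be read as a set of lower bounds with no automatic rounding, as in the sketch below. The function name is hypothetical, and the sketch deliberately leaves out discretionary rounding, the "A+" grade, and grades below 0.70, none of which the scale above specifies.

```python
# (lower bound, letter) pairs from the scale above, highest first.
GRADE_CUTOFFS = [
    (0.93, "A"),
    (0.90, "A-"),
    (0.87, "B+"),
    (0.83, "B"),
    (0.80, "B-"),
    (0.77, "C+"),
    (0.73, "C"),
    (0.70, "C-"),
]


def letter_grade(score):
    """Return the letter for a 0-to-1 score, or None if it falls below the listed scale."""
    for lower_bound, letter in GRADE_CUTOFFS:
        if score >= lower_bound:
            return letter
    return None  # the scale above does not list grades below 0.70
```

For example, `letter_grade(0.83)` returns `"B"`, while `letter_grade(0.8295)` returns `"B-"` because no rounding is applied.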

## Collaboration Policy

For homeworks and coding practicals: we encourage you to work actively with a small group of other students, but you must be an active participant (asking questions, contributing ideas) throughout the process.

Along with all submitted work, you must include the names of *any* people you worked with, and in what way you worked with them (discussed ideas, debugged math, team coding). We may occasionally check in with groups to ascertain that everyone in the group was participating in accordance with this policy.

## Academic Integrity Policy

This course will strictly follow the Academic Integrity Policy of Tufts University. Students are expected to finish course work independently when instructed, and to acknowledge all collaborators appropriately when group work is allowed. Submitted work should truthfully represent the time and effort applied.

Please refer to the Academic Integrity Policy at the following URL: <https://students.tufts.edu/student-affairs/student-life-policies/academic-integrity-policy>

## Accessibility

Tufts and the instruction team of COMP 136 strive to create a learning environment that is welcoming to students of all backgrounds.

If you feel unwelcome for any reason, please talk to your instructor so we can work to make things better. If you feel uncomfortable talking to members of the teaching staff, consider reaching out to your academic advisor, the department chair, or your dean.

Please see the detailed accessibility policy at the following URL: <https://students.tufts.edu/student-accessibility-services>