Comp 236: Computational Learning Theory
Department of Computer Science
Tufts University
Spring 2013

Announcement(s):
  • (3/11) Class canceled on Tuesday 3/26.
  • (3/4) Error in assignment 2, question 2: one less-than-or-equal sign was reversed in the original version. The corrected inequality appears in the updated assignment writeup linked below.

Syllabus: Probabilistic and adversarial models of machine learning. Development and analysis of machine learning principles and algorithms, their computational complexity, data complexity and convergence properties. Computational and cryptographic limitations on algorithms for machine learning. Core results and recent developments in this field.
Prerequisites: COMP 160; EE 104 or MATH 162; COMP 170 recommended but not required. Or permission of instructor.
Times and Location: Tuesday/Thursday 10:30-11:45, Halligan Hall 106
Instructor: Roni Khardon
Office: Halligan 230
Office Hours: Mon 4-5 or by arrangement
Phone: 1-617-627-5290
Email: roni@cs.tufts.edu

What is this course about? Machine learning algorithms aim to "make sense of data", often developing generalizations that are useful for future tasks, for example in classification, prediction, and control. Computational Learning Theory investigates such tasks and algorithms in a formal framework. The focus is on identifying performance guarantees for algorithms in terms of computational complexity (how long the algorithm takes to run), data complexity (how much data is required), and convergence properties (how well the output of the learning algorithm performs). Alternatively, for some machine learning problems we seek lower bounds on the resources required by any potential algorithm. Models range from adversarial worst-case scenarios, to statistical settings where a random process generates the data, to interactive settings where the learner can control the flow of data. The course will mix taught classes with seminar-style reading of research papers on recent topics.
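To illustrate the kind of guarantee studied here (a standard textbook bound, stated informally and not tied to any particular assignment): in the realizable PAC setting with a finite hypothesis class H, any algorithm that outputs a hypothesis consistent with m ≥ (1/ε)(ln|H| + ln(1/δ)) random examples has error at most ε with probability at least 1−δ. In other words, the data complexity grows only logarithmically with |H| and 1/δ.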
This semester: In addition to a review of core topics, this semester we will explore online convex optimization, bandit problems, PAC-Bayes theorems, learning theory for pattern recognition models (GMM, HMM), matrix completion problems, and maybe other topics.
For reference, see the details from the 2008 offering.

Activities & Grading

The course will mix taught classes with seminar-style reading of research papers. Course work will include writing up official lecture notes, reading and presenting papers, a small number of homework assignments, and an in-class exam on Thursday 4/25. Details are as follows:

Policy on collaboration on homework assignments: You may discuss the problems and general ideas about their solutions with other students, and similarly you may consult other textbooks or the web. However, you must work out the details on your own and write up the solutions on your own. Every such collaboration (either getting or giving help) and every use of text or electronic sources must be clearly cited and acknowledged in the submitted homework. Failure to follow these guidelines will result in disciplinary action for all parties involved. Any questions? For this and other issues concerning academic integrity, please consult the booklet available from the Office of the Dean of Student Affairs.

Texts/Reference

We will sample from different sources, including some of the texts below (on reserve in the library) and research and survey articles.

Papers for Presentations and Discussion.

TBA. Some links: COLT 2012, ALT 2012, NIPS 2012

Schedule and Assignments

Type What Due Date
Lecture 1 Introduction to CLT  
Reading Sections 1.1-1.3 of K&V  
Lecture 2 PAC Learning Model; learning conjunctions  
Lecture 3 Tools for Probabilistic Analysis  
Reading Chapter 9 of K&V  
Lecture 4 Variations of PAC learning  
Reading Sections 4.1-2 of K&V (Optional: skim through 4.3.1-4.3.2)  
Assignment 1 Assignment 1 Feb 7
Reading Sections 1.4-1.5 of K&V  
Lecture 5 Proof of Hoeffding's Bound; Computational Hardness for PAC learning  
Lecture 6 Hardness Results for PAC learning  
Reading Chapter 2 of K&V  
Lecture 7 Occam's Razor; Compression schemes (we also used the companion slides and pages 12-13 of this paper)  
Lecture 8 Discussion of problems in assignment 1.  
Lecture 9 Agnostic Learning and Structural Risk Minimization  
Reading Chapter 3 (3.1-3.3) of K&V  
Lecture 10 The VC dimension  
Reading Chapter 3 (3.3-3.7) of K&V  
Lecture 11 VC dimension and PAC Learning  
Lecture 12 Convergence based on VC Dimension (we also used the companion slides)  
Assignment 2 Assignment 2 Mar 12
Lecture 13: Paper Presentations
  • Random Classification Noise: Angluin & Laird
  • Malicious Noise: Kearns & Li
  • Learning via Statistical Queries: Kearns
Lecture 14 Mistake Bounded Learning and the Perceptron Algorithm (an illustrative sketch of the Perceptron update appears after the schedule)  
Lecture 15 Perceptron with non-separable data and The Winnow Algorithm  
Assignment 3 Assignment 3 Apr 3
Sources for Online Convex Optimization Lectures: the OCO lectures follow material from the Kakade and Tewari lecture notes (L3 and L4) and the survey article by Shalev-Shwartz.
Lecture 16 OCO and Online Projected GD  
Lecture 17 FTL and FoReL Algorithms  
Lecture 18 Analyzing FoReL using Strong Convexity  
Lecture 19 Mirror Descent Algorithms  
Lecture 20 Learning with Queries  
Reading Sections 8.1-8.2 of K&V  
Lecture 21 Learning Horn Expressions with Queries  
Lecture 22: Paper Presentations: OCO
  • Online Optimization with Gradual Variations (see also Commentary on "Online Optimization with Gradual Variations")
  • Near-Optimal Algorithms for Online Matrix Prediction (see also Commentary on "Near-Optimal Algorithms for Online Matrix Prediction")
Lecture 23: Paper Presentations: Bandits
  • Analysis of Thompson Sampling for the Multi-armed Bandit Problem
  • The Best of Both Worlds: Stochastic and Adversarial Bandits
Lecture 24: Paper Presentations: Probabilistic Models
  • Disentangling Gaussians
  • Exact Recovery of Sparsely-Used Dictionaries
Lecture 25 Exam
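
Illustrative sketch (not part of the official course materials): a minimal Python implementation of the mistake-driven Perceptron update discussed in Lecture 14, assuming labeled examples with labels in {-1, +1}. The classic mistake bound says that on linearly separable data with margin gamma and examples of norm at most R, the algorithm makes at most (R/gamma)^2 mistakes.

    # Illustrative Perceptron sketch (online, mistake-driven updates).
    # Assumes examples x are lists/arrays of floats and labels y are in {-1, +1}.
    import numpy as np

    def perceptron(examples):
        """Run the online Perceptron over a stream of (x, y) pairs.

        Returns the final weight vector and the number of mistakes made.
        """
        examples = [(np.asarray(x, dtype=float), y) for x, y in examples]
        d = len(examples[0][0])
        w = np.zeros(d)          # start with the zero hypothesis
        mistakes = 0
        for x, y in examples:
            # Predict with the current linear threshold function.
            prediction = 1 if np.dot(w, x) >= 0 else -1
            if prediction != y:
                # Mistake-driven update: move w toward the misclassified example.
                w += y * x
                mistakes += 1
        return w, mistakes

    if __name__ == "__main__":
        # Toy separable data: the label is the sign of the first coordinate.
        data = [([1.0, 0.5], 1), ([-1.0, 0.2], -1), ([0.8, -0.3], 1), ([-0.6, -0.1], -1)]
        w, m = perceptron(data)
        print("weights:", w, "mistakes:", m)

The update only changes the hypothesis on mistakes, which is exactly what the mistake-bound model counts; running the toy example above makes a single mistake on the second point and then classifies the rest correctly.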