Syllabus: This semester the course will focus on Computational Learning Theory. Machine learning algorithms can be used analyze available data and develop generalizations that are useful for handling future data or experience. Computational Learning Theory studies the computational complexity, data complexity, and convergence properties (bounds on error rates) of machine learning algorithms. The emphasis is on algorithms that are both efficient and have good convergence properties. Alternatively, for some machine learning problems we seek lower bounds on the amount of resources for any potential algorithm. Models vary from adversarial worst case scenarios to statistical settings where a random process generates the data. The course reviews classical results in this field and samples from recent developments.
Prerequisites: COMP 160 or similar background. Some knowledge of probability theory is necessary (e.g. from MATH 161). COMP 170 is also helpful but not required. The course requires an aptitude for mathematical analysis, writing proofs, etc.
Times and Location: Tuesday/Thursday 10:30-11:45 Halligan Hall 106
| Type | What | Due Date |
| Lecture 1 | Introduction to CLT |   |
| Reading | Sections 1.1-1.3 of K&V |   |
| Lecture 2 | PAC Learning Model |   |
| Lecture 3 | PAC Learning Model; Tools for Probabilistic Analysis |   |
| Reading | Chapter 9 of K&V |   |
| Lecture 4 | Using Chernoff Bounds |   |
| Lecture 5 | Variations of PAC learning; Proof of Hoeffding's Bound |   |
| Reading | Section 4.2 of K&V |   |
| Lecture 6 | Computational Hardness for PAC learning |   |
| Reading | Sections 1.4-1.5 of K&V |   |
| Assignment 1 | Assignment 1 | Feb 19 |
| Lecture 7 | Hardness continued; Occam's Razor we also used companion slides |   |
| Reading | Chapter 2 (2.1, 2.2, 2.4) of K&V |   |
| Lecture 8 | The VC dimension; Compression schemes |   |
| Reading | Chapter 3 (3.1-3.3) of K&V |   |
| Lecture 9 | Model Selection and Structural Risk Minimization |   |
| Lecture 10 | Reductions for PAC Learning |   |
| Lecture 11: Paper Presentations |
Improved boosting algorithms using confidence-rated predictions
Robert E. Schapire and Yoram Singer
Machine Learning, 37, 1999.
A decision-theoretic generalization of on-line learning and an application to boosting. Yoav Freund and Robert E. Schapire. Journal of Computer and System Sciences, 55(1):119-139, 1997. New Results on Hardness of Proper Learning Misha Alekhnovich, Mark Braverman, Vitaly Feldman, Adam Klivans, Toniann Pitassi. Proceedings of the 45th Foundations of Computer Science (FOCS), 2004. |
  |
| Reading | Section 4.3.1-4.3.2 of K&V |   |
| Lecture 12: Paper Presentations |
Constraint-based Optimization and Utility Elicitation using the Minimax
Decision Criterion.
Craig Boutilier, Relu Patrascu, Pascal Poupart, and Dale Schuurmans.
Artificial Intelligence 170(8--9), pp.686--713 (2006).
Minimax Regret-based Elicitation of Generalized Additive Utilities Darius Braziunas and Craig Boutilier. Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence (UAI-07), Vancouver, to appear (2007). Preference Elicitation and Query Learning. Avrim Blum, Jeffrey Jackson, Tuomas Sandholm, Martin Zinkevich; 5(Jun):649--667, 2004. |
  |
| Assignment 2 | Assignment 2 | Mar 13 |
| Lecture 13 | Learning with Queries |   |
| Reading | Sections 8.1-8.2 of K&V |   |
| Lecture 14 | Learning Horn Expressions with Queries |   |
| Lecture 15 | Mistake Bounded Learning and the Perceptron Algorithm |   |
| Reading | Section 2.1 of C&ST |   |
| Lecture 16 | The Winnow Algorithm |   |
| Assignment 3 | Assignment 3 | Apr 3 |
| Lecture 17 | DNF learning and Kernel Methods |   |
| Lecture 18 | Kernels and Their Properties |   |
| Reading | Sections 3.1-3.3 of C&ST |   |
| Lecture 19 | Convergence based on VC Dimension we also used companion slides |   |
| Reading | Section 3.5 of K&V |   |
| Reading | Section 4.2 of C&ST |   |
| Lecture 20/21 | Convergence based on Margin we also used companion slides |   |
| Reading | Section 4.3 of C&ST |   |
| Lecture 21/22 | Deriving Support Vector Machines: optimizing functions under contraints |   |
| Assignment 4 | Assignment 4 |   |
| Lecture 23 | Algorithms for solving the SVM Optimization |   |
| Additional Reading for SVM algorithms |
Support Vector Machine Solvers.
L. Bottou and C.-J. Lin.
In Large
Scale Kernel Machines, Léon Bottou, Olivier Chapelle, Dennis DeCoste,
and Jason Weston editors, 1-28, MIT Press, Cambridge, MA., 2007
Fast Training of Support Vector Machines using Sequential Minimal Optimization. by J.C. Platt, Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola, eds., pp. 185-208, MIT Press, (1999). Making Large-Scale SVM Learning Practical. T. Joachims, In: Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola (ed.), MIT Press, 1999. |
  |
| Lecture 24/25: Paper Presentations |
Stability of k -Means Clustering
Shai Ben-David, Dávid Pál and Hans Ulrich Simon
COLT 2007 ,
pages 20-34
Margin Based Active Learning Maria-Florina Balcan, Andrei Broder and Tong Zhang COLT 2007 , pages 35-50 Generalized SMO-Style Decomposition Algorithms Nikolas List COLT 2007 , pages 365-377 Online Learning with Prior Knowledge Elad Hazan and Nimrod Megiddo COLT 2007 , pages 499-513 Online Learning Meets Optimization in the Dual Shai Shalev-Shwartz and Yoram Singer COLT 2006 , pages 423-437 Learning Bounds for Support Vector Machines with Learned Kernels Nathan Srebro and Shai Ben-David COLT 2006 , pages 169-183 |
  |