Comp 135: Introduction to Machine Learning
Department of Computer Science
Tufts University
Fall 2017

Course Web Page: (redirects to current page)

  • (11/14) Hussain Elkotby's office hour on Friday 11/17 is cancelled.
  • (11/10) Roni Khardon's office hours on Monday 11/13 will be held 10-11am (instead of 10:30-11:30).
  • (10/4) Khardon's office hours on Monday 10/9 are cancelled. He will be available for questions after class on Wednesday 10/11.
  • (9/10) Office hours posted.
  • (9/3) Course information posted.
  • Previous announcements are here

What is this course about?

Machine learning is the science of collecting and analyzing data and turning it into predictions, encapsulated knowledge, or actions. There are many ways and scenarios by which data can be obtained, many different models and algorithms for data analysis, and many potential applications. In recent years machine learning has attracted attention due to successful applications in science and commerce leading to widespread use.
The course gives a broad introduction to machine learning aimed at upper-level undergraduates and beginning graduate students. Some mathematical aptitude is required, but generally speaking we focus on baseline algorithms, practical aspects, and breadth, leaving more sophisticated topics and detailed analysis to more advanced courses: Statistical Pattern Recognition (Fall 2017), Machine Learning Seminar (Fall 2016), Computational Learning Theory (Fall 2015), and Learning, Planning and Acting in Complex Environments (Fall 2014).


Course Description:

An overview of methods whereby computers can learn from data or experience and make decisions accordingly. Topics include supervised learning, unsupervised learning, reinforcement learning, and knowledge extraction from large databases, with applications to science, engineering, and medicine.


Prerequisites:

Comp 15 and COMP/MATH 61, or consent of instructor. Comp 160 is recommended. You will also need a minimal amount of calculus.

Class Times:

(K+ Block) MW 4:30-5:45, Robinson, SEC, Room 253


Instructor:

Roni Khardon
Office Hours: Monday 10:30-11:30, Halligan 230.

Teaching Assistants:

Hussain Elkotby
Liu


TA Office Hours (all in Halligan 121):
Monday: 1-2pm (Elkotby),
Tuesday: 3-4pm (Elkotby), 7-8pm (Liu),
Wednesday: 7-8pm (Liu),
Thursday: 4-5pm (Elkotby), 7-8pm (Liu),
Friday: 12-1pm (Elkotby), 3-4pm (Liu).

Course Work and Marking

The course grade will be determined by a combination of
Written homework assignments (20%)
these assignments exercise and reinforce class material.
Experimental/Programming projects (30%)
Projects will include both programming assignments and the use of existing machine learning software.
Rules for late submissions:
All work must be turned in on the date specified. Unless there is a last-minute emergency, please notify Roni Khardon of special circumstances at least two days in advance. If you haven't finished an assignment by the due date, please turn in the work you have done (even if partial) on time, and it will be evaluated for partial credit.
In-class midterm exam (20%), Wednesday, October 18.
Final exam (30%), scheduled according to the Tufts exam schedule for the K Block: Thursday, 12/14, 7:00-9:00pm.
Note: If your final exam grade is higher than your midterm grade, the midterm is discounted and the final will count for 50%.
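The grading rule above can be sketched as follows (a hedged illustration; the function and variable names are mine, not part of the course materials):

```python
def course_grade(hw, proj, mid, final):
    """Combine component scores (each on a 0-100 scale) using the course weights.

    If the final exam grade is higher than the midterm grade, the midterm
    is discounted and the final counts for 50% instead of 30%.
    """
    if final > mid:
        return 0.20 * hw + 0.30 * proj + 0.50 * final
    return 0.20 * hw + 0.30 * proj + 0.20 * mid + 0.30 * final
```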


On homework assignments and projects: You may discuss the problems and general ideas about their solutions with other students, and similarly you may consult other textbooks or the web. However, you must work out the details on your own and code/write up the solution on your own. Every such collaboration (either getting or giving help) and every use of text or electronic sources must be clearly cited and acknowledged in the submitted homework.
On exams: no collaboration is allowed.
Failure to follow these guidelines may result in disciplinary action for all parties involved. For this and other questions concerning academic integrity, please consult the detailed guidelines and policy available from the Office of the Dean of Student Affairs.

Tentative List of Topics

[We may not cover all sub-topics]

Textbooks and Material Covered

No single text covers all the material for this course at the right level. We have the following texts on reserve in the library and I will try to use first [M] and then [F] as preferred sources for reading when possible. Other material will be selected from research and survey articles or other notes. Detailed reading assignments and links to material will be posted.


Reading, References, and Assignments

Note about slides: slides are used as an aid to lecture delivery. While I provide copies of most of the slides, they do not cover everything that is discussed in class, and they are not a replacement for the assigned reading.

Lecture Topic Reading/Assignments/Notes Due Date
9/6 Introduction to Machine Learning Read the introductory chapter of [M].
See also lecture slides.
Alternate reading: Read the introductory chapter of one of [WF], [F] or [A].
  Supervised Learning Basics:    
9/11 Instance based learning [M] Chapter 8.
See also lecture slides.
See also Andrew Moore's tutorial on kd-trees
See also original paper describing the Relief Method
Alternate reading: [RN] 18.8 or [DHS] 4.4-4.6.
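A minimal sketch of k-nearest-neighbor classification, the core instance-based method (illustrative code of my own, not from the readings):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

Real implementations use data structures such as kd-trees (see Moore's tutorial above) to avoid the linear scan over all training points.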
9/13, 9/18 Decision Trees [M] Chapter 3.
See also lecture slides.
Alternate reading: [RN] 18.1-4 or [F] Chapter 5.
  Optional Reading: Decision Tree Learning and Boosting. T. Dietterich, M. Kearns, and Y. Mansour, Applying the Weak Learning Framework to Understand and Improve C4.5. International Conference on Machine Learning, 1996.  
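The decision-tree readings center on choosing splits by information gain; a minimal sketch of that computation (illustrative code; the function names are mine, and [M] Chapter 3 is the authoritative treatment):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, feature_index):
    """ID3-style gain: entropy reduction from splitting on one discrete feature.

    `examples` is a list of (tuple_of_discrete_features, label) pairs.
    """
    labels = [y for _, y in examples]
    total = entropy(labels)
    by_value = Counter(x[feature_index] for x, _ in examples)
    for value, count in by_value.items():
        subset = [y for x, y in examples if x[feature_index] == value]
        total -= (count / len(examples)) * entropy(subset)
    return total
```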
  Written Assignment 1 Assignment 1 9/25
  Empirical/Programming Assignment 1 Project 1 and corresponding Data 9/27
9/20 Probability Basics Lecture provides a basic and brief introduction to probability theory and working with random variables.
Please review relevant material from your discrete math, algorithms, or probability and statistics course.
9/25, 9/27 Maximum Likelihood Estimation and the Naive Bayes Algorithm [M] 6.1-6.2, and 6.9-6.10.
See also new book chapter from [M]
See also lecture slides.
Alternative reading: [DHS] Section 2.9; [F] 9.2; [WF] 4.2.
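As a rough illustration of maximum-likelihood naive Bayes with simple smoothing (my own sketch; the names and the smoothing convention are mine, and [M] 6.9-6.10 is the authoritative treatment):

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Fit P(label) and per-feature P(value | label) counts by maximum likelihood.

    `examples` is a list of (tuple_of_discrete_features, label) pairs.
    """
    label_counts = Counter(label for _, label in examples)
    feat_counts = defaultdict(Counter)  # (feature_position, label) -> value counts
    for feats, label in examples:
        for i, v in enumerate(feats):
            feat_counts[(i, label)][v] += 1
    return label_counts, feat_counts, len(examples)

def predict_nb(model, feats):
    """Return argmax over labels of log P(label) + sum_i log P(x_i | label).

    Conditionals use add-one smoothing (with one extra slot for unseen
    values) so unobserved feature values get nonzero probability.
    """
    label_counts, feat_counts, n = model
    best, best_score = None, -math.inf
    for label, c in label_counts.items():
        score = math.log(c / n)
        for i, v in enumerate(feats):
            counts = feat_counts[(i, label)]
            score += math.log((counts[v] + 1) / (c + len(counts) + 1))
        if score > best_score:
            best, best_score = label, score
    return best
```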
9/27, 10/2 Evaluating Machine Learning Outcomes [M] Ch 5.
See also lecture slides.
Alternative reading: [F] Ch 12
  Additional Optional Reading: Foster Provost, Tom Fawcett, and Ron Kohavi. The Case Against Accuracy Estimation for Comparing Induction Algorithms. Proc. 15th International Conf. on Machine Learning, 1998.
T. Dietterich. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation 10(7), 1998.
Stephen Salzberg. On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. Data Mining and Knowledge Discovery, 1997.
10/4 Features (selection, transformation, discretization) None of our sources is a perfect match for this lecture.
Relevant reading includes some portions of [F] Chapter 10, and [A] Chapter 6
See also lecture slides.
  Additional Optional Reading: Ron Kohavi and George H. John. Wrappers for Feature Subset Selection. Artificial Intelligence, 1996. (Read till section 3.2 inclusive.)
James Dougherty, Ron Kohavi, and Mehran Sahami. Supervised and Unsupervised Discretization of Continuous Features. International Conference on Machine Learning, 1995.
  Written Assignment 2 Assignment 2 10/16
  Empirical/Programming Assignment 2 Project 2 and corresponding Data 10/16
10/11; 10/16 Linear Threshold Units [M] 4.1-4.4
See also new book chapter from [M]
See also lecture slides.
Alternative reading: [DHS] 5.5.
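A minimal sketch of the Perceptron update rule covered in this unit (illustrative code; the names and the epoch cap are mine):

```python
def perceptron(data, epochs=20):
    """Classic Perceptron training.

    `data` is a list of (feature_vector, label) pairs with labels in {-1, +1}.
    Returns (weights, bias); converges if the data are linearly separable.
    """
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            # Update on any example that is misclassified (or on the boundary).
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b
```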
  Written Assignment 3 Assignment 3 10/23
Wednesday 10/18 Midterm Exam Material for the exam includes everything covered up to 10/11 (that is, clustering is excluded; within the Linear Threshold Units section, material up to the Perceptron algorithm is included). Everything discussed in class and in homework assignments is included for the exam.
Lecture slides are not comprehensive and I expect you to read the assigned materials which should be useful in review and study. But I will not hold you responsible for technical details in the reading that were not discussed in class or assignments.
The Exam is closed book; no notes or books are allowed; no calculators or other machines of any sort are allowed.
The exam will aim to test whether you have grasped the main concepts, problems, ideas, and algorithms that we have covered, including the intuition behind these. Generally speaking, the exam will not test your technical wizardry with overly long equations or calculations, but, on the other hand, it is sure to include some shorter ones.
10/23 Clustering [F] 8.4-5
See also lecture slides.
Alternative reading: [DHS] 10.6-7,10.9.
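A sketch of Lloyd's k-means algorithm, one of the clustering methods covered (illustrative code; the names and defaults are mine):

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update.

    `points` is a list of numeric tuples; returns the k cluster centers.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers at random data points
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[j].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Note that k-means only finds a local optimum; practical implementations rerun it from several random initializations.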
  Empirical/Programming Assignment 3 Project 3 and corresponding Data 11/6
10/25 Unsupervised and Semi-Supervised Learning with EM [M] Section 6.12
Text Classification from Labeled and Unlabeled Documents using EM. Nigam et al., Machine Learning, 39:103-134, 2000. (The entire paper is relevant; you can skip section 5.3.)
See also lecture slides.
Alternative reading: [A] 7.4; [F] 9.4; [DHS] 3.9
10/30 Association Rules [F] 6.3
Mining Association Rules between Sets of Items in Large Databases Rakesh Agrawal, Tomasz Imielinski, Arun Swami Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 1993.
See also lecture slides.
Alternative reading: [WF] 4.5
  Optional Reading Real World Performance of Association Rule Algorithms Zheng et al, KDD 2001.
Mining the Most Interesting Rules. Bayardo et al., KDD 1999.
Dynamic Itemset Counting and Implication Rules for Market Basket Data Brin et al, SIGMOD 1997.
Discovering All Most Specific Sentences Gunopulos et al, TODS, 2003.
  Written Assignment 4 Assignment 4 11/6
11/1, 11/6 Neural Networks [M] Chapter 4.
See also lecture slides.
Alternative reading: [RN] 18.7, [DHS] 6.1-5.
  Optional Reading Going Deeper with Convolutions
Deep Residual Learning for Image Recognition
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
  Empirical/Programming Assignment 4 Project 4 and corresponding Data 12/4
11/8; 11/13 Kernels, Dual Perceptron, Support Vector Machines [F] 7.3.
A practical guide to support vector classification C.-W. Hsu, C.-C. Chang, C.-J. Lin. Technical report, Department of Computer Science, National Taiwan University. July, 2003.
See also lecture slides.
Alternative reading: [CST] pages 9-19 and 26-32, [RN] 18.9.
  Written Assignment 5 Assignment 5 12/4
11/15 Active Learning Active Learning Literature Survey
See also lecture slides.
  Optional Reading: The Robot Scientist Adam. IEEE Computer, 42(8), 2009.
Support Vector Machine Active Learning with Applications to Text Classification Simon Tong, Daphne Koller; JMLR 2(Nov):45-66, 2001.
11/21 Aggregation Methods [F] Chapter 11.
See also lecture slides.
Alternative reading: [A] Chapter 17
  Optional Reading Explaining AdaBoost In: Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, Springer, 2013.
An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Dietterich, T. Machine Learning, 40 (2) 139-158, 2000.
Useful information about Random Forests
Boosting the margin: A new explanation for the effectiveness of voting methods. Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee. The Annals of Statistics, 26(5):1651-1686, 1998. (source of margin graphs in slides; the introduction is informative and accessible)
Improved Boosting Algorithms Using Confidence-rated Predictions. Robert E. Schapire and Yoram Singer, Machine Learning, 37, 1999. (source of the confidence-rated AdaBoost version in the slides)
11/27; 11/29 Computational learning theory [M] 7.1, 7.2, 7.3, 7.5, and (for perceptron) [DHS] 5.5.2
Topics covered: on-line learning, the Perceptron convergence theorem, weighted majority, PAC learning, Agnostic PAC learning.
Alternative reading: [RN] 18.5 and (for perceptron) [CST] 2.1.1