Syllabus: Introduction to Machine Learning

Description and Objective: An investigation of programs that can dynamically adapt their behavior. The course focuses on two main general ideas: supervised learning and unsupervised learning. In supervised learning, a set of already-known correct responses to already-seen inputs is provided, and is used to train the program to make correct responses to new inputs. For instance, a facial recognition program could be supplied with a number of photographs, each labeled with the name of the person in the image, and learn to recognize new images of those persons. In unsupervised learning, a program seeks to find hitherto unknown patterns in data, without any pre-judgment about what those patterns might be. For example, a music recommendation program might try to find similarities between groups of songs, so that when a user likes one such song, others in its similar group can be recommended to them. The course looks at various computational and mathematical models and techniques that can be applied to such problems.
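The contrast between the two settings can be sketched in a few lines of plain Python. This is only a minimal illustration, not course code: the function names, the toy "photo" features, and the song tempos are all hypothetical. The supervised part predicts a label from labeled examples using a 1-nearest-neighbor rule; the unsupervised part groups unlabeled numbers into two clusters with a tiny one-dimensional k-means.

```python
# Supervised vs. unsupervised learning on toy data.
# All names and data below are hypothetical, chosen only for illustration.

def nearest_neighbor_predict(train, query):
    """Supervised: return the label of the training point closest to `query`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label

def two_means_1d(values, iterations=10):
    """Unsupervised: split 1-D values into two clusters; return both centers."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high_group:  # all values identical: only one cluster exists
            break
        lo = sum(low_group) / len(low_group)
        hi = sum(high_group) / len(high_group)
    return lo, hi

# Labeled examples: (feature vector, person's name).
labeled_photos = [((1.0, 1.0), "alice"), ((5.0, 5.0), "bob")]
print(nearest_neighbor_predict(labeled_photos, (1.2, 0.9)))

# Unlabeled examples: song tempos with no group names attached.
song_tempos = [60, 62, 64, 120, 124, 128]
print(two_means_1d(song_tempos))
```

In the first call the program is told the correct answers in advance; in the second it is given only raw numbers and has to discover the slow and fast groups on its own.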

Objectives for the Course


By the end of the semester, a successful student will be able to do all of the following things:

  • Identify the differences between supervised, semi-supervised, and unsupervised learning techniques, along with how and when each can be employed.
  • Identify the sorts of machine learning problems (e.g. clustering, regression, etc.) that correspond to a real-world problem of interest.
  • Translate data-sets into properly formed input for a machine learning algorithm.
  • Design and implement basic machine learning algorithms, using them to solve tasks of interest. This will involve both coding algorithms from scratch, and using implementations of standard algorithms found in modern professional machine learning libraries.
  • Compare and contrast various algorithms, and variations on a single algorithm, and the results they generate on a given data set.
  • Compare and contrast methods for evaluating the success of a program engaged in some machine learning task.

Course Materials

  1. Textbook: No textbook purchase is required. We will be using a number of online resources, accessible freely by browser and/or in PDF form:
    • A Course in Machine Learning. Hal Daumé III. http://ciml.info. [link; PDF]
    • An Introduction to Statistical Learning: with Applications in R. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Springer, 2013. Corrected 8th printing, 2017. [link; PDF]
    • The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2nd Edition, Springer, 2009. Corrected 12th printing, 2017. [link; PDF]
    • Deep Learning. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. MIT Press, 2016. [link]
  2. Lecture notes: When appropriate, these will be made available in the notes section of the class website.

Prerequisites and Expected Competencies

Programming: Students should be able to comfortably write substantial programs (i.e., at the level expected by the end of COMP 15 or an equivalent course). For some assignments, the choice of programming language is left open, as we will be most interested in the analysis of the results of those programs. For other assignments, we will use Python, a popular language for ML applications that is also beginner-friendly. While no background in Python is expected, students should be prepared to learn the necessary aspects of the language as they work through the exercises that require it.

Mathematics: Comfort with mathematical formalisms is absolutely necessary to the understanding of some core ML algorithms and models. Basic familiarity with multivariate calculus (integrals, derivatives, vector derivatives) is expected. Prior experience with linear algebra and probability theory will also be helpful.

Requirements & Grading

Grades will be based on the following:

  1. 30% Homework (6 assignments)
  2. 30% Projects (3 assignments)
  3. 20% Midterm examination
  4. 20% Final examination

Letter Grades: COMP 135 uses the following breakdown of letter grades and percentages:

   98–100%  A+      87–89%  B+      77–79%  C+      67–69%  D+
   93–97%   A       83–86%  B       73–76%  C       63–66%  D
   90–92%   A–      80–82%  B–      70–72%  C–      60–62%  D–
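
As a rough illustration of how the weights and the letter-grade table combine, the sketch below computes a weighted course percentage and looks up the corresponding letter. The weights and cutoffs are taken from this syllabus; everything else is an assumption made for the example: the function names are hypothetical, fractional percentages are placed in the band whose lower bound they meet (the syllabus does not state a rounding rule), and scores below 60% return None because the table does not list them.

```python
# Weights and cutoffs come from the syllabus. The treatment of fractional
# scores and of scores below 60 is an assumption, not stated course policy.
WEIGHTS = {"homework": 0.30, "projects": 0.30, "midterm": 0.20, "final": 0.20}

# (minimum percentage, letter) pairs, highest first.
LETTER_CUTOFFS = [
    (98, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]

def course_percentage(scores):
    """Weighted average of component scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def letter_grade(percentage):
    """Map a percentage to a letter; returns None below the listed ranges."""
    for minimum, letter in LETTER_CUTOFFS:
        if percentage >= minimum:  # assumption: no rounding is applied
            return letter
    return None

# Hypothetical student: 0.3*90 + 0.3*85 + 0.2*80 + 0.2*88 = 86.1
example = {"homework": 90, "projects": 85, "midterm": 80, "final": 88}
total = course_percentage(example)
print(round(total, 1), letter_grade(total))
```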

Homework

Homework will be assigned regularly in the course. The homework release and due dates are listed on the schedule page. In general, students will have about two weeks for an assignment; one assignment, to help those new to Python get up to speed, will have a shorter turn-around time of only one week. Homework will be submitted using the Gradescope system, information about which will be provided as necessary.

Assignments will consist of a combination of coding and written work, including the analysis of results. Work is expected to be presented in a professional style, with all work in typed-up form, with graphics and charts correctly rendered.

Regrade requests for all homework assignments must be submitted within a week of the grades being released.

Projects

Projects are meant to be open-ended, simulate case studies found in "the real world", and encourage creativity.

Each project will usually be due 2–3 weeks after being handed out. Projects will generally center around a particular methodology and task and involve significant programming (with some combination of developing core methods from scratch and using existing libraries). You will be expected to consider some conceptual issues, write a program to solve the task, and evaluate your program experimentally. The main deliverable will be a short report, which will be assessed based on effort, technical sophistication, clear explanation, supporting evidence, and overall performance. Note that it is the report that is key: an implementation that is highly effective on some ML task (in terms of classifier accuracy and the like) but is presented with little attention to explanatory detail will receive little credit, while a detailed set of experiments that clearly explain why a certain approach did not solve the task well could receive more credit. When working on projects, students should remember that completing the code is only the first step; once the code is complete, time must be budgeted to prepare the presentation of the results. Code will be of less interest than analysis here.

Regrade requests for all projects must be submitted within a week of the grades being released.

Exams

The exams will be in written format. The midterm will be held during the class period at mid-semester, and the final during the University-set time. (See below for dates and times.) Example exams, to show the format and type of question, will be distributed before each exam occurs. No make-up exams will be given.

Regrade requests for all exams must be submitted within a week of the grades being released.

The final exam will not be handed back. Students may review their results by scheduling an appointment with the professor.

Policy on Late and Missing Work

An assignment handed in within 24 hours after its due time will receive a 10% reduction; within 48 hours, a 20% reduction; within 72 hours, a 40% reduction. No credit is given for assignments submitted after that point without a documented reason.
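
The late-penalty tiers can be written out as a small lookup. This is a sketch under stated assumptions rather than the official grading procedure: the function name is hypothetical, the reduction is assumed to be a percentage of the score earned (the policy does not say whether it is that or a flat number of points), and each tier is assumed to include its upper boundary.

```python
def late_multiplier(hours_late):
    """Fraction of earned credit kept for work submitted `hours_late` hours late.

    Assumptions: the reduction scales the earned score, and each tier
    includes its upper boundary (e.g. exactly 24 hours late loses 10%).
    """
    if hours_late <= 0:
        return 1.0   # on time: full credit
    if hours_late <= 24:
        return 0.9   # 10% reduction
    if hours_late <= 48:
        return 0.8   # 20% reduction
    if hours_late <= 72:
        return 0.6   # 40% reduction
    return 0.0       # no credit without a documented reason

# Hypothetical example: work that earned 85 points, handed in 30 hours late.
print(85 * late_multiplier(30))
```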

Students with extreme special circumstances must, with prior approval from their academic dean, meet with the professor to make alternative arrangements for the scheduled homework and exams. Emails regarding the situation must be initiated by the academic dean.

Class Attendance

Attending class is expected, but attendance will not be taken formally. Class time will not be taken to review things missed due to lack of attendance. If you do have to miss a class, speak to your classmates about what you missed, and try to get notes and other materials from them.

Important Dates and Times

The schedule section of the web page will contain detailed information about weekly readings, assignments, and lecture materials. The following key dates are worth noting at the outset:

  • Wednesday, 23 October 2019:  In-class Midterm.
  • Thursday, 12 December 2019, 7:00–9:00 PM:  Final Exam.

Policy on Collaboration

I encourage you to work together on the material. This is a great way to learn, and to share ideas. However, in order to actually learn something, it is important that you complete the real work of programming and analysis on your own, unless specifically directed otherwise. It is perfectly fine for you to discuss the general approach to a problem with one another, work out how to understand an algorithm or model, and to help one another with things like getting the software we will use to work properly on your computer. However, it is not okay to copy code and other materials from anyone inside or outside of the class. While you can of course use online references to explain key concepts, and to learn programming techniques, you must not simply copy answers or code you find online, and you should cite any such materials you consulted. This is the only way to actually learn the material.

Piazza & Collaboration

When using the Piazza forum, the same sorts of considerations about collaboration are in play when posting questions and providing answers.

Questions may be posted as either private (viewable only by yourself and course staff) or public (additionally viewable by all students registered for the course on Piazza). Some issues warrant public questions and responses, such as misconceptions or clarifications about the instructions, conceptual questions, and errors in documentation. Other issues are better suited to private posts, including debugging questions that include extensive amounts of code and questions that reveal a portion of your solution.

Please use your best judgment when selecting private vs. public. If in doubt, make it private.

Academic Misconduct

Students should read the Tufts handbook on academic integrity located on the judicial affairs website. If a student does not understand these terms or any of the material listed on this page, it is their responsibility to talk to the professor. A few highlights are presented to emphasize their importance:

Absolute adherence to the code of conduct is demanded of the instructors, teaching assistants, and students. This means that no matter the circumstance any misconduct will be reported to Tufts University.

Inclusivity

Respect is demanded at all times throughout the course. In the classroom, not only is participation required, it is expected that everyone is treated with dignity and respect. We realize everyone comes from a different background with different experiences and abilities. Our knowledge will always be used to better everyone in the class.

Tufts University values the diversity of our students, staff, and faculty, recognizing the important contribution each student makes to our unique community. Tufts is committed to providing equal access and support to all qualified students through the provision of reasonable accommodations so that each student may fully participate in the Tufts experience. If you have a disability that requires reasonable accommodations, please contact the Student Accessibility Services office at Accessibility@tufts.edu or 617-627-4539 to make an appointment with an SAS representative to determine appropriate accommodations. Please be aware that accommodations cannot be enacted retroactively, making timeliness a critical aspect for their provision.

Tufts and the teaching staff of COMP 135 strive to create a learning environment that is welcoming to students of all backgrounds. If you feel unwelcome for any reason, please let us know so we can work to make things better. You can let us know by talking to anyone on the teaching staff. If you feel uncomfortable talking to members of the teaching staff, consider reaching out to your academic advisor, the department chair, or your dean.

Miscellany

University Policies: Tufts has a wide range of policies that apply to students, faculty, and other employees. You can find links to much of this information at the University policy page.

Feedback: Your thoughts and concerns about this course are important. You are encouraged to give feedback to the instructors and teaching assistants throughout the term. As always, students will be asked to fill out a course evaluation at the middle and at the end of the term.