Syllabus


CS 152 L3D: Learning from Limited Labeled Data
Department of Computer Science, Tufts University
Class Meetings for Fall 2024: Tue and Thu 1:30-2:45 pm ET in JCC 265
Instructor: Mike Hughes, Assistant Professor of Computer Science
Grad TA: Panos Lymperopoulos, Ph.D. student, Tufts CS.
Have questions? Need help?
  • Get rapid online help via our Piazza discussion forum
  • Get in-person help at regular Office Hours
  • For personal issues, make a private post on Piazza (visible to Instructors and you only)

Quick Links: [Prereqs] [Wait List] [Deliverables] [Late Work Policy] [Collaboration Policy] [Grading Rubric]

Course Overview

This course will study machine learning methods that can learn from available labeled datasets of limited size to perform a desired task well by leveraging either some other abundant source of related data or a pre-trained model. Topics will include self-supervised learning, semi-supervised learning, transfer learning, and more. The goal of this course is to bring students to the forefront of knowledge in this area. Students will engage in discussions of recent literature and complete a semester-long team research project designing, implementing, evaluating, and communicating new contributions to this research area.

Objectives

After completing this course, students will be able to:

  • read a new published paper within the field and identify its contributions, strengths, and limitations
  • implement a method in Python given a research-paper-level description
  • suggest hypothesis-driven experiments to evaluate research ideas

Prerequisites

This course intends to bring students near the current state of the art. An ambitious final project could represent a viable submission to a workshop at a major machine learning conference such as NeurIPS or ICML.

To achieve this objective, we expect students to be familiar with:

  • Core machine learning knowledge
    • Underfitting vs. overfitting, bias-variance tradeoff
    • Regularization techniques: L2 penalties, early stopping, dropout
    • Neural networks: Basic architectures (MLP, ConvNet, ResNet), basic loss functions (cross entropy)
    • Stochastic gradient descent for training neural networks
  • Core mathematics capabilities
    • Basic linear algebra (matrix multiplication, matrix rank)
    • Basic multivariate calculus: partial derivatives of vector-input, single-output functions
    • Basic probability and statistical learning: probability mass function, probability density function, maximum likelihood estimation
  • Coding in Python with modern open-source data science libraries, especially deep learning libraries.
    • Basic array operations in numpy (computing inner products, inverting matrices, etc.)
    • Training basic classifiers (like LogisticRegression) in scikit-learn
    • Training deep neural networks via PyTorch or similar frameworks with automatic differentiation (e.g. TensorFlow, JAX, etc.)

Practically, at Tufts this means having successfully completed at least one of the following courses, and ideally both:

  • CS 135 (Introduction to Machine Learning)
  • CS 137 (Deep Neural Networks)

With instructor permission, diligent students who are lacking in a few of these areas may be able to catch up on core concepts via self-study, given a significant investment of their own time. Please see the community-sourced Prereq. Catchup Resources Page for a list of potentially useful resources for self-study.

Enrolling and Wait Lists

As of the start of semester, we expect to have about 40 students enrolled in the course. We are currently at capacity, but some students may drop the course and leave openings for others (usually we see 5-10 openings in the first week of classes as schedules shift).

Our top priority is to provide each enrolled student with our full support, including the ability to get prompt answers to questions on Piazza and in office hours as well as the ability to get high-quality feedback on submitted homeworks and projects in a timely manner.

Thus, we do not anticipate expanding the available seats. Anyone not formally enrolled on SIS as of the first class should contact the instructor (Mike Hughes) as soon as possible. He will make final decisions by the Friday of the first week of classes.

Class Format for Fall 2024

This course will prioritize in-person settings and the rich interactions and discussions that in-person classes afford.

We will offer a Zoom option for most class meetings. Students with valid reasons for missing class that are articulated in advance will be allowed to join on Zoom. Some lecture recordings may be made available if staff bandwidth allows, but this should not be relied upon.

We will not make Zoom options available for in-class exercises or discussions. No presentations or class leadership of discussions will be allowed over Zoom. Rare exceptions will be made at instructor discretion, only for students with well-justified reasons for missing class articulated in advance.

Coursework and Deliverables

There are four primary tasks for students throughout the course, each listed below with its relative weight for grading.

  • 20% : Complete 2 homework assignments
    • Short PDF reports will be turned in via Gradescope
  • 6% : Participate in class discussions and exercises
    • 3% : Complete 4 in-class pop-up exercises in select classes throughout the semester (chosen randomly by the instructor without advance warning)
  • 10% : Lead one in-class discussion
    • Teams of students will be assigned to each class from mid October onward
    • Select a relevant paper by the posted due date for paper selection
    • Meet with instructor during office hours well in advance of assigned class to discuss strategy
    • Prepare a 20 minute summary presentation to share with the class
  • 64% : Team research project
    • 6% Initial pitch presentation
    • 6% Checkpoint 1 report
    • 8% Checkpoint 2 report

Throughout, our evaluation will focus on your process. We wish to train you to think scientifically about problems, think critically about strengths and limitations of published methods, propose good hypotheses, and confirm or refute theories with well-designed experiments. It doesn't matter too much whether your proposed idea works in the end, just that you understand why it did or did not.

Late Work Policy

We want students to develop the skills of planning ahead and delivering work on time. To facilitate learning, we also want to be able to release solutions quickly and discuss recent assignments soon after deadlines. On the other hand, we know that this semester offers particular challenges, and we wish to be flexible and accommodating within reason.

With these goals in mind, we have the following policy:

For projects, any work turned in late will be accepted at the discretion of the instructor and subject to a deduction that grows the longer the work is late.

For homeworks only, each student will have 144 total late hours (= 6 late days) to use throughout the semester across all homeworks.

For each individual assignment, you can submit beyond the posted deadline at most 96 hours (4 days) and still receive full credit. Thus, for one assignment due at Thu 11:59pm ET, you could submit by Mon at 11:59pm ET.

This late work deadline is key to our classroom goals. It allows us to:

  • always release homework solutions on Tue mornings
  • discuss the assignment in class on Tue
  • be sure we can return all graded work promptly

The timestamp recorded on Gradescope will be official. Late time is rounded up to the next whole hour. For example, if the assignment is due at 3pm and you turn it in at 3:05pm, you have used one whole hour.

Beyond your allowance of 6 late days (and beyond the 4-day limit per assignment), zero credit will be awarded except in cases of truly unforeseen and exceptional circumstances (e.g. family emergency, medical emergency). Students with exceptional circumstances should contact the instructor to make other arrangements as soon as possible.
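As a rough illustration of the late-hour arithmetic for homeworks described above, here is a minimal Python sketch. The function names (late_hours_used, receives_full_credit) and the constants are hypothetical, introduced only for this example. The sketch assumes that both the per-assignment cap and the remaining semester budget must cover the hours used, and it does not model exceptional circumstances. The Gradescope timestamp and the instructor's records remain official.

```python
import math
from datetime import datetime

# Values taken from the policy text above.
SEMESTER_BUDGET_HOURS = 144    # 6 late days across all homeworks
PER_ASSIGNMENT_CAP_HOURS = 96  # at most 4 days late on any one homework


def late_hours_used(deadline, submitted):
    """Late hours charged for one homework, rounded up to whole hours."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0  # on time: nothing is charged
    return math.ceil(seconds_late / 3600)


def receives_full_credit(deadline, submitted, budget_remaining):
    """True if the submission fits the per-assignment cap and the remaining budget."""
    hours = late_hours_used(deadline, submitted)
    return hours <= PER_ASSIGNMENT_CAP_HOURS and hours <= budget_remaining


# Example from the policy: due at 3pm, turned in at 3:05pm.
due = datetime(2024, 9, 19, 15, 0)
turned_in = datetime(2024, 9, 19, 15, 5)
print(late_hours_used(due, turned_in))  # 1
```

Under these assumptions, a homework due Thu at 11:59pm ET and submitted the following Mon at 11:59pm ET uses exactly 96 late hours, leaving 48 of the 144-hour budget for the other homework.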

Collaboration Policy

For homeworks: we encourage you to work actively with other students, but you must be an active participant (asking questions, contributing ideas) and you should write your solutions document alone. At the top of your writeup, you must include the names of any people you worked with, and in what way you worked with them (discussed ideas, debugged math, team coding). We may occasionally check in with groups to ascertain that everyone in the group was participating in accordance with this policy.

For final projects and in-class presentations: you will work in teams. Each team should submit one report at each checkpoint and will give one presentation. Each member of the team is expected to actively participate in every stage of the project (ideation, math, coding, writing, etc.). Please write all names at the top of every report, with brief notes about how work was divided among team members. Larger teams will be expected to produce more interesting content.

Grading

Grades for each deliverable will be computed as a numerical score from 0.0 to 1.0, and the final grade will be computed as a weighted average of these item-based scores, following the relative weighting in the deliverable listing above.

When assigning grades given a final numerical score (from 0.0 to 1.0), the following scale will be used:

  • 0.94-1.00 : A
  • 0.90-0.94 : A-
  • 0.87-0.90 : B+
  • 0.83-0.87 : B
  • 0.80-0.83 : B-
  • 0.77-0.80 : C+
  • 0.73-0.77 : C
  • 0.70-0.73 : C-
  • 0.67-0.70 : D+
  • 0.63-0.67 : D
  • 0.60-0.63 : D-
  • Below 0.60: F

We do not round up grades. This means you must earn at least 0.83 (not 0.825 or 0.8295 or 0.8299) to earn a B.
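The weighted average and the no-rounding rule can be sketched in a few lines of Python. This is only an illustration, not the official grade calculation: the names WEIGHTS, CUTOFFS, final_score, and letter_grade are hypothetical, each lower cutoff is assumed to be inclusive (consistent with needing at least 0.83 for a B), and the 0.67-0.70 band is assumed to be D+.

```python
# Top-level weights from the deliverables list (they sum to 1.0).
WEIGHTS = {
    "homework": 0.20,
    "participation": 0.06,
    "discussion_lead": 0.10,
    "project": 0.64,
}

# (lower cutoff, letter), checked from the highest band down.
# Assumption: the 0.67-0.70 band is D+.
CUTOFFS = [
    (0.94, "A"), (0.90, "A-"),
    (0.87, "B+"), (0.83, "B"), (0.80, "B-"),
    (0.77, "C+"), (0.73, "C"), (0.70, "C-"),
    (0.67, "D+"), (0.63, "D"), (0.60, "D-"),
]


def final_score(scores):
    """Weighted average of per-deliverable scores, each between 0.0 and 1.0."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a final score to a letter with no rounding up."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


print(letter_grade(0.8299))  # B-
print(letter_grade(0.83))    # B
```

For instance, scores of 0.90 on homework, 1.00 on participation, 0.85 on the discussion lead, and 0.80 on the project give 0.18 + 0.06 + 0.085 + 0.512 = 0.837, which falls in the B band.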

The highest possible grade of "A+" will be awarded at the instructor's discretion.

Academic Integrity Policy

This course will strictly follow the Academic Integrity Policy of Tufts University. Students are expected to finish course work independently when instructed, and to acknowledge all collaborators appropriately when group work is allowed. Submitted work should truthfully represent the time and effort applied.

Please refer to the Academic Integrity Policy at the following URL: https://students.tufts.edu/student-affairs/student-life-policies/academic-integrity-policy

Accessibility

Tufts and your instructor strive to create a learning environment that is welcoming to students of all backgrounds.

Please see the detailed accessibility policy at the following URL: https://students.tufts.edu/student-accessibility-services