Introduction and welcome

Any computer program can compute forwards, from causes to effects. But in many modern applications, users want computers to reason backwards, from effects to causes:

From the last two examples especially, I hope you agree that a good answer should include a probability.
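To make the contrast between the two directions concrete, here is a minimal sketch in Python. The scenario (a medical test) and every number in it are made up for illustration and do not come from the course materials. The function names `forward` and `backward` are likewise hypothetical. The sketch computes forwards from a cause to the probability of an effect, then uses Bayes' rule to reason backwards from an observed effect to the probability of its cause.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only for illustration.
PRIOR_SICK = Fraction(1, 100)             # P(sick)
P_POSITIVE_GIVEN_SICK = Fraction(9, 10)   # P(positive test | sick)
P_POSITIVE_GIVEN_WELL = Fraction(5, 100)  # P(positive test | well)


def forward(sick: bool) -> Fraction:
    """Forwards: from a cause (sick or well) to the probability of the effect."""
    return P_POSITIVE_GIVEN_SICK if sick else P_POSITIVE_GIVEN_WELL


def backward() -> Fraction:
    """Backwards: from the effect (a positive test) to the probability of the cause."""
    joint_sick = PRIOR_SICK * forward(True)         # P(sick and positive)
    joint_well = (1 - PRIOR_SICK) * forward(False)  # P(well and positive)
    # Bayes' rule: normalize over every cause that could explain the effect.
    return joint_sick / (joint_sick + joint_well)


posterior = backward()
print(posterior, float(posterior))
```

Under these assumed numbers, the backward answer is not "sick" or "well" but a probability of being sick given a positive test, which is the kind of answer the examples above call for.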

At present, programs that do this kind of probabilistic reasoning are written primarily by highly trained specialists using old, general-purpose programming languages like Matlab or C++. But soon, programmers will use new languages to write programs that reason probabilistically. The design and implementation of probabilistic programming languages is a problem at the leading edge of research.

This research is ramified and chaotic; new ideas are still emerging. I’m pleased to be able to bring some of the intellectual ferment into the classroom; I hope it will engage your interest and your intellectual curiosity. Probabilistic programming is an exciting new model of programming and one that I hope will become influential.

Beyond the immediate pleasures of this particular research, almost all of you are soon to embark on what I hope will be long careers in computing. Computing technology changes so rapidly that anyone who has a long career can expect to spend some of it putting research results into practice. During this course I will model for you how to approach potentially practical research in a new, unproven area. When, in the future, you have to consider applying research on your own, the experience will give you a leg up.

What will we learn?

The course is designed around three learning goals:

Together, these goals address a classic question that should be asked about any program of research: why is this stuff worth working on? You may also learn, to the degree possible given the current state of the art, how to use probabilistic programming languages to solve problems.

In the rest of this section of the syllabus, I describe in more detail the direction I plan for your learning to take. This direction is informed by my experience teaching these ideas once before, in 2014.

Learning priorities

My highest priority is for you to understand the two key ideas that underlie the expression of problems in probabilistic languages and the algorithms that can solve those problems:

My other priorities involve your ability to make intellectual connections between snapshots of different ideas in probabilistic programming languages.

Subject matter

Probabilistic programming languages are a young, interdisciplinary subject, and at present the intellectual story is somewhat incoherent.

To force some coherence on the subject matter, the class will emphasize the programming-language approach, and we will focus on program analysis. The machine-learning approach will be treated as a kind of “wild West” which we will study from a distance, without trying to tame it.

What will happen in the classroom?

Most class periods will be spent working on small problems or answering discussion questions. The problems and questions will usually be grounded in outside reading. Work will be undertaken at the board in groups of 4 to 5 students.

As permitted by the number of students enrolled, some class periods will be spent reviewing your solutions to small problems posed outside of class. The purpose of these problems is not to assess your performance but to enable deeper and more informed class discussion than would otherwise be possible. For this reason, I do not plan to give you grades for these problems. I also encourage you to tackle the problems in pairs or in small groups.

A few class periods will be spent reviewing programming-language fundamentals from COMP 105 as well as fundamentals of probability.

What kind of class is this? How is it different?

Most classes in computing are organized around lectures and homeworks (problem sets or programming assignments), perhaps with some laboratory experiences. This class is a seminar [1], and as such it is organized around small-group collaboration. So that you understand the distinction, I compare the two kinds of classes.

In my mind, a good lecture course has these properties:

A seminar should offer a dramatically different experience. Like a lecture course, it should offer challenging problems, but everything else should be different:

One way to view what is going on is that we have a semester-long conversation that leads us to consensus on:

Why do we teach seminars? Primarily because this is the way real science is done: through extended, vigorous conversations about problems and ideas. And when it goes well, a seminar is among the most rewarding experiences you can have in a classroom.

How will everyone be evaluated?

My evaluation of your work, and your final course grade, will be based on your class participation and on your contribution to an engaging final project. In detail, here is what I expect:

How might these expectations relate to your grades? Well, you are all experienced students and well versed in computer science, or you would not be eligible to take a seminar like this one. You probably also know how to contribute effectively in a small-group setting, and how to put together a class project. In my past experience with this kind of class, a large majority of students have earned A’s, and almost every student has earned at least an A-minus.

Finally, here’s how I expect you to evaluate me:

What about the final project?

The ideal project will give you a chance to dig more deeply into whatever aspects of probabilistic languages attracted you to the course in the first place. Here are the ground rules:

Any one of these three outcomes should lead to a top grade on your project; nobody should aim for all three.

The most challenging aspect of the project is that you need to identify something interesting before class is even half over. It is always easier to wait another week, when you will know more and can make more informed decisions. But every week you wait means one fewer week that you can actually work on the project itself. The key date is the day scheduled for mutual criticism of proposals in class. I have tentatively scheduled that class for October 24, but depending on how things go, we might try to have it earlier.

If, in the middle of November, you feel you would benefit from some class discussion of your project in progress, please let me know; we may be able to schedule something.

Languages and systems that may be interesting for a project

Here are a number of languages and systems that are interesting potential substrates for a project. The systems that are most mature and most friendly to beginners are listed first:

A longer list, with other commentary, can be found at

What challenges should we expect?

Any small-group discussion class poses predictable challenges. In addition, a class like this one, which crosses disciplinary boundaries and is close to the leading edge of research, poses special challenges. Here are some that I know about:

What do I need to know coming in?

This course builds on probability and programming languages. Ideally you remember this material from Discrete Math (61) and from Programming Languages (105), but if you missed it, here is what you need to review:

What else can I do to succeed?

If you’re nervous about giving a technical presentation, or if you’d just like some help, the ARC offers one-on-one appointments with public-speaking consultants who will help you with a presentation. You can sign up through Tutor Finder, which is supposedly on iSIS.

One of my former students also contributed this advice:

The same former student also had these comments about his experience in class:

  1. According to the Oxford English Dictionary, a select group of advanced students associated for special study and original research under the guidance of a professor.

  2. I am much less able to represent the consensus of machine-learning scholars.