Time: MW 1:30–2:45 Place: Halligan 102 Questions to: Piazza Instructor: Norman Ramsey, Halligan 222

# Quick Reference: Syllabus, Schedule, photos

At this point the main resources for the class are the syllabus and schedule. I am hoping to write software that will make it easier to get to the readings.

# Background

In the real world, events are uncertain. To reason about uncertain events, we compute with probability (and also statistics). Over the past 20 years, people who write probabilistic computations have increasingly designed, built, and used special-purpose, probabilistic programming languages. The popularity of this approach can be seen in the long list of such languages used in industry and in research: Anglican, AutoBayes, Alchemy, Blaise, BLOG, BUGS, Church, Csoft, FACTORIE, Figaro, Fun, Hakaru, HANSEI, HBC, IBAL, Lambda-circle, Probabilistic cc, PFP, Probabilistic Scheme, and Wolfe.

To give you the flavor of the kind of problem these languages are intended to solve, here is a sketch of a probabilistic program:

• There is a 30% chance of rain
• There is a 50% chance of irrigation
• If it rains, there is a 90% chance the grass will be wet
• If irrigation is turned on, there is an 80% chance the grass will be wet
• There is a 10% chance that the grass will be wet from an unknown cause

It is easy to write an interpreter that can run this program forward and answer the question “what is the probability that the grass is wet.” This is not interesting. The interesting thing to do is to take this program and ask, “given that I see the grass is wet, what is the probability that it rained?” (Answer: about 47%). The interesting computations involve inference, which reasons backwards from effects to causes:

• Given the words in this message, what is the probability that it is spam?

• Given the bits in this scanned image, what is the probability that it is the cover page of bkolta01’s final exam?

• Given a set of sonar echoes, what is the most likely location of my robot?

• Given a sequence of French words, what is the most likely English translation?

• Given a set of tournament results, what is the most likely ranking of the tournament players?

# Course goals

Most of the languages listed above have been designed and implemented by people whose primary expertise is machine learning, not programming languages. The primary goals of the course are:

• To understand, in a way that you can relate to your experience in computing, the problem that the designers of probabilistic languages are trying to solve.

• To see that parts of the problem yield to a combination of 19th-century mathematics and 21st-century program analysis.

• To see how, despite the satisfying mathematical underpinnings, the existing solutions to the problem are diverse and are, at least to a degree, unsatisfying.

We will also learn something about how probabilistic programs can be analyzed.

# Course details

The pace of the course will be relatively slow. The course will be designed to help students achieve one or two goals in depth, not to give a breathless survey of the entire field.

Class will meet twice a week and will proceed in two phases:

1. Introduction to probabilistic languages from the programming-language perspective. This phase will include review of relevant material from COMP 105 (Programming Languages) and of the fundamentals of probability. It will then proceed to reading and discussion of primary sources about probabilistic programming languages.

2. In the second phase, we will emphasize relatively technical papers from the programming-language literature, occasionally leavened by relatively applied papers from the machine-learning literature.

Each student is expected to undertake one project for the semester. The project may be undertaken alone or in teams of any size. Any project related to probabilistic programming language is acceptable. Ideas might include

• Building a proof-of-concept application around a probabilistic program

• Building a proof-of-concept implementation of a probabilistic language

• Developing a type system or program analysis that can be applied to probabilistic programs

• Developing formal semantics for probabilistic languages— possibilities might include an extension of an existing semantics, a new semantics for an existing language, or a new language design with formal semantics

Grades will be based on class participation and on a contribution to an engaging project. There will be no examinations. To earn credit for class participation, students will be expected to engage in class discussion about readings and problems. Students will also solve small programming problems or probability problems outside of class and to come to class prepared to share solutions. Such problems will count toward class participation but will not otherwise be graded.

Prerequisite: COMP 105 (Programming Languages).

# Languages and systems to consider for the project

Here are a number of languages and systems that are interesting potential substrates for a project. The systems that are most mature and most friendly to beginners are listed first:

• WebPPL is the implementation language of the electronic book The Design and Implementation of Probabilistic Programming Languages.

• Church is Scheme-like language with probabilistic semantics. I have my issues with the computational model, but it is the foundation of many other projects. An extensive tutorial can be found in the electronic book Probabilistic Models of Cognition.

• Anglican is inspired by Church and integrated with Clojure. It compiles to relatively efficient code for the Java Virtual Machine. There is a highly polished tutorial.

• Hakaru is a probabilistic language embedded in Haskell. Its distinctive feature is that it can do exact inference by symbolic analysis. We will read about this inference in class.

• Probabilistic C is intended as a compilation target for higher-level probabilistic languages. It is surprising how much can be accomplished by adding just a couple of primitives to C—and by exploiting parallelism.

• Wolfe is a probabilistic language embedded in Scala. It has a particularly clean way of talking about probabilistic domains.

A longer list, with other commentary, can be found at `probabilistic-programming.org`.