Comp11
Grammar for C++

Overview

In class today we developed a grammar for the parts of the C++ programming language that we've talked about so far. The grammar gives us a clear way to describe what is allowed in the language (and what is not allowed).

The way the grammar works is that it breaks a program down into syntactic categories, which describe what composes different parts of a program. We can use the same idea to describe a grammar for English: for example, a sentence is a syntactic category composed of other (smaller) categories, such as nouns, verbs, and adjectives. The grammar describes what is in these categories, and how to construct the larger structures from the smaller ones.

Another way you can use the grammar is to "diagram" a program -- just like diagramming a sentence in elementary school. You should be able to circle each part of your program and identify its syntactic category -- is it a statement? an expression? a declaration? a definition? -- and explain why it belongs where it is in the program.

The notation for grammar rules uses several special operators (colored in blue):

:= defines a rule and can be read as "is made up of".

| (vertical bar) separates choices in a rule.

[ ] (square brackets) enclose parts of a rule that are optional.

{ } (curly braces) enclose parts of a rule that may be repeated zero or more times.

Note that this notation is not part of C++; it is a notation for defining grammars (of any kind) based on Backus-Naur form. (The square brackets and curly braces are shorthand borrowed from its extended variants.) In the grammar below, the parts in brown are grammatical categories (like noun or verb in English) and the parts in black are called literals -- things you can actually write (like dog or run in English).

Program

The top-most grammatical rule defines what composes a program, at the highest level. Our programs consist of a #include, followed by zero or more struct definitions, zero or more function declarations, and zero or more function definitions. Here is how we write that in the grammar:

program := #include "comp11io.h"
           { structdef }
           { functiondecl }
           { functiondef }

Structure definitions

New: struct definitions introduce new types, so we generally place them at the top of the C++ source file. The "body" of the struct consists of a set of fields or members each of which has a type and a name. The name that follows the struct keyword will be the name of the new type.

structdef := struct name {
    { type name ; }
} ;
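
As a rough illustration of this rule, here is a hypothetical struct definition. The names Point, x, and y are invented for this example and are not from class. Each field line has the shape "type name ;", and the definition ends with the semicolon that C++ requires after the closing brace.

```cpp
// Hypothetical example: Point, x, and y are made-up names.
struct Point {      // struct name {
    double x;       //     type name ;
    double y;       //     type name ;
};                  // closing brace, then the semicolon C++ requires
```

Once this definition has been seen, Point can be used wherever a type is expected, for instance in a variable declaration such as "Point p;".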

Function declarations and definitions

Both function declarations and function definitions have a head, so we can make a separate rule for that and use it twice. Notice the parameters rule: it says "the parameters are made up of one parameter, followed by zero or more occurrences of a comma followed by another parameter." (A name is any word starting with a letter followed by some number of letters and digits.)
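
The informal definition of a name can be checked mechanically. The following is a minimal sketch, assuming that "some number of" means zero or more. The function isName is a hypothetical helper written for this handout, not part of comp11io.h or the C++ library. Real C++ identifiers are slightly more permissive (underscores are allowed) and exclude keywords such as int or struct. The sketch ignores both points.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true if word is a letter followed by
// zero or more letters and digits (the handout's rule for a name).
bool isName(const std::string& word) {
    // A name must have at least one character, and it must be a letter.
    if (word.empty() || !std::isalpha(static_cast<unsigned char>(word[0]))) {
        return false;
    }
    // Every remaining character must be a letter or a digit.
    for (std::string::size_type i = 1; i < word.size(); i++) {
        if (!std::isalnum(static_cast<unsigned char>(word[i]))) {
            return false;
        }
    }
    return true;
}
```

Under this rule count2 is a name, while 2count (starts with a digit) and "my name" (contains a space) are not.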

head := type name( [ parameters ] )
parameters := parameter { , parameter }
parameter := type name
type := int
      | double
      | bool
      | char
      | string
      | name          // Note: must be a struct type name

New: Notice that since we can define new types using structdef, the rule for types now includes the option for a name, which would have to be the name of a struct type.
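
To see the head, parameters, and type rules working together, here is a small sketch. The names Rect, area, and answer are invented for illustration. It assumes the usual C++ convention that a declaration is a head followed by a semicolon, while a definition is the same head followed by a body in curly braces.

```cpp
// Hypothetical example: Rect, area, and answer are made-up names.
struct Rect {
    int width;
    int height;
};

// Function declaration. The head matches: type name( [ parameters ] )
// with type = int, name = area, and one parameter "Rect r".
// The parameter's type is a name, which must be a struct type name.
int area(Rect r);

// Function definition: the same head, now followed by a body.
int area(Rect r) {
    return r.width * r.height;
}

// The parameters are optional, so an empty pair of parentheses
// is also a valid head.
int answer() {
    return 42;
}
```

Reading the head "int area(Rect r)" against the rules: int is the type, area is the name, and "Rect r" is a single parameter of the form "type name", where the type is the struct name Rect.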

Compare some examples of what is allowed and not allowed by these rules: