# Pointers

### Review

• What is a variable?
• A variable has a value (contents) and an address.
• Simple types are passed to a function by value.
• Arrays are passed by reference.
• `struct`s are passed by value.
• Passing a large `struct` can be expensive (in space and time). Is there a way to pass a structure to a function more efficiently?
• Sometimes we want a function to produce more than one thing. It can return a struct, but, if we could pass other variables by reference, then the function could modify its caller's variables. Can we do that?

Today: Pointers

• What are they?
• How can I get one?
• How can I store one?
• How can I use one?

### Pointers in a nutshell

A pointer is a memory address.

Addresses often look like `0x7fff3889b4b4`, `0x60132`, or `0x602010` (some examples taken from a program I just ran). These addresses are expressed as numbers in hexadecimal, which is base 16 (that's what the `0x` at the beginning means). You don't have to know what the numbers are, but you do need to know that:

• Every location in memory, and therefore every variable, has an address.
• Every address corresponds to a unique location in memory.
• The computer knows the address of every variable in your program.
• Given a memory address, the computer can find out what value is stored at that location.
• While addresses are just numbers, C++ treats them as a separate type. This allows the compiler to catch cases where you accidentally assign a pointer to a numeric variable and vice versa (which is almost always an error).
• Compile and run `address_contents.cpp`. (`&` is the address-of operator in C++.)
• What does `&main` produce? Explain.

### Example Problem: Sorting things that are too big to move

Suppose you had a bunch of large objects: for example, stones of various sizes weighing between 1 and 5 tons each, apartments in a building, or buildings on a street.

You want to order the stones by size to be used in the construction of a bridge/pyramid/something. The apartments should be sorted by the length of time they've been vacant, or the time since they were last painted. The buildings should be sorted by excise tax value, square footage, or number of residents.

How would you do that? Would you rearrange the buildings on the street? No.

You'd give each item a name, a reference, and you'd sort those.

Exercise:
Draw a picture of buildings on a street and write a bunch of random values for their square footage.

Choose a representation for references to the buildings. Make an array and sort them. How do you use the references during the sorting procedure?

### Indirection in arrays

Imagine an array of large structures. Maybe a struct in this case represents all the information about a student at Tufts. Can you use the technique above?

Draw out the array and a parallel array of …

### Pointers in C++

C and C++ allow you to manipulate pointers explicitly.
• The `&` operator, called the address-of operator, can be applied to a variable to find out its address.
• `&x` is the address of the variable `x`.
• `&a[i]` is the address of the variable that is the element of array `a` at index `i`.
• `&fred.age` is the address of the `age` field in the `struct` stored in the variable `fred`.

• You can store the address of a variable in another variable. You declare a variable that can hold a pointer to an integer like this:
```
int *np;
```
though it's good practice to do it like this:
```
int *np = NULL;
```

We'll explain the `*` notation below.

`np` is a pointer variable: it can hold the address of another variable!

```
int n = 16;
...
np = &n;
```
Now `np` contains the address of `n`. We say that `np` points to `n`.

• The `*` operator, called the dereference operator, gives you the value at the address it's applied to. If `i` is an integer variable whose value is `14`, then `&i` is the address of `i`, and `*(&i)` is `14`.

Discussion and questions.

Exercise:
Read and understand `pointer_example1.cpp`. Run it!

Warning:
There is considerable confusion out there about the `*` symbol. It is not part of the type, though people often pronounce it, and even worse, write it, that way. A declaration consists of a type followed by sample expressions that use the variables being declared; each such expression would produce a value of that type. So `int *p` says that the expression `*p` is an `int`.
```
int  *p, x;  // declares a pointer to an int and an int
int*  q, y;  // SAME!!  EVIL EVIL EVIL! NEVER WRITE THIS!!
```
Some will call the second example above a difference in style — it isn't. It's an abomination! I have seen experienced programmers waste hours because of it. The `*` goes with the variable, not the type, so write it there! (My theory is that it's a habit picked up when the person didn't understand what was going on, and when/if they did figure it out, they continued it out of a perverse notion of style and as a way to haze newcomers.)

### Warnings/Common Bugs

• Declaring a pointer variable allocates space to hold a pointer, i.e., an address. It does not allocate space to hold the data pointed to, nor does it initialize the pointer to anything in particular. For example:
```
char *pc;
```
allocates space to hold a pointer to a character, but does not allocate space for any characters. Thus,
```
*pc = 'q';
```
is an error unless `pc` has been given the address of an actual character variable.

For this reason, it is a very good practice to initialize pointer values to the null pointer like this:

```
char *cp = NULL;
```
`NULL` is defined in `cstddef` and several other standard headers; in practice, including `iostream` usually makes it available too. It represents a value (which can be written as `0`, though its representation in memory need not actually be 0) that is not a valid pointer. Attempts to dereference the null pointer will usually crash your program, which is better than silently modifying random memory locations.

Exercise:
Explore `pointer_example4.cpp`. Draw a picture of what's going on.
Explore/debug `pointer_example2.cpp`. A picture might be helpful here, too.

• Aliasing. When two expressions refer to the same location, we say they are aliases. When you make a pointer to a variable, dereferencing the pointer is the same as accessing the original variable, and the value in the variable can be changed either way. When you have aliases, you can see a variable's value change even though that variable has not been used at all (it was changed via an alias).
• You can create a pointer to a local variable, and you can return pointers, but you must never return a pointer to a local variable. Local variables are destroyed when a function returns, so a pointer to one is meaningless afterward, and using it can have unpredictable effects on the behavior of your program. We'll see below how to return a pointer that does have meaning.

### A fun video

Play this AVI video. It uses evil declarations, but it's otherwise fun.

### Summary

• A pointer variable is a variable that stores the address of another variable.
• Pointees are the variables to which pointers point.
• Assignment allows multiple pointers to point to the same variable (can make aliases).

### How to pass a structure or object cheaply (by reference)

• Declare the function so it takes a pointer to the structure:
```
void print_course(course *pc);
```
• Call the function by passing a pointer to the structure, rather than the structure itself:
```
course my_course;
...
// code that fills in my_course
...
print_course(&my_course);
```
• To refer to a field named `courseNum` in a structure to which you have a pointer named `pc`:
• `(*pc).courseNum`
• `pc->courseNum`
These mean the same thing. Use the second one.

Now you know exactly what C++ does with arrays!

Of course, the same holds for classes, because classes are structs. So, you can pass an instance of a class by reference just as we passed an instance of a struct by reference above. Change the word `struct` to `class`, and the example is exactly the same!

Whenever you are executing a member function of a class, a pointer to the instance on which the function was called is always available. Its name is `this`. For example, inside `Kangaroo::put_in_pouch()`, there is a variable `this` that is a pointer to a `Kangaroo`, as if it had been declared
```
Kangaroo *this;
```
In fact, `this` is really an unwritten parameter of every member function. When `put_in_pouch()` refers to `pouch_contents`, it's really referring to `this->pouch_contents`, and you can refer to it using that same notation if you like. This leads to a common idiom in constructors in which parameters that are used to initialize member variables have the same names as the variables:
```
MyClass::MyClass(int a, double b, Kitten k)
{
    this->a = a;
    this->b = b;
    this->k = k;
}
```
Of course, you would never pick such awful parameter and data member names!

### Simple call-by-reference variables

Call by reference can be useful for ordinary variables, too. Usually, one does this when a function would like to return more than one value. In C/C++, one does this by returning a struct, which can seem like a lot of trouble for a small special case, or by using call-by-reference parameters. C++ has two mechanisms for doing this. Here's how you do it with pointers.

The code above uses a `read_string()` function to get a string, but it also returns true if it succeeded and false otherwise. We could have defined a struct that could hold a boolean and a string, but, in this case, we returned the success/failure indication using normal function return, and returned the string we read (if we read one) by placing it directly in the caller's variable.

A function cannot access another function's variables directly by name. However, a function may pass the address of one of its variables as a pointer parameter. Inside `read_string()`, `s` is a pointer to the caller's string variable, in this case it's the address of `main()`'s variable `word`. Thus, `*s` in `read_string()` is the same variable (the same location in memory) as `word` in `main()`. Therefore, assigning to `*s` is the same as an assignment to `word`.

### Dynamically Allocated Variables/Arrays

Problem: How can we (portably) write a program that makes an array of a size that is determined at run time? How can we make that array survive beyond the completion of the function that created it? Maybe the size is read in from the keyboard or gotten from a web form; maybe you need to do an involved computation to figure out how much space you need, for example, in a weather simulation or in the graphics for a video game.

First, creating and destroying variables dynamically:

• `new` creates a newly allocated variable and gives you a pointer to the variable. Such a variable is a dynamic variable. The variable has no name, so it's often called an anonymous variable.
```
double *dp = new double;
```

• Dynamic variables live until you destroy them. (Compare to static and automatic variables.)
• Dynamic variables can be destroyed using `delete` when you don't need the space any more.
```
delete dp;
dp = NULL;
```
You should destroy variables you don't need any more, and you should set such pointers to `NULL`, because using the pointer after you delete is an error that may cause random things to happen. Beware of aliases of the variable!
See `new1.cpp`.

Creating and destroying arrays dynamically:

• Here's how you make an array using `new`:
```
double *darray = new double[compute_array_size()];
```
`compute_array_size()` is just there to indicate that you can put any integer expression you like in the square brackets: a constant, an integer variable, the result of a calculation, a function call. Really, any integer expression.
• You destroy an array like this:
```
delete [] darray;
```
Note the square brackets. If you leave them out, the C++ standard says anything can happen. So don't leave them out!

Examples:

### Putting it all together for dynamic arrays

Suppose you have a 1-drawer file cabinet in your room for all your course notes, financial documents, etc. Eventually, you take lots of courses and it fills up. What do you do?

We're going to make a growable array abstraction that works the same way. It will start out small and then grow as needed.

• A flexible approach to growable arrays that anticipate future growth. Abstractly, what properties does an integer vector (growable array of integers) need to have?

Think, think, think …

• Are you ready? Put up an arm and acknowledge the judges. Take a deep breath. We're starting the big tumbling run!

In lab, you did a selection sort exercise. To refresh your memory, here is the first version you did (sorting based on the `year` field in a `Person` struct): `nam_selsort.cpp`.

Exercise:
• Convert this program to one that uses a `PersonVector` class.
• Write the `PersonVector` class.
What do you observe about this class in relation to the `IntVector` class? How could things be more modular?

### Debugging

• Use lots of output.

For debugging output, use `cerr` rather than `cout`, and begin such output with something obvious like `DEBUG:`.
`cerr` works exactly the same as `cout`, but you can redirect it separately. When you're done debugging, you can comment these lines out (and they'll be easy to find, because they use `cerr` and the string `DEBUG`).

• Use `valgrind`.
```
valgrind ./pointer_example2
```
would be an interesting thing to run.
Mark A. Sheldon (msheldon@cs.tufts.edu)