COMP 40 Assignment: Understanding and Using Interfaces

Design for part B only due Monday, September 12 at 11:59PM (parts A and C do not require a written design).
Full assignment (parts A, B, and C) due Thursday, September 15 at 11:59PM.

Please read the entire assignment before starting work.

Purpose

This assignment has four goals:
  1. To help you make the transition from C++ programming to the kind of C programming we expect in 40:

  2. To give you practice in identifying interfaces that can help you solve problems

  3. To start you thinking about the interface as a unit of design

  4. To introduce you to multiple representations of numbers

Pair programming

As will be discussed in class, pair programming is an effective way to more than double your programming productivity. Pair programming is required for at least two assignments, and it is strongly recommended for all assignments. Pair programming will help most with your design. For most of the term, you will choose your own pairs, but for this assignment, pairs will be assigned before the first lab.

Preliminaries

If you spend an extension token on any part of the assignment, it automatically applies to all parts of the assignment.

Part A: Brightness of a grayscale image

Please write a C program brightness, which should print the average brightness of a grayscale image. Every pixel in a grayscale image has a brightness between 0 and 1, where 0 is black and 1 is as bright as possible. Print the average brightness using decimal notation with exactly one digit before the decimal point and three digits after the decimal point. Print only the brightness.

The program takes at most one argument:

Examples:

My solution to this problem takes less than 35 lines of C code.

Help with image files

We provide code to help you read image files; you will find the Pnmrdr interface in /comp/40/include and the implementation in /comp/40/lib64. Creating a Pnmrdr_T will read the graymap header for you, and from the header you can compute how many pixels are in the image. (You should read exactly as many pixels as are there—no more, no fewer.) Don't forget that the brightness of each pixel is represented as a scaled integer, as described in the Pnmrdr interface.

Getting images

You can get images to play with by using one or more of the following programs:

Problem analysis and advice

The main issues here are: There is also an important issue of style:

Part B: Finding fingerprint groups

In this problem you are to write a program fgroups (short for "fingerprint groups"), which when given a set of names with fingerprints will identify groups of names that share a fingerprint. The real object of the exercise is to familiarize you with the CII library and with C I/O.

Your input is always on standard input. Input is a sequence of lines where each line has the following format:

Finally, you may also assume that each name appears at most once.

You may assume that a fingerprint is at most 512 characters long (2048 bits represented in hexadecimal notation), but there is no a priori upper bound on the length of a name.

What to do with good input

By the nature of the input, every fingerprint you read will be associated with at least one name. Your program should print all the fingerprint groups in the following format: Groups may be printed in any order.

To print a fingerprint group, print each name in the group. Put a newline after each name, as if by

   printf("%s\n", name);
Names within the group may be printed in any order.

For example, if the input is

A    Ron
A    Poppy
B    Bill
A    Dubya
B    Barry
Then one possible output is
Ron
Dubya
Poppy

Bill
Barry
(This example output contains exactly one blank line.)

My solution to this problem takes about 150 lines of C code. About 25 lines deal with input; about 25 lines are devoted to printing output; and about 50 lines are memory management and I/O. The actual algorithm is under 20 lines. The rest is comments, whitespace, #include directives, and the like.

What to do with bad input

Performance target

Your fgroups program should perform well on inputs of up to several hundred thousand lines. On the lab machines, it should be capable of processing a half a million lines per minute.

Problem analysis and advice

This problem boils down to simple string processing and standard data structures. Repeat: the data structures are already built for you; your job is to figure out which ones will be useful.

Getting started:

Design document for fgroups—due Monday, September 12

The heart of a software design lies in its representation of data. For this assignment, we are asking you to identify what data structures you will need to compute fingerprint groups, and what each data structure will contain. Your design document is due Monday, September 14 at 11:59PM.

Part C: Solving problems with fingerprint groups

The example above is not entirely facetious, but in this problem we ask you list problems you could solve with a working version of fgroups. Be imaginative. For one set of ideas, look up the man page for find, especially the -exec option. And it might help you to know that in the arcane jargon of computer science, a fingerprint is sometimes called a "message digest". This information is especially useful if you also know how to use the apropos command.

Please include your answer to part C in your README file.

General advice for new C programmers

Don't forget to initialize each C pointer variable. In C++, a new pointer variable may be initialized implicitly by a constructor. In C, you should use NEW to initialize each new pointer variable. (You might wish to take three minutes to view, or review, Pointer Fun with Binky, and compare the code you see there with Hanson's NEW macro.)

Use the programming idioms on the class web page!

Organizing and submitting your solutions

Avoid common mistakes

In this assignment, the most common mistakes involve producing output in a format other than what the specification calls for. You will learn to read specifications very carefully. Consider these questions:

Here are some other common mistakes: