The Second Annual ICFP Programming Contest Problem

The Premise

Frobozz Role Playing, Ltd, has obtained funding to promote the concept of wireless interactive fiction. Traditional interactive fiction emphasizes explorations of created worlds, but this new venture will be more like the ``multi-user dungeons'' popularized by such programs as TinyMUD, which emphasize interaction over exploration. Frobozz expects to distinguish itself in two ways: thanks to wireless technology, the game will be available anywhere, any time---and even if a player is disconnected, there will always be computer-controlled ``non-player characters'' to interact with.

Creating computer-controlled characters is still a black art, and the ``mimesis team'' has developed a new, ultra-high-level language to specify their behaviors. The high-level compiler translates these behaviors into state machines. Unfortunately, the logic used to decide on state transitions is looking very complicated, and management is concerned that the state machines may exceed both code-size and power budgets for the hand-held devices---and nobody is going to buy unless price is low and battery life is high. Your task, for the next 72 hours, is to write an optimizer that will improve the state machines emitted by the compiler. Speed is nice, but management is more concerned about code size---if the system is 10% slower than projected, users might not notice, but if the system is 10% larger than projected, that means more memory, and that eats into profits.

A low-level language for non-player characters

The compiler emits a state machine for each character. Each state has an integer label, and a (possibly complicated) rule telling it how to make transitions. This rule may test arbitrarily many ``state variables'' of the entire system. Eventually, after a sequence of tests, the rule decides on a new state to enter, and an utterance to be issued to the screen. The new state is either an integer state number or an underscore, in which case the new state is the same as the one which the rule matched. The utterance is given by a string literal.

The current state may be shared by multiple characters, but each individual character's decision is executed atomically. This means that the initial state is unspecified, and no states are unreachable.

Because machine language is the wrong level of abstraction for this problem, the character compiler emits an intermediate language, which in its semantics is very much like C (or perhaps like C--). Its concrete syntax is, however, S-expressions:

  character    ::= ({rule})
  rule         ::= ({state-number} stmt)
  stmt         ::= (IF condition stmt ({elseif}) stmt)
                |  (DECISION new-state utterance)
                |  (CASE variable ({arm}) stmt)
  elseif       ::= (ELSEIF condition stmt)
  arm          ::= (ARM value-set stmt)
  condition    ::= (EQUALS variable int)
                |  (AND {condition})
                |  (OR  {condition})
  variable     ::= (VAR string)
  state-number ::= int
  new-state    ::= int
                |  wildcard
  wildcard     ::= _ 
  utterance    ::= string
  value-set    ::= ({int})

A phrase enclosed in curly braces ({...}) is to be repeated zero or more times. An int is an integer literal and a string is a string literal. To make your job a bit easier, string literals will be very simple: lists of characters between double quotes, with no worrying about escapes. The characters \ and " will not appear in string literals.

In addition to conforming to the syntax above, valid characters also satisfy the following properties:

In each character, each state-number labels exactly one rule.
In each CASE statement, each int appears in the value-set of at most one arm.

These two rules ensure that the character's behavior is deterministic.

Late-breaking news: You may have access to the current state through the state variable called state. The Judges are embarrassed to confess that this means the character might have been more easily representible by a simple CASE statement. Ah, well.

To see a fairly serious example, check out ``The Busker,'' which is based loosely on the eponymous character from the interactive fiction title Christminster, by Gareth Rees.

Meaning and cost estimates

In order to optimize, you must know both what costs you are trying to minimize, and what program transformations preserve meaning. The rules below define both meaning and run-time costs of low-level programs, by showing how those programs might be rewritten in an interpreter, and at what cost. Because Frobozz has not yet chosen a target processor for the project (Palm VII anyone?), we have decided on costs by reasoning about translations to likely target machines. We write e => v @ n for ``expression e reduces to value v at cost n, and s => d @ n for ``statement s produces decision d at cost n.''

(IF true  t alts def) => t @ 0 
(IF false t ((ELSEIF e' t') alts) def) => (IF e' t' alts def) @ 0 
(IF false t () def) => def @ 0 
(DECISION _ u) => (_ u) @ 3
(DECISION state-number u) => (state-number u) @ 4
(CASE v arms def) => s @ 11.5, 
     where (ARM (vs) s) is in arms, 
     the value of v is k, and k is in vs
(CASE v arms def) => def @ 11.5, 
     where there is no (ARM (vs) s) in arms such that
     the value of v is k and k is in vs

(EQUALS v k) => true  @ 6.5, when the value of v is k
(EQUALS v k) => false @ 6.5, when the value of v is not k

(AND false es) => false @ 0 
(AND true  es) => (AND es) @ 0 
(AND) => true @ 0 

(OR true  es) => true @ 0 
(OR false es) => (OR es) @ 0 
(OR) => false @ 0

The dynamic costs of AND and OR might be surprising, but that is because they are accounted for in the EQUALS tests.

Our estimate of code size is given by the following rules. As with dynamic costs, we have imagined a translation, assuming each instruction has cost 1.

size (IF cond then alts def) = 
    size cond + size then + size alts + size def
size (ELSEIF cond stmt) = size cond + size stmt
size (DECISION _ u) = 3
size (DECISION state-number u) = 4
size (CASE v arms def) = 10 + size def + span arms + size arms
  where span arms is the number of integers in the
  smallest interval containing every integer label in arms
size (ARM ints stmt) = size stmt
size (EQUALS v k) = 6
size (AND tests) = size tests
size (OR tests) = size tests

The size of a list of items (as in size arms or size tests) is simply the sum of the sizes of the elements.

The problem you must solve

Write an optimizer that reads a state machine from standard input, optimizes the state machine, and emits the optimized machine to standard output.

The Unix utility cat would be a valid submission, although we hope it doesn't do well in the contest.
The state machines that will offer the most opportunity for optimization are likely to be machines that do not make effective use of the CASE statement.
Optimizers will be run under a time limit. The time limit, in integer seconds, will be will be available in the environment variable TLIMIT, and it will also be the first argument passed to your ./runme script on the command line. This time limit is real, wall-clock time, not ``user time'' or ``CPU time.''
We anticipate allowing up to 30 minutes for most state machines, which should be more than enough time. For selected, difficult state machines, used in the late stages of the judging, the Judges may allot longer times, depending on how many contestants remain at that point. Optimizers exceeding their time allotment will be killed.

Evaluation

We will run the optimizers on several state machines, and we will evaluate them according to three criteria:

Answer-- We will run both unoptimized and optimized state machines in several states. Each optimizer will score 1 point every time the optimized version produces the same decision as the original version. Only optimizers with maximal scores will proceed to the next criterion. (In other words, if your optimizer changes the behavior of the state machine, and we can catch it, you lose.)
Code size-- The optimizer that emits the smallest state machines wins---provided of course those machines get the right answers.
Speed-- It may well be that several competing optimizers produce equally small, equally correct state machines. In that case, the judges will break ties in two ways:
- Estimated running times of the optimized state machines on the test inputs. Faster state machines win.
- Running times of the optimizers themselves. If two optimizers produce state machines that get the same number of right answers, are the same size, and have the same estimated running times, but one optimizer is significantly faster than another, then the faster optimizer wins. Differences of 10% or less in the running times of the optimizers are unlikely to seem significant in the eyes of the judges. Factors of two are eye-opening.

Submission instructions

The contest will be run on Linux. Contest entries are to be submitted using our Web submission form. A valid entry must be a gzip'd tar file containing:

An executable program ./runme that reads a state machine from standard input and produces an optimized state machine on standard output. This program is likely to be a Linux binary or a shell script.
A ./README file containing a brief description of the submission, algorithm, programming language(s) used, and anything else that you want to bring to the attention of the judges. At minimum, the README file must contain
- Indication of language(s) used. To avoid debate about just what constitutes a functional language, we will accept programs written in any language.
- The names and e-mail addresses of all team members
The contents of the README file are likely to have some bearing on the Judges' Prize, so it may be worth including more information.
A ./src directory containing the source code and build files for your optimizer. The judges do not plan to build your optimizer from these sources---they are there for informational purposes only. Make sure your submission includes a working binary.
An optional ./support directory containing libraries, binaries, configuration files, and anything else that ./runme might need.

The contest platform is a 450MHz Pentium III with 256MB RAM running Red Hat Linux version 6.0. The contest machine will not necessarily have compilers, interpreters, or shared libraries for your implementation language. For safety's sake, submit a statically linked binary if you can. If your favorite language cannot emit executable binaries, then let ./runme be an executable shell script that runs an interpreter binary located in ./support.

The submission form asks for a team name, which uniquely identifies your team and its entry, a password for the entry (so you can submit revisions), and a contact e-mail address. You can overwrite previous entries by submitting an entry of the same name, using the same password.

When submitting an entry you may also supply an MD5 checksum for your gzip'd tar file. If we do not compute the same MD5 checksum that you supply, you will receive a warning, and your entry will not be recorded.

Rules and deadlines

All entries must be received by 5PM EDT (21:00 UTC) on Sunday, September 5, 1999. Late entries will not be accepted.

Multiple entries from the same team will be permitted only if the entries are substantially different in the eyes of the judges. In other words, you can't flood the contest with a bunch of variations on a theme, and let us pick the best one for you. Submit the single variation that you determine is likely to do well. If in doubt about whether two possible entries are substantially different, consult the judges.

In recognition that many functional programmers have family obligations that might prevent them from spending 72 hours on a contest, this year's contest will have a special ``lightning division,'' which will consist of entries submitted in the first 24 hours, that is, entries submitted by 5PM EDT (21:00 UTC) on Friday, September 3, 1999. There are no separate prize categories for lightning entries; they will compete with all other entries on an equal basis. Lightning entries may, however, be singled out for special mention at the awards ceremony, and speedy programming may well impress the judges, having some influence on the bestowal of the Judges' Prize.

A single entrant or team may submit both a lightning entry and a regular entry, and both will be considered for prizes, even if the entries are not deemed substantially different.

Questions

Have questions about the contest? Consult our Frequently Asked Questions page, or mail questions to icfp99@cs.virginia.edu.

Back to main contest page

Norman Ramsey | Kevin Scott | icfp99@cs.virginia.edu