Object-Oriented Programming in Smalltalk

COMP 105 Assignment

Parts 1 and 2 due Sunday, November 22, 2020 at 11:59PM
Part 3 due Tuesday, December 1 at 11:59PM
Part 3 due Wednesday, December 2 at 11:59PM

Overview

Object-oriented programming has been popular since the 1990s, and like lambdas, object-oriented features are found everywhere. But these features are not always easy to tease out: many object-oriented languages, such as Java and C++, are hybrids, which mix objects with abstract data types or other notions of encapsulation and modularity. When you don’t already know how to program with objects, hybrid designs are more confusing than helpful. For that reason, we study pure objects, as popularized by Smalltalk: even simple algorithms send lots of messages back and forth among a cluster of cooperating, communicating objects. Popular languages that use similar models include Ruby, JavaScript, Objective C, and Swift.

The assignment is divided into three parts.

This assignment is time-consuming. Many students have experience in languages called “object-oriented,” but few students have experience with the extensive inheritance and pervasive dynamic dispatch that characterize idiomatic Smalltalk programs.

Setup

The μSmalltalk interpreter is in /comp/105/bin/usmalltalk. Many useful μSmalltalk sources are included the book’s git repository, which you can clone by

git clone homework.cs.tufts.edu:/comp/105/build-prove-compare

Sources that can be found in the examples/usmalltalk directory include copies of predefined classes, collection classes, shape classes, and other examples from the textbook—including templates for both large integers and natural numbers.

Getting to know μSmalltalk interactively

Smalltalk is a little language with a big initial basis; there are lots of predefined classes and methods. To help you work with the big basis, as well as to debug your own code, we recommend sending the messages printProtocol and printLocalProtocol. These messages, which are shown in Figure 10.10 on page 648, are understood by every uSmalltalk class. They provide a quick way to remind yourself what messages an object understands and how the message names are spelled.

One indirect way to learn a protocol is to send an object every message in its protocol, just to see what happens. You might get a stack trace, but that will help you learn.

Assembling documentation for the assignment

The information you need is spread out over multiple sources:

Reading comprehension (10 percent)

These problems will help guide you through the reading. We recommend that you complete them before starting the other problems below. You can download the questions.

  1. Receivers, arguments, and message names. Read the first eleven pages of chapter 10, through the end of section 10.1.4. Now examine these expressions from the definition of class Picture, in figure 10.4 on page 627:

    (shapes add: aShape)
    (shape drawOn: aCanvas)
    (shapes do: [block (shape) (shape drawOn: aCanvas)])

    In each expression, please identify the receiver, the argument, and the message name:

    In (shapes add: aShape),

    • The receiver is …
    • The argument is …
    • The message name is …

    In (shape drawOn: aCanvas),

    • The receiver is …
    • The argument is …
    • The message name is …

    In (shapes do: [block (shape) (shape drawOn: aCanvas)]),

    • The receiver is …
    • The argument is …
    • The message name is …

    You can now start to read code examples.

  2. Colons in message names. Now move on to class TikzCanvas, which is described in Figures 10.5 and 10.6 on page 628. In both the protocol and the implementation, method startDrawing has no colon in the name, method drawPolygon: has one colon in the name, and method drawEllipseAt:width:height: has three colons in the name.

    • What, if anything, does the number of colons have to do with receivers?

      Your answer: …

    • What, if anything, does the number of colons have to do with arguments?

      Your answer: …

    If you need to, review the presentation in section 10.1.1 on “Objects and Messages,” which shows messages sent to shapes.

    You now know what a message’s name tells you about how to use it.

  3. Class protocols and instance protocols. Every message is part of some protocol. As example messages, study the transcript in code chunks 622b and 622c, which puts three shapes into a picture and then draws the picture.

    1. Of the message names used in the transcript, which ones are part of the class protocol for Picture?

      Which message names are part of the instance protocol for Picture?

      Which message names are part of the class protocol for TikzCanvas?

      Which message names are part of the instance protocol for TikzCanvas?

    2. In general, what do you do with messages in a class protocol, and how does that differ from what you do with messages in an instance protocol?

    You now know what you can do with each kind of message described in the chapter.

  4. Dynamic dispatch, part I: A toy class. For the mechanisms of message send and dynamic dispatch, read section 10.3.4, which starts on page 641. Using the class definitions in that section, message m1 is sent to an object of class C. What method definitions are dispatched to, in what order?

    Please edit this answer to put in the correct methods and classes:

    • Dispatch to method m1 on class ?
    • Dispatch to method ? on class ? …

    You are starting to understand the key mechanism behind all of object-oriented programming.

  5. Dynamic dispatch, part II: Conditionals? What conditionals? Read section 10.5 starting on page 662 in its entirety. (It is about one page.)

    Now study this transcript:

    -> ('hello isNil)
    <False>
    -> (nil isNil)
    <True>
    -> ('hello class)
    <class Symbol>
    -> (('hello class) superclass)
    <class Object>
    -> (nil class)
    <class UndefinedObject>
    -> ((nil class) superclass)
    <class Object>

    Answer these two questions:

    1. Explain how it is possible that the isNil method can be implemented without an equality test and without any explicit conditional test (like an if). Your explanation should include a statement of what messages are sent to what objects and what method definition each message dispatches to.

    2. Explain why it is desirable to implement object-oriented code without explicit conditional tests, and indeed, without ever asking an object, “how were you formed?”

    You are starting to learn how to work effectively in a world where we don’t interrogate objects to see how they were formed or what class they are.

  6. Dynamic dispatch, part III: Number classes. Familiarize yourself with the protocols for magnitudes and numbers, as described in section 10.4.6, which starts on page 658. Now turn to section 10.7, which starts on page 670, and examine the key methods of class Number, as well as the implementation of class Fraction are in (section 10.7.2 starting on page 673). Part of class Fraction is relegated to the Supplement, but the only thing you need from there is this implementation of method negated on class Fraction:

    (method negated () ((Fraction new) setNum:den: (num negated) den))

    Now, demonstrate your understanding of dynamic dispatch by explaining what messages are sent when fractions are subtracted.
    For example, what messages are sent when we evaluate

    ((1 / 2) - (1 / 3)) ?

    The computation dispatches messages to instance methods of classes Fraction, Number, and SmallInteger, as well as a class method of class Fraction. We are interested in only some of those dispatches—ones that meet both of these criteria:

    • The message is sent from a method defined on class Fraction or class Number.

    • The message is received by an instance of class Fraction or class Number.

    These criteria rule out class methods of class Fraction, messages sent to SmallInteger, and so on.

    Starting with what happens when message - (minus) is sent to an instance of Fraction, please identify only the interesting dispatches:

    Message   Sent from method     Sent to object      Method defined
              defined on class     of class            on class
    
    -          (anywhere)          Fraction            Number
    
    ?         Number               ?                   ?
    
                     ... fill in the ? marks ...
    
              ... and complete the rest of this table ...

    You are starting to learn how a class can have some of its work done for it by its superclass.

  7. Dynamic dispatch, part IV: Messages to self and super. Now study the class method new defined on class List, which appears in section 10.8, which starts on page 681. The definition sends message new to super. (Keep in mind: because new is a class method, both super and self stand for the class, not for any instance.)

    1. When class method new is executed, what three messages are sent by the method body, in what order? (If you like, you can use the message tracer described under “Diagnostic technique” below, but it is probably easier just to look at the source code.)

    2. What does each of the three message sends accomplish?

    3. If we change new’s definition so instead of (super new) it says (self new), which of the following scenarios best describes how the changed program behaves?

      1. The new message will be dispatched to class List. The same method will run again, and the computation will not terminate.

      2. The new message will be dispatched to a different class, and the reply to the new message will leave the sentinel pointing to the wrong value.

      3. Nothing will change; in this example, there’s no difference between (super new) and (self new).

      Your answer: The best description is labeled with letter ?

    You are learning how to use super to initialize objects.

  8. Design of the numeric classes. Read about coercion in section 10.4.6 on page 658. Look at the last part of the instance protocol for Number on page 659, which shows the contracts of methods asInteger, asFraction, asFloat, and coerce:. If you are unsure, look at the implementations of these methods on class Integer, in chunks 678b and 678c.

    Answer the two questions (a) and (b). Write no English.

    1. Supposing x is a variable that holds a number—possibly a floating-point number. In C or C++, you can cast its value to int by writing (int) x. In uSmalltalk, what code do you write?

    2. Now suppose you have a uSmalltalk variable y, which also holds a number. You want to add x and y, but the contract of the + message says that it can be used only when x and y have the same representation (both integers, both floats, etcetera). Without interrogating either x or y to ask what class it is, how can you add them safely? What code do you write?

    You are ready to implement mixed arithmetic, with coercions, in exercise 39.

    You are now ready to implement all of the operations on numbers, not just the ones that are familiar from C or C++.

  9. Abstract classes in principle. In section 10.13.1, which starts on page 716 (“Key words and phrases”), you will find a short definition of “abstract class.” What is the purpose of an abstract class? Pick one of the responses below.

    1. To hide the representation of instances so programmers can change internal details without affecting client code

    2. To define methods that other classes inherit, so that subclasses get useful default methods

    3. The same as the purpose of a regular class: to define an abstraction

    Your answer: …

    You now know what abstract classes are supposed to do, so that when you tackle the homework, you can take advantage of abstract classes like Magnitude, Number, and Integer.

  10. Abstract classes in practice: magnitudes and numbers. Your natural-number class will inherit from abstract class Magnitude, and your large-integer code will inherit from Magnitude and from Number, which is also an abstract class.

    1. Study the implementation of class Magnitude (page 665). List all the methods that are “subclass responsibility”:

      Your answer: ...

      These are methods that you must implement in both your Natural class and your large-integer classes.

    2. On page 672, you will find the definition of class Number, which I will save you the trouble of looking up:

      (class Number
          [subclass-of Magnitude]  ; abstract class
          ;;;;;;; basic Number protocol
          (method +   (aNumber)     (self subclassResponsibility))
          (method *   (aNumber)     (self subclassResponsibility))
          (method negated    ()     (self subclassResponsibility))
          (method reciprocal ()     (self subclassResponsibility))
      
          (method asInteger  ()     (self subclassResponsibility))
          (method asFraction ()     (self subclassResponsibility))
          (method asFloat    ()     (self subclassResponsibility))
          (method coerce: (aNumber) (self subclassResponsibility))
          <<other methods of class [[Number]]>>
      )

      Your large-integer classes will be subclasses of Number. How many of the eight subclassResponsibility methods will you have to implement in those classes?

      Your answer: ...

      (Two of these methods, + and *, must also be implemented in class Natural.)

    You now understand in detail what operations your large integers must support.

  11. Double Dispatch. Read section 10.7, which starts on page 670. And read the section “laws for multiple dispatch” in the 7th lesson on program design (“Program Design with Objects”).

    Now, of the methods on class Number listed in the previous question, list each one that needs to know either of the following facts about its argument (not its receiver):

    • Whether the argument is large or small
    • If the argument is large, whether it is “positive” or “negative”

    For example, + is such a method.

    1. Please list all such methods here:

      Your answer: + …

    2. The methods in part (a) are exactly the ones that require double dispatch. The implementation of each such method sends a message to its argument, and the exact message depends on the class of the receiver.

      Assume that the receiver is a LargePositiveInteger. Please say, for each method in part (a), what message the method implementation sends to the argument.

      Your answer:

      Method + sends addLargePositiveIntegerTo: to the argument

    You are ready to implement arithmetic on large integers (exercise 38).

Individual Problem

Working on your own, please work exercise 36(a) on page 731 of Build, Prove, and Compare. This exercise is a warmup designed to prepare you for the bignum problems in the pair portion of the assignment.

Pair Problems: Bignum arithmetic

For the remaining problems, you may work with a partner. Please work exercise 37 on page 731, exercise 38 on page 732, and exercise 39 on page 732 of Build, Prove, and Compare, and exercises T and ADT below.

Sometimes you want to do computations that require more precision than you have available in a machine word. Full Scheme, Smalltalk, Haskell, Icon, and many other languages provide “bignums,” which automatically expand to as much precision as you need. Unlike languages that use abstract data types, Scheme, Smalltalk, and Icon make the transition from machine integers to bignums transparent—from the source code, it’s not obvious when you’re using native machine integers and when you’re using bignums. You will build transparent bignums in Smalltalk.

Big picture of the solution

Smalltalk code sends lots of messages back and forth, and those messages are dispatched to lots of methods, which are defined on different classes. This model shows both the power of Smalltalk—every method is simple—and a weakness of Smalltalk—every algorithm is smeared out over half a dozen methods defined on different classes, making it hard to debug. But this is the object-oriented programming model that lends power not just to Smalltalk but also to Ruby, Objective-C, Swift, Self, JavaScript, and to a lesser extent, Java and C++. To work effectively in any of these languages, one needs the big picture of which class is doing what. A good starting point is the Smalltalk bignums handout.

Unit testing bignums in Smalltalk

Arithmetic comparisons should be tested using check-assert in the usual way. But other arithmetic operations can’t easily be tested using check-expect, because the parser handles only machine integers. You have two options:

Using check-expect

The semantics of check-expect are subtle, but the tests work about the way you would hope. You might remember that in μScheme, check-expect does not use the built-in = primitive—instead, it compares values uses something like the equal? function. This comparison enables μScheme’s check-expect to accept two different S-expressions that have the same structure. In much the same way, μSmalltalk does not use a primitive equality; instead, it sends the = message to the checked object.

Using check-expect with the decimal method looks like this:

(check-expect ((Power x:to: 10 60) decimal)
              '( 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))
(check-expect ((Power x:to: 2 64) decimal)
              '( 1 8 4 4 6 7 4 4 0 7 3 7 0 9 5 5 1 6 1 6 ))
(check-expect (((Power x:to: 2 64) negated) decimal)
              '( - 1 8 4 4 6 7 4 4 0 7 3 7 0 9 5 5 1 6 1 6 ))

Because check-expect uses the = method, it can accept a list of decimal digits returned by decimal as equal to the array written using literal quote syntax—even though the two are not the same object.

Using check-print

Using check-expect is familiar, but aside from the obvious awkwardness of writing arrays of decimal digits, it has a more serious flaw: it can’t detect bugs in your all-important decimal method, which supplies digits to print. To help find such bugs, use the check-print unit-test form, which takes an expression and a literal result. The literal result must be a single token: either a name or a sequence of digits. Here are some example uses:

(check-print (Power x:to: 10 60)
             1000000000000000000000000000000000000000000000000000000000000)
(check-print (Power x:to: 2 64)
              18446744073709551616)
(check-print ((Power x:to: 2 64) negated)
             -18446744073709551616)
(check-print ((Natural fromSmall: 256492) * (Natural fromSmall: 666481))
             170947044652)

As soon as you have a decimal method working, I recommend using check-print.

Our unit tests

We provide some unit tests for class Natural and for bignums. But our tests are not like the unit tests you are used to. We don’t hand-craft each test to exercise a particular method in a particular way. Instead, we use a random-number generator to create relatively big tests that exercise many methods at once—especially the all-important decimal method. If your code passes our tests, it should give you some confidence. But if your code fails one of our tests, the failure won’t tell you what went wrong. For that reason, our tests are meant to supplement your own unit tests, not to supplant them.

Details of all the problems

37. Implementing arbitrary-precision natural numbers. Do exercise 37 on page 731 of Build, Prove, and Compare. Implement the protocol defined in Figure 10.19 on page 662.
Put your solution in file bignum.smt.

The book has a starter template for class Natural, which you can copy and paste (with acknowledgement) from /comp/105/build-prove-compare/examples/usmalltalk/predefined.smt.

The choice of a base for natural numbers is yours. But for full credit, you must choose a base b such that b > 10, and small enough that (b + 1) ⋅ (b − 1) does not overflow.1

0pt plus 3-200 0pt plus -3

What representation should I use? You’ve got three choices:

The representations offer these tradeoffs:

In our opinion, the subclass-based solution is clearly superior. (As a bonus, this representation is also the best at helping you learn about programming with objects.) This representation makes it easy for me to understand my own code: I can look at any method, and I am confident that I understand exactly what it is doing. The array and list representations don’t give me that confidence.

How must I document my representation?

For any choice of representation, it is useful to explicitly document an abstraction function that maps the representation to the value it represents, and an invariant predicate that returns true when applied to a representation iff the representation satisfies the expected invariants. In your submission, you must document the abstraction function and invariant for every concrete class. (A concrete class is one that is instantiated by sending it the new message, or by sending some other message whose method eventually sends new.)

In each class, place your comments immediately below or to the right of the declaration of the instance variables.

What algorithms should I use?. The handout Mastering Multiprecision Arithmetic describes algorithms for arithmetic operations that you might find useful. The descriptions of the algorithms come after a discussion of how you might represent infinite precision numbers in ML, which you should skim to understand the descriptions of the algorithms.

How big is it? Following the hints in the book, the course solution comprises two implementations of class Natural:

We have not written a solution that uses Smalltalk lists.

Related reading:

  • There is a detailed implementation guide in the bignums handout. It discusses both array-based and subclass-based representations.

  • The handout Mastering Multiprecision Arithmetic describes algorithms for arithmetic operations. It’s written in the context of ML, but the description of the algorithms for arithmetic may be helpful.

  • In a system with abstract data types, binary operations like + and * are easy: you automatically have access to both representations. In a system with objects, not so! To learn how to get access to multiple representations in the object-oriented way, read section 10.7, which starts on page 670.

  • In the 7th lesson on program design, read the section on how the design steps are adapted for use with objects. Focus on steps 6, 7, and 8: algebraic laws, case analysis, and results. In the same lesson, you may also wish to revisit the three levels of access to representation. You will need level C, but there is no need for double dispatch here.

  • Class Natural is a subclass of Magnitude. Study the Magnitude protocol in section 10.4.6. For information about the implementation of Magnitude, which should provide useful ideas about Natural, as well as the “subclass responsibilities,” study the implementation of Magnitude on page 665.2

  • For the interface to a Smalltalk array, study the Collection protocol in section 10.4.5, which starts on page 651. You have access to the protocol in Figure 10.13 on page 653, including the class method withAll:. Among instance methods, you are more likely to use the KeyedCollection protocol in Figure 10.14 on page 656, especially at: and at:put:. You may also want the class method new: that is defined on arrays (page 683).

  • For list construction, which you will need for the decimal method, look at the List protocol in section 10.4.5, especially Figure 10.16 on page 658.

  • For abstraction functions and invariants, you may wish to read the program-design lesson on abstract data types (lesson 6).

38. Implementing arbitrary-precision integers. Do exercise 38 on page 732 of Build, Prove, and Compare. An implementation is shown in the bignums handout. Add your solution to file bignum.smt, following your solution to exercise 37.

Because you build large integers on top of Natural, you don’t have to think about array, list, or subclass representations. Instead you must focus on dynamic dispatch and on getting information from where it is to where it is needed.

The book has starter code for class LargeInteger, which you can copy (with acknowledgement) from /comp/105/build-prove-compare/examples/usmalltalk/large-int.smt.

How must I document my representation? As on the previous exercise, you must document the abstraction function and invariant for every concrete class. For example if you follow the guidelines and define an abstract class LargeInteger with two concrete subclasses LargePositiveInteger and LargeNegativeInteger, you’ll need to document only the two concrete subclasses.

How big is it? The course solutions for the large-integer classes are 30 lines apiece.

Related reading: This problem is all about dynamic dispatch, including double dispatch.

  • Read section 10.7, which starts on page 670.

  • Read the last section, “Laws for double dispatch,” in the 7th lesson on program design.

You’ll also have a chance to practice double dispatch in the second Smalltalk recitation.

Helpful code. Because of the potential for negation (“unary minus”) to overflow, division of signed integers is fiddly. For those of you who prefer not to translate the nested conditionals found in the divide function in the μScheme chapter, here are some methods defined in the solution large-integer classes.

To help with two’s-complement representations, the following code on class LargeInteger will ensure that the negated method defined on SmallInteger does not overflow:

  (class-method fromSmall: (anInteger)
     ((anInteger isNegative) ifTrue:ifFalse: 
        {(((self fromSmall: 1) + (self fromSmall: ((anInteger + 1) negated))) negated)}
        {((LargePositiveInteger new) magnitude: (Natural fromSmall: anInteger))}))

39. Modifying SmallInteger so operations that overflow roll over to infinite precision. Do exercise 39 on page 732 of Build, Prove, and Compare. Put your solution in a fresh file, mixnum.smt. On the first line of file mixnum.smt, include your other solutions by writing (use bignum.smt).3

You must modify SmallInteger without editing the source code of the μSmalltalk interpreter. To do so, you will use the protocol for classes described Figure 10.10 on page 648. You can see an example of the protocol in use in chunk 732a.

The methods in this protocol will modify the existing class SmallInteger; they can both change existing methods and define new methods. This code changes the basic arithmetic operations that the system uses internally. If you have bugs in your code, the system will behave erratically. At this point, you must restart your interpreter and fix your bugs. Then use the idiom again.

How big is it? The course modifications to the SmallInteger class are about 35 lines.

Related reading: Everything about dispatch and double dispatch still applies, especially the example in the 7th lesson on program design. In addition, you need to know how overflow is handled using “exception blocks.”

  • Review the presentation of blocks, especially the parameterless blocks (written with curly braces) in section 10.4.3, which starts on page 648.

  • Read the description of at:ifAbsent: in the keyed-collection protocol in Figure 10.14 on page 656. Now study this expression:

    ('(0 1 2) at:ifAbsent: 99 {0})

    This code attemps to access element 99 of the array ( 0 1 2 ), which is out of bounds because the array only has only 3 elements. When given an index out of bounds, at:ifAbsent: sends value to the “exception block” {0}, which ultimately answers zero.

  • Study the implementation of the at: method in code chunk 694c, which uses at:ifAbsent: with an “exception block” that causes a run-time error if value is sent to it.

  • Finally, study the overflow-detecting primitives in exercise 39 on page 732, and study the implementation of addSmallIntegerTo: in the code chunk immediately below. That is the technique you must emulate.

T. Testing Bignums. In standalone file bigtests.smt, you will write 9 tests for bignums:

These tests will be run on other people’s code, and they need to be structured and formatted as follows:

  1. The test must begin with a summary characterization of the test in at most 60 characters, formatted on a line by itself as follows:

         ; Summary: .........

    The summary must be a simple English phrase that describes the test. Examples might be “Ackermann’s function of (1, 1),” “sequence of powers of 2,” or “combinations of +, *, and - on random numbers.”

  2. Code must compute a result of class Natural, LargePositiveInteger, or LargeNegativeInteger. The code may appear in a method, a class method, a block, or wherever else you find convenient. The code must be included in file bigtests.smt.

  3. The expected result must be checked using the check-print form described above.

Each test must take less than 2 CPU seconds to evaluate.

Here is a complete example containing two tests:

; Summary: 10 to the tenth power, linear time, mixed arithmetic
(class Test10Power
  [subclass-of Object]
  (class-method run: (power)
     [locals n 10-to-the-n]
     (set n 0)
     (set 10-to-the-n 1)
     ({(n < power)} whileTrue:
         {(set n (n + 1))
          (set 10-to-the-n (10 * 10-to-the-n))})
     10-to-the-n)
)
(check-print (Test10Power run: 10) 10000000000)

; Summary: 10 to the 30th power, mixed arithmetic
(check-print (Test10Power run: 30) 
             1000000000000000000000000000000)

Here is another complete example:

; Summary: 20 factorial
(define factorial (n)
  ((n isStrictlyPositive) ifTrue:ifFalse: 
     {(n * (factorial value: (n - 1)))}
     {1}))

(check-print (factorial value: 20) 2432902008176640000)

Related reading: No special reading is recommended for the testing problem. As long as you understand the examples above, that should be enough.

ADT. Collect representations, abstraction functions, and invariants.

In a new file adt.txt, summarize your design work:

Two simple sanity checks for multiplication

It’s comforting to have an idea that things have started to work. Here are two comforting (but limited) tests. Here, for example, is a version of power:

(class Power
  [subclass-of Object]
  (class-method x:to: (x n) [locals half]
    ((n = 0) ifTrue:ifFalse: 
      {(x coerce: 1)}
      {(set half (self x:to: x (n div: 2)))
       (set half (half * half))
       (((n mod: 2) = 1) ifTrue:
         {(set half (x * half))})
       half})))

(check-expect (Power x:to: 2 3) 8)
(check-expect (Power x:to: 3 4) 81)
(check-expect (Power x:to: (1 / 3) 3) (1 / 27))

And here is code that computes and prints factorials, from code chunk 757b in the textbook:

(class Factorial
  [subclass-of Object]
  (class-method printUpto: (limit) [locals n nfac]
     (set n 1)
     (set nfac 1)
     ({(n <= limit)} whileTrue: 
        {(n print) ('! print) (space print) ('= print) (space print) (nfac println)
         (set n (n + 1))
         (set nfac (n * nfac))})))

Depending on your choice of base, (Factorial printUpto: 20) may run to completion almost instantly, or it may exhaust the “CPU fuel” used to recover from infinite loops—so you may need to test it while running

env BPCOPTIONS=nothrottle usmalltalk

These tests may be comforting, but they suffer from grave limitations:

These limitations make power and factorial poor tests of correctness.

More advice about testing natural numbers

Try testing your class Natural by generating a long, random string of digits, then computing the corresponding number using a combination of addition and multiplication by 10. You can generate a string of random digits on the command line by launching the bash shell and running this command:

for ((i = 0; i < 20; i++)); do echo -n ' ' $((RANDOM % 10)); done; echo

0pt plus 3-200 0pt plus -3

You can generate a test command from a list of digits using μScheme:

(define nat-test (ns)                        ; alert!  μScheme code
  (letrec ([exp-of (lambda (ns) 
                      (if (null? ns)
                          0
                          (list3 (car ns) '+ (list3 10 '* (exp-of (cdr ns))))))])
    (let* ([left  (lambda () (printu 40))]
           [right (lambda () (printu 41))]
           [space (lambda () (printu 32))]
           [_ (left)]
           [_ (print 'check-print)]
           [_ (space)]
           [_ (print (exp-of (reverse ns)))]
           [_ (space)]
           [_ (app print ns)]
           [_ (right)]
           [_ (printu 10)])
      'Printed!)))

For example,

-> (nat-test '(1 2 3))
(check-print (+ 3 (* 10 (+ 2 (* 10 (+ 1 (* 10 0)))))) 123)
Printed!

If you don’t have multiplication working yet, you can use the following class to multiply by 10:

(class Times10
    [subclass-of Object]
    (class-method by: (n) [locals p]
        (set p n)       ; p == n
        (set p (p + p)) ; p == 2n
        (set p (p + p)) ; p == 4n
        (set p (p + n)) ; p == 5n
        (set p (p + p)) ; p == 10n
        p))

This idea will test only your addition; if you have bugs there, fix them before you go on.

You can write, in μSmalltalk instead of μScheme, a method that uses the techniques above to convert a sequenceable collection of decimal digits into a natural number.

Once you are confident that addition works, you can test subtraction of natural numbers by generating a long random sequence, then subtracting the same sequence in which all digits except the most significant are replaced by zero.

You can create more ambitious tests of subtraction by generating random natural numbers and using the algebraic law (m + n) − m = n. You can also check to see that unless n is zero, m − (m + n) causes a run-time error on class Natural.

It is harder to test multiplication, but you can at least use repeated addition to test multiplication by small values. The timesRepeat: method is defined on any integer.

You can also easily test multiplication by large powers of 10.

You can use similar techniques to test large integers.

If you want more effective tests of multiplication and so on, compare your results with a working implementation of bignums. The languages Scheme, Icon, and Haskell all provide such implementations. (Be aware that the real Scheme define syntax is slightly different from what we use in uScheme.) We recommend you use ghci on the command line; standard infix syntax works. If you want something more elaborate, use Standard ML of New Jersey (command sml), which has an IntInf module that implements bignums.

Here’s an example of using ghci to generate numbers for testing:

$ ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Prelude> 256492 * 666481
170947044652
Prelude> 170947044652 * 293847926754
50232434655713663419608
Prelude> 

Hints and guidelines from past students

Student’s experiences are in italics; course responses are in the standard font.

Other hints and guidelines

Start early. Seamless arithmetic requires in-depth cooperation among about eight different classes (those you write, plus Magnitude, Number, Integer, and SmallInteger). This kind of cooperation requires aggressive message passing and inheritance, which you are just learning. There is a handout online (the “Smalltalk bignums handout”) with suggestions about which methods depend on which other methods and in what order to tackle them.

The whole story about division

We do implement division, but not fully. Here’s why:

To enable our code to use division without requiring long division, we have invented messages sdiv:, smod:, and sdivmod:with:. These messages accept only short divisors, and each such divisor must be represented as an object of class SmallInteger. And in order to enable these messages to interoperate with Smalltalk’s standard integer division, we continue to support the standard messages div: and mod:. The resulting system design is a little confusing, but here’s what we know:

Avoid common mistakes

Below you will find some common mistakes to avoid.

It is a common mistake to ignore the design process. Don’t be like the students who said,

We followed the design process for assignments where we were to design individual functions but when the specification told us what to do stepwise, like in the Naturals homework, we didn’t.

It is common to overlook class methods. They are a good place to put information that doesn’t change over the life of your program.

It’s a terrible mistake to make decisions by interrogating an object about its class—a so-called “run-time type test.” Run-time type tests destroy behavioral subtyping. This mistake is most commonly made in two places:

There is a right way to do case analysis over representations: entirely by sending messages. For an example, study how we calculate the length of a list: we send the size message to the list instance. Method size is dispatched to class Collection, where it is implemented by using a basic iterator: the do: method. If you study the implementation of do: on classes Cons and ListSentinel (which terminates a μSmalltalk list), you’ll see the case analysis is done by the method dispatch:

The idea of “case analysis by sending messages” applies equally well to arithmetic—and the suggestions in the bignums handout are intended to steer you in the right direction. If you find yourself wanting to ask an object what its class is, seek help immediately.

On smaller problems, it is surprisingly common for students to submit code without ever even having run it or even loaded it into an interpreter. If you run even one test case, you will be ahead of the game.

It is too common to submit bignum code without having tested all combinations of methods and arguments. Your best plan is to write a program, in the language if your choice, that loops over operator and both operands and generates at least one test case for every combination. Because μSmalltalk is written using S-expressions, you could consider writing this program in μScheme—but any language will do.

It is relatively common for students’ code to make a false distinction between two flavors of zero. In integer arithmetic, there is only one zero, and it always prints as “0”.

It’s surprisingly common to fail to tag the test summary with the prefix Summary:, or to forget it altogether.

It’s a common mistake to redefine negated on class SmallInteger and to assume that nothing needs to be done with -. But SmallInteger is using machine primitives for arithmetic, and machine primitives provide addition and subtraction, but not negation. So the class definition looks like this:

(class SmallInteger
    [subclass-of Integer] ; primitive representation
    ...
    (method negated    ()  (0 - self))
    (method +          (n) (primitive + self n))
    (method -          (n) (primitive - self n))
    ...
)

If you are expecting it to look exactly like class Number and you stick with the default implementation of -, you might get some baffling infinite recursions in exercise 39.

In exercise 39, it’s a common mistake to try to implement (self - anArgument) by evaluating (anArgument - self)—or more precisely, by evaluating ((anArgument asLargeInteger) - self). This recursion terminates, but it produces an answer with the wrong sign. If you get “Array index out of bounds” errors, look for this mistake.

It’s not common, but if you rely on the recommended invariant for the subclass-based approach (n and d are not both zero), forgetting to enforce the invariant is a bad mistake.

Three diagnostic techniques

To help you diagnose problems in your code, we recommend three diagnostic techniques: dynamic tracing, static spell checking, and static call-graph analysis.

Tracing

The interpreter can help you debug by emitting a call trace of up to n message sends and answers. Just enter the definition

 (val &trace n)

at the read-eval-print loop. To turn tracing off, (set &trace 0).

Here is an (abbreviated) example trace of message new sent to class List, which is the subject of one of the reading-comprehension questions:

-> (List new)
standard input, line 3: Sending message (new) to class List
  predefined classes, line 499: Sending message (new) to class SequenceableCollection
  predefined classes, line 499: (<class List> new) = <List>
  predefined classes, line 499: Sending message (new) to class ListSentinel
    predefined classes, line 489: Sending message (new) to class Cons
    predefined classes, line 489: (<class ListSentinel> new) = <ListSentinel>
    predefined classes, line 490: Sending message (pred: <ListSentinel>) to an object of class ListSentinel
    predefined classes, line 490: (<ListSentinel> pred: <ListSentinel>) = <ListSentinel>
    predefined classes, line 491: Sending message (cdr: <ListSentinel>) to an object of class ListSentinel
    predefined classes, line 491: (<ListSentinel> cdr: <ListSentinel>) = <ListSentinel>
  predefined classes, line 499: (<class ListSentinel> new) = <ListSentinel>
  predefined classes, line 499: Sending message (sentinel: <ListSentinel>) to an object of class List
  predefined classes, line 499: (<List> sentinel: <ListSentinel>) = <List>
standard input, line 3: (<class List> new) = <List>
<trace ends>
List( )

One warning: as usual, the interpreter responds to an expression evaluation by printing the result. But in Smalltalk, printing is achieved by sending the println message to the result object. This interaction is also traced, and the results can be startling. Here, for example, is the result of evaluating the literal integer 3:

-> 3
internally generated SEND node, line 1: Sending message (println) to an object of class SmallInteger
  internal syntax, line 8: Sending message (print) to an object of class SmallInteger
3  internal syntax, line 8: (3<SmallInteger> print) = 3<SmallInteger>
  internal syntax, line 8: Sending message (print) to an object of class Char

  internal syntax, line 8: (<Char> print) = 10<SmallInteger>
internally generated SEND node, line 1: (3<SmallInteger> println) = 3<SmallInteger>

As you see, even a simple operation like printing a number involves multiple messages. To prevent them from confusing you, load the following code:

(val &trace 0)
(class TraceMethod
  [subclass-of Object]
  (method traceFor: (n) [locals answer]
    (set &trace n)
    (set answer (self value))
    (set &trace 0)
    answer)
  (method trace () (self traceFor: -1)))
(Block addMethods:from: '(traceFor: trace) TraceMethod)

You can then run something like ({8 squared} trace) or ({(List new)} traceFor: 10)

Tools that analyze your code for potential faults

Smalltalk has no type system. But some properties of of your code can still be checked statically:

An analysis you can do by hand

There is a static analysis you can do by hand called call-graph analysis. It can help you understand how classes work together, and it’s the best tool for diagnosing problems with infinite loops and recursions.

Call-graph analysis works by identifying what methods transfer control to what other methods. Every node in the call graph is a combination of class name and message name. For example, Boolean/ifTrue: or List/select:. Each node has outgoing edges that illustrate what happens if the node’s message is sent to an instance of the node’s class:4

  1. If the message is implemented by a primitive method, like + on SmallInteger, it has no outgoing edges.

  2. If the method is inherited from the superclass, the node has a dotted edge to the same message on the superclass. For example, node True/ifTrue: has a dotted edge to Boolean/ifTrue:.

  3. If the method is a subclass responsibility, the node has a dotted edge to each subclass.

  4. If the method is defined on the class, the node has a solid outgoing edge for each message that could be sent during the method’s execution.

    You have to look at each message send and figure out what class of object it might be sent to. This part can’t easily be determined by looking at the code; you have to know the protocol. For example, if message & is sent to a Boolean, we know the argument is also a Boolean. As another example, if message + is sent to a natural number, the protocol says the argument obeys the Natural protocol.

    Usually all the possibilities are covered by a superclass. For example, even though message size could be sent to a List or an Array, you can just draw a single edge to node Collection/size. Sometimes you might have more then one outgoing edge per message—for example, if a message could be sent to an Integer or a Fraction, but not to any Number.

  5. If the method is not defined and not inherited from a superclass, the message is not understood, and your program is broken.

Here are some tips about call graphs:

A final note: not all of our TAs have used call graphs before. You may be learning together.

Extra credit

Seamless bignum arithmetic is an accomplishment. But it’s a long way from industrial. The extra-credit problems explore some ideas you would deploy if you wanted everything on a more solid foundation.

Speed variations. For extra credit, try the following variations on your implementation of class Natural:

  1. Implement the class using an internal base b = 10. Measure the time needed to compute the first 50 factorials. And measure the time needed to compute the first 50 Catalan numbers.

    The Catalan numbers, which make better test cases than factorial, are defined by these equations (from Wikipedia):

    C_0=1 and C_{n+1}=\sum _{i=0}^{n}C_{i}\cdot C_{n-i} 


    $$C_0 = 1 \mskip 30mu C_{n+1}=\sum _{i=0}^{n}C_{i}\cdot C_{n-i}$$

  2. Determine the largest possible base that is still a power of 10. Explain your reasoning. Change your class to use that base internally. Measure the time needed to compute the first 50 factorials, and also the time needed to compute the first 50 Catalan numbers.

  3. In both cases, measure the additional time required to print the numbers you have computed.

  4. Specialize your b = 10 code so that the decimal method works by simply reading off the decimal digits, without any short division. Measure the improvement in printing speed.

  5. Finally, try a compromise, like b = 1000, which should use another specialized decimal method, making both arithmetic and decimal conversion reasonably fast. Can this implementation beat the others?

Write up your arguments and your measurements in your README file.

Long division. Implement long division for Natural and for large integers. If this changes your argument for the largest possible base, explain how. This article by Per Brinch Hansen describes how to implement long division.

Mixed Comparisons. Make comparisons work, even with mixed kinds of integers. So for example, make sure comparisons such as (5 < (1000000 * 1000000)) produce sensible answers.

Space costs. Instrument your Natural class to keep track of the size of numbers, and measure the space cost of the different bases. Estimate the difference in garbage-collection overhead for computing with the different bases, given a fixed-size heap.

Pi (hard). Use a power series to compute the first 100 digits of pi (the ratio of a circle’s circumference to its diameter). Be sure to cite your sources for the proper series approximation and its convergence properties. Hint: I vaguely remember that there’s a faster convergence for pi over 4. Check with a numerical analyst.

What and how to submit: Individual problem

Submit these files by Sunday, November 22 at 11:59pm:

Please identify your solutions using conspicuous comments, e.g.,

     ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
     ;;;;
     ;;;;  Solution to Exercise XXX
     (class Array ...
      )

As soon as you have the files listed above, run submit105-small-solo to submit a preliminary version of your work. Keep submitting until your work is complete; we grade only the last submission.

What and how to submit: Pair problems

Submit these files by Tuesday, December 1 at 11:59pm:

As soon as you have the files listed above, run submit105-small-pair to submit a preliminary version of your work. Keep submitting until your work is complete; we grade only the last submission.

How your work will be evaluated

All our usual expections for form, naming, and documentation apply. But in this assignment we will focus on clarity and structure. To start, we want to be able to understand your code.

Exemplary Satisfactory Must improve
Clarity

• Course staff see no more code than is needed to solve the problem.

• Course staff see how the structure of the code follows from the structure of the problem.

• Course staff see somewhat more code than is needed to solve the problem.

• Course staff can relate the structure of the code to the structure of the problem, but there are parts they don’t understand.

• Course staff see roughly twice as much code as is needed to solve the problem.

• Course staff cannot follow the code and relate its structure to the structure of the problem.

Structurally, your code should hide information like the base of natural numbers, and it should use proper method dispatch, not bogus techniques like run-time type checking.
Exemplary Satisfactory Must improve
Structure

• The base used for natural numbers appears in exactly one place, and all code that depends on it consults that place.

Or, the base used for natural numbers appears in exactly one place, and code that depends on either consults that place or assumes that the base is some power of 10

• No matter how many bits are used to represent a machine integer, overflow is detected by using appropriate primitive methods, not by comparing against particular integers.

• Code uses method dispatch instead of conditionals.

• Mixed operations on different classes of numbers are implemented using double dispatch.

Or, mixed operations on different classes of numbers are implemented by arranging for the classes to share a common protocol.

Or, mixed operations on different classes of numbers are implemented by arranging for unconditional coercions.

• Code deals with exceptional or unusual conditions by passing a suitable exnBlock or other block.

• Code achieves new functionality by reusing existing methods, e.g., by sending messages to super.

Or, code achieves new functionality by adding new methods to old classes to respond to an existing protocol.

• An object’s behavior is controlled by dispatching (or double dispatching) to an appropriate method of its class.

• The base used for natural numbers appears in exactly one place, but code that depends on it knows what it is, and that code will break if the base is changed in any way.

• Overflow is detected only by assuming the number of bits used to represent a machine integer, but the number of bits is explicit in the code.

• Code contains one avoidable conditional.

• Mixed operations on different classes of integers involve explicit conditionals.

• Code protects itself against exceptional or unusual conditions by using Booleans.

• Code contains methods that appear to have been copied and modified.

• An object’s behavior is influenced by interrogating it to learn something about its class.

• The base used for natural numbers appears in multiple places.

• Overflow is detected only by assuming the number of bits used to represent a machine integer, and the number of bits is implicit in the value of some frightening decimal literal.

• Code contains more than one avoidable conditional.

• Mixed operations on different classes of integers are implemented by interrogating objects about their classes.

• Code copies methods instead of arranging to invoke the originals.

• Code contains case analysis or a conditional that depends on the class of an object.


  1. Test it.

  2. Note: an object of class Natural is not a Number as Smalltalk understands it. In particular, class Natural does not support methods negated or reciprocal.

  3. If there is a bug in your solution to exercise 39, it can break your solutions to the previous exercises. By putting the solution to exercise 39 in its own file, we make it possible to test your other code independently.

  4. If you encounter a class method, just call the message something like “class method new,” and proceed as you would otherwise.