Relational Algebra Implementation

Your goal in this assignment is to implement the operators of the Relational Algebra. You will be extending the code you wrote for Assignment 1. As in the previous assignment, your code must pass provided unit tests, and manage memory properly (no leaks; no reads from, or writes to unallocated memory).

Get started as follows: (You need to be careful to get the newer versions of Database.*, dbexceptions.h, Row.h, etc. Doing the setup exactly as described here will ensure this.)

Note that the archive contained two source files that were not present in Assignment 1: RelationalAlgebra.h and RelationalAlgebra.cpp. RelationalAlgebra.h specifies the interface you are to implement, with documentation describing the behavior of each function. You will need to fill in the functions in RelationalAlgebra.cpp.

Row

The Row class is a little different from the version in Assignment 1. In this assignment, the Row class has a new member function: at(unsigned), for accessing fields of the row by subscript. So this assumes that the implementation of Row is a vector or array. If your implementation did something different, you will need to modify your Row implementation accordingly.

RelationalAlgebra

RelationalAlgebra.h contains declarations of the functions comprising Relational Algebra:
Table *onion(Table *r, Table *s); Table *intersect(Table *r, Table *s); Table *diff(Table *r, Table *s); Table *product(Table *r, Table *s); Table *rename(Table *r, const NameMap &name_map); Table *select(Table *r, RowPredicate predicate); Table *project(Table *r, const ColumnNames &columns); Table *join(Table *r, Table *s);

(Note that union is a keyword in C++, so the union operator is represented by the onion function.)

Comments in RelationalAlgebra.h explain the behavior of each operator, along with a discussion of possible exceptions. You will need to implement these functions, in RelationalAlgebra.cpp. Your implementation will rely on the Database, Table and Row types, and as you proceed, you may find it useful to extend the interfaces of these types, (i.e., add member functions). Don't modify the provided interfaces, as the tests will then probably not compile.

Tables need names, but we don't really care about the names of these tables, (we don't use the names in any way). You can use Database::new_table_name() to generate a table name that will be guaranteed to be unique.

Each Relational Algebra operator creates a new table containing new rows. Yes, even rename. Specifically:

Tests

The unit tests are contained in test.cpp. main.cpp has the main() function which simply invokes the tests.

test.cpp contains 36 tests that test your relational algebra implementation, both "positive" cases (e.g. compute an intersection and see that the results are as expected), and "negative" ones, (e.g. try inserting a row with three columns into a table created with two columns).

Here is a typical positive test:
static void union_compatible_both_non_empty() { Table* r = Database::new_table("r", ColumnNames{"a", "b"}); add(r, {"1", "44"}); add(r, {"1", "55"}); add(r, {"2", "66"}); Table* s = Database::new_table("s", ColumnNames{"b", "a"}); add(s, {"1", "44"}); add(s, {"3", "77"}); Table* control = Database::new_table("control", ColumnNames{"x", "y"}); add(control, {"1", "44"}); add(control, {"1", "55"}); add(control, {"2", "66"}); add(control, {"3", "77"}); assert(table_eq(control, onion(r, s))); }

This creates two tables, r and s, both with columns a and b (in different orders). Rows are added to the tables. Note that the {"1", "44"} row is added to each. Next, a control table is created. This table contains the expected result. In this table, note that {"1", "44"} appears once, because the output of any relational algebra operation is a table, and tables do not have duplicates. A buggy implementation might have two occurrences, violating uniqueness, and this would cause the test to fail.

The last statement in the code compares the union of r and s with the control, and asserts that output from the union and the control should have the same rows.

Here is a typical negative test:
static void join_no_join_columns() { Table* r = Database::new_table("r", ColumnNames{"a", "b"}); Table* s = Database::new_table("s", ColumnNames{"c", "d"}); try { join(r, s); fail(); } catch (JoinException& e) { // expected } }

The test starts by creating a new table named r with columns named a and b, and a table s with columns c and d. Next, inside a try block, we try to join the tables. This should not succeed because there are no join columns (i.e., no columns with matching names in the two tables). The test expects that JoinException will be thrown. If it is not thrown, the test fails.

Makefile

A Makefile is provided. If you add .cpp or .h source, be sure to modify the Makefile appropriately.

Building the code

Unpack the archive, cd into the a2 directory, and then just run make: jao@mintyzack ~/teaching/tufts/115/a2/a2 $ make g++ -std=c++11 -Wall -Wno-unused-function -c ColumnNames.cpp -o ColumnNames.o g++ -std=c++11 -Wall -Wno-unused-function -c Database.cpp -o Database.o g++ -std=c++11 -Wall -Wno-unused-function -c RelationalAlgebra.cpp -o RelationalAlgebra.o g++ -std=c++11 -Wall -Wno-unused-function -c Row.cpp -o Row.o g++ -std=c++11 -Wall -Wno-unused-function -c RowCompare.cpp -o RowCompare.o g++ -std=c++11 -Wall -Wno-unused-function -c Table.cpp -o Table.o g++ -std=c++11 -Wall -Wno-unused-function -c main.cpp -o main.o g++ -std=c++11 -Wall -Wno-unused-function -c test.cpp -o test.o g++ -std=c++11 -Wall -Wno-unused-function -c unittest.cpp -o unittest.o g++ -std=c++11 -Wall -Wno-unused-function -c util.cpp -o util.o g++ -std=c++11 -Wall -Wno-unused-function ColumnNames.o Database.o RelationalAlgebra.o Row.o RowCompare.o Table.o main.o test.o unittest.o util.o -o a2

Running the code

If you run the code without modification, you should see this output:
jao@mintyzack ~/teaching/tufts/115/a2/a2 $ ./a2 :-( 1/36 -- union_incompatible: FAILED -- Not yet implemented! :-( 2/36 -- union_compatible_different_names: FAILED -- Not yet implemented! ... :-( 35/36 -- join_both_non_empty: FAILED -- Not yet implemented! :-( 36/36 -- join_both_non_empty_2: FAILED -- Not yet implemented! SUMMARY: 36 tests 0 passed 36 failed

When you successfully implement RelationalAlgebra, you should see this:

jao@mintyzack ~/teaching/tufts/115/a2/a2 $ ./a2 :-) 1/36 -- union_incompatible: ok :-) 2/36 -- union_compatible_different_names: ok ... :-) 35/36 -- join_both_non_empty: ok :-) 36/36 -- join_both_non_empty_2: ok SUMMARY: 36 tests 36 passed 0 failed

Use valgrind

As in Assignment 1, use valgrind frequently to rule out memory corruption and leaks.