|
Welcome to the COMP150DW Web Page
Spring 2004
CONTACT || ANNOUNCEMENTS || LECTURES || HOMEWORK || LINKS
Announcements
o Assignments 1-3 grades are now available via 'progress'
o The command psql -h pgsql may not be working. To access pgsql type (from
sun or moon): ssh pgsql
When prompted, enter your password, and when you get the command prompt,
type: psql
o There is a mailing list for the class. If you have not received messages
from it, please email majordomo@cs.tufts.edu with the command 'subscribe
comp150dw' in the body of the message.
Contact Information
Teacher: Professor Daniel Hebert - dhebert@mitre.org
TA: Kate Butler - kbutlr@cs.tufts.edu
Office Hours: Monday 3-4pm, Wednesday 4-5pm (Halligan 118)
Homework
HW1 ||
HW2 ||
HW3 ||
PROJECT ||
GENERAL INFO
Project :: DUE MAY 13
o Informational document and zip file located here
o Submit with 'provide comp150dw project'
o Note: All data must be entered into the tables. However, the program
need not use it all.
o Please use 'provide' for as much of your information as possible: This
includes .doc or .ppt files. If you turn something into my mailbox it MUST
be there on Friday morning when I come to pick it up -- the projects will
be graded that weekend and I won't be able to get at the office after
Friday
Deliverables:
o Your Data Warehouse schema
o ALL data in your tables
o The program code
o A readme file with instructions on how to compile, run, etc., plus any
information you feel might be helpful in addition.
Homework 3 :: DUE APRIL 22 BEFORE CLASS
o This homework assignment will utilize the data warehouse you previously
built incorporating the hurricane data
o Implement an automated Attribute-Oriented Induction capability on your
hurricane data
o Input needed :
- Your relation tables for the hurricane data
- A DMQuery for characterization
- A list of attributes
- A set of concept hierarchies or generalization operators
- A generalized relation threshold
- Attribute generalization thresholds for each attribute
o Transform the DMQL statement to a relational query (can do this by hand)
o Use this relational query in your program to retrieve data and then
perform the AOI (see algorithm on pg. 188 of book)
o Visualize the results of the AOI via a generalized relation
Deliverables:
o The DMQL statement
o The program code
o A readme file with instructions on how to compile, run, etc., plus any
information you feel might be helpful in addition.
Hints:
o This assignment requires just one query.
o After performing the AOI the result can be output in a table. You don't
need to output the results from the original query -- just the results
after the AOI.
o Even if your HW2 tables had some problems, you probably don't need to
redo them to finish this assignment. The data won't be regraded.
o Input does not mean literal input to your program. It just means things
you need to be aware of before you jump in to writing it.
o Interactivity is not a requirement.
Homework 2 :: DUE MARCH 18 BEFORE CLASS
o Design a data warehouse based on the hurricane data provided (three years
of data). Two data sources with slightly different formats and
data
o Utilize a star or snowflake schema
o Implement the data warehouse utilizing relational tables in postgres
o Integrate the data from the two separate data sources and import into
your data warehouse
o Keep the data
warehouse as you will utilize it in your next homework
assignment
Hints
o There is no one right answer to this assignment.
Deliverables
o The schema for the data warehouse.
o The tables themselves (in PostGres).
Homework 1 :: DUE MARCH 11 BEFORE CLASS
o Implement a Data Warehouse using a star schema for the sales example in
Chapter 2 page 49
o Import the following data into your database. SalesFact.txt,
BranchDimension.txt, ItemDimension.txt, LocationDimension.txt,
TimeDimension.txt (data files)
o Use “\COPY” command in postgresql to copy data
o Implement a program that allows you to do the following :
- roll-up on total sales by year and by year/quarter
- roll-up on total sales by country, by state, and by city
- roll-up on total sales by item brand and by item type (digital or
analog)
- drill down on total sales by month and by day
- drill down on total sales by item name
- drill down on total sales by street address
o Have a simple user interface to allow activation of each of the
functions
o Use SQL statements to implement each roll-up, drill-down
Hints
o Example - Run drill down on total sales by street address in Query
Analyzer:
select sum(s.dollarssold) as totalsales, l.street from SalesFactTable s,
LocationDimensionTable l where l.locationkey = s.locationkey group by
l.street
o You can run psql from any of the unix machines with: psql -h pgsql
o \?, \h and \h [command] in psql will provide information
o Here's an example of a c program which interfaces with psql: test1.c. To compile it, log into pgsql and use gcc -lpq.
o The syntax of the copy command is: \copy [table name] from '[file name]'
using delimiters '[delimiter]'
o If you're getting errors running PostGres from a remote host, try
logging in to pgsql directly.
How to turn in your assignments
To turn in your postgres assignments type the following:
provide comp150dw hwX filenames
where filenames are the names of the files you want to submit, and X is
the number of the assignment.
Start with X = 1
You must always submit all your code and executables.
How to check your grades on assignments
To check your grades for your assignments, type:
progress comp150dw
Remember
To get credit for your assignments, you need to grant the TA permission to
access your tables. This can be done with the following psql command:
'grant all on table to kbutlr;'
where table is a comma seperated list of your tables' names
Lecture Slides
Lecture slides are in MS PowerPoint format.
Lectures
PostgreSQL Stuff
You can find the Web page for PostgreSQL here.
|