150ML: Information for first class exam 10/25/05
The exam will aim to test whether you have grasped the main concepts,
problems, ideas, and algorithms we have covered, and the intuition behind
them. Generally speaking, the exam will not test your technical
wizardry with equations or long calculations. But you should be able
to explain the main idea in a calculation or derivation and/or do
small examples.
The material includes everything discussed in class, and the textbook
reading up to 10/18.
One exception: our brief discussion of kernel methods is not included
in the first exam.
For topics not in the text, I do not expect you to know every detail in
the papers, but I do expect you to know the portions discussed in class.
Here is a list of topics we covered in class:
-
Machine learning problems:
Supervised, unsupervised and reinforcement learning.
-
Specifying a machine learning problem.
-
Decision Tree representation, basic learning algorithm, information
gain heuristic for choosing node functions,
other criteria and the intuition behind them,
the gain ratio heuristic.
Pruning: why? how? alternatives?
Real valued attributes. Missing values.
Skewing: when is it useful? how to do it.
-
Version spaces. Generalization/Specialization.
Most specific/general sets.
What they are and how they can be used.
-
Neural networks. Basic unit representation, examples as constraints on
weights. The GD, EG, EG+/- algorithms and their performance.
How to use on-line algorithms in a train-set/test-set scenario.
Longest survivor and voted perceptron.
Gradient descent for simple functions and for neural networks. The
back-propagation algorithm. Generalization/performance issues.
-
How to evaluate: accuracy, error rate, precision, recall, false
positive, ...
-
Confidence intervals. The basic ideas underlying them and
how they play out for the normal distribution.
What kinds of intervals we have; what they mean; how to use them in
machine learning contexts.
Training set/validation set split.
(Stratified) k-fold cross validation.
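
As an illustration of the kind of small example the decision tree topics lend themselves to, here is a minimal Python sketch of the information gain and gain ratio heuristics for a discrete attribute. The function names and the toy data are made up for illustration and are not taken from the text or the lectures; entropy is measured in bits.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(items)
    counts = Counter(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Reduction in label entropy from splitting on a discrete attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the labels inside each branch.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the split itself."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute takes a single value, so it cannot split
    return information_gain(values, labels) / split_info


# Toy data, invented for illustration: four examples, two classes.
labels = ["+", "+", "-", "-"]
perfect = ["a", "a", "b", "b"]   # separates the classes exactly
useless = ["a", "b", "a", "b"]   # tells us nothing about the class

print(information_gain(perfect, labels))
print(information_gain(useless, labels))
print(gain_ratio(perfect, labels))
```

The attribute that separates the classes exactly recovers the full one bit of label entropy, while the uninformative one recovers none. The gain ratio divides by the entropy of the attribute values, which penalizes attributes that split the data into many small branches.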
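
To make the confidence interval topic concrete, here is a minimal Python sketch of a two-sided interval for the true error rate of a classifier, given its errors on an independent test set. It assumes the usual normal approximation to the binomial, which is only reasonable when the test set is fairly large. The function name, the default z = 1.96 (roughly 95% confidence), and the numbers in the usage line are illustrative assumptions, not values from class.

```python
import math


def error_confidence_interval(num_errors, n, z=1.96):
    """Approximate two-sided confidence interval for the true error rate.

    num_errors: misclassified test examples
    n: total test examples (assumed drawn independently of training)
    z: normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("n must be positive")
    err = num_errors / n
    # Standard deviation of the sample error under the binomial model.
    half_width = z * math.sqrt(err * (1 - err) / n)
    # Clip to [0, 1], since an error rate cannot leave that range.
    return max(0.0, err - half_width), min(1.0, err + half_width)


# Hypothetical result: 12 mistakes on 100 held-out examples.
low, high = error_confidence_interval(12, 100)
print(f"sample error 0.12, interval about [{low:.3f}, {high:.3f}]")
```

The interval narrows in proportion to 1/sqrt(n), so quadrupling the test set roughly halves its width.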