Machine learning is the science of collecting and analysing data and turning it into predictions, encapsulated knowledge, or actions. Data can be obtained in many ways and settings, there are many different models and algorithms for data analysis, and there are many potential applications. In recent years machine learning has attracted attention due to commercial successes and widespread use.
The course gives a broad introduction to machine learning aimed
at upper-level undergraduates and beginning graduate students.
Some mathematical aptitude is required, but generally speaking we focus on
baseline algorithms, practical aspects, and breadth, and leave detailed analysis to more advanced courses:
Statistical Pattern Recognition; Computational Learning Theory; Learning, Planning and Acting in Complex Environments.
Syllabus:
An overview of methods whereby computers can learn from data or experience and make decisions accordingly. Topics include supervised learning, unsupervised learning, reinforcement learning, and knowledge extraction from large databases with applications to science, engineering, and medicine.
Prerequisites:
COMP 15 and COMP/MATH 22 or 61, or consent of instructor. (COMP 160 is highly recommended.)
Class Times:
Monday and Wednesday, 4:30-5:45, Halligan Hall 111A
You may
discuss the problems and general ideas about their solutions with
other students, and similarly you may consult other textbooks or the
web. However, you must work out the details on your own and code or write out the
solution on your own. Every such collaboration (either getting help or
giving help) and every use of text or electronic sources must be
clearly cited and acknowledged in the submitted homework. Failure to
follow these guidelines may result in disciplinary action for all
parties involved. If you have any questions about this or other issues concerning
academic integrity, please consult the
booklet available from the office of the Dean
of Student Affairs.
Tentative List of Topics
[We may not cover all sub-topics]
Supervised Learning Basics:
decision trees,
rules, nearest neighbors, perceptrons, logistic regression, and neural networks.
Feature processing and selection, and experimental evaluation.
Learning with generative probabilistic models:
maximum likelihood; Bayesian prediction; naive Bayes algorithm.
Unsupervised and semi-supervised learning:
clustering algorithms;
generative probabilistic models;
the EM algorithm; semi-supervised learning.
Additional topics:
kernel methods and support vector machines;
aggregation methods and boosting;
active learning;
using relations among examples (relational learning; network analysis);
reinforcement learning;
computational learning theory.
Textbooks and Material Covered
No single text covers all the material for this course at the right level.
We have the following texts on reserve in the library. Other material will be selected from research and survey articles or other notes.
Detailed reading assignments and links to material will be posted.
[M]: Machine Learning. Tom M. Mitchell. McGraw-Hill, 1997.
[CST]: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. N. Cristianini and J. Shawe-Taylor. Cambridge University Press, 2000.
[WF]: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition. Ian H. Witten and Eibe Frank. 2005. [Describes algorithms and background on the Weka system]
[F]: Machine Learning: The Art and Science of Algorithms that Make Sense of Data. Peter Flach. Cambridge University Press, 2012.
[A]: Introduction to Machine Learning, 2nd edition. Ethem Alpaydin. MIT Press, 2010.
[DHS]: Pattern Classification, 2nd edition. R. Duda, P. Hart, and D. Stork. John Wiley & Sons, 2001.
[RN]: Artificial Intelligence: A Modern Approach, 3rd edition. Stuart Russell and Peter Norvig. Prentice Hall, 2010.
Software
Reading, References, and Assignments
Topic | Reading/Assignments | Due Date
Introduction to Machine Learning | Read the introductory chapter of [M], [WF], [F] or [A] |