Bakır, G. et al. (2007). Predicting structured data. Cambridge, Mass.: MIT Press.
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Chapelle, O., Schölkopf, B., and Zien, A. (2006). Semi-supervised learning. Cambridge, Mass.: MIT Press.
Charniak, E. (1993). Statistical language learning. Cambridge, Mass.: MIT Press.
Cristianini, N. and Shawe-Taylor, J. (2000). An introduction to support vector machines: and other kernel-based learning methods. Cambridge: Cambridge University Press.
Darwiche, A. (2009). Modeling and reasoning with Bayesian networks. Cambridge: Cambridge University Press.
Duda, R. O., Hart, P. E., and Stork, D. G. (2001). Pattern classification. New York: Wiley.
Getoor, L. and Taskar, B. (2007). Introduction to statistical relational learning. Cambridge, Mass.: MIT Press.
Jensen, F. V. and Nielsen, T. D. (2007). Bayesian networks and decision graphs. New York: Springer.
MacKay, D. J. C. (2003). Information theory, inference, and learning algorithms. Cambridge, UK: Cambridge University Press.
Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to information retrieval. New York: Cambridge University Press.
Mitchell, T. M. (1997). Machine learning. New York: McGraw-Hill.
Neapolitan, R. E. (2004). Learning Bayesian networks. Upper Saddle River, NJ: Pearson Prentice Hall.
Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian processes for machine learning. Cambridge, Mass.: MIT Press.
Schölkopf, B. and Smola, A. J. (2002). Learning with kernels: support vector machines, regularization, optimization, and beyond. Cambridge, Mass.: MIT Press.
Shawe-Taylor, J. and Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge, UK: Cambridge University Press.
Vapnik, V. N. (2000). The nature of statistical learning theory. New York: Springer.
Wasserman, L. (2006). All of nonparametric statistics. New York: Springer.
Wasserman, L. (2004). All of statistics: a concise course in statistical inference. New York: Springer.
Witten, I. H. and Frank, E. (2005). Data mining: practical machine learning tools and techniques. Amsterdam: Morgan Kaufmann.
Zhu, X. and Goldberg, A. B. (2009). Introduction to semi-supervised learning. Morgan and Claypool Publishers.
Learning Objectives
You will learn several fundamental and some advanced algorithms for statistical learning, acquire the basics of computational learning theory, and become able to design state-of-the-art solutions to application problems.
Prerequisites
A good knowledge of a programming language and a solid background in mathematics (calculus, linear algebra, and probability theory) are necessary prerequisites for this course. Previous knowledge of optimization techniques and statistics is useful but not strictly necessary.
There is a single final oral exam. You may choose the exam topic, but you are strongly advised to discuss it with me before you begin working on it.
Typically, you will be assigned a set of papers to read and you may be asked to reproduce some experimental results.
You will be required to give a short (30-minute) presentation during the exam. Please ensure that your presentation includes an introduction to the problem being addressed, a brief review of the relevant literature, a technical derivation of the methods, and, where appropriate, a detailed description of your experimental work. You may use multimedia tools to prepare your presentation. You are responsible for understanding all the relevant concepts and the underlying theory.
You may work in groups of two to carry out experimental work (groups of three are exceptional and must be clearly motivated). If you do so, please ensure that each person's contribution to the overall work is clearly identifiable.