Schedule

  1. Introduction slides; assignment 01

    This lecture introduces machine learning, its applications, and the kinds of problems to which it can be applied. It also presents related concepts such as supervised and unsupervised learning, and generalization.

  2. Linear and logistic regression lecture notes; lab 01: introduction, lab 02: regression; assignment 02

    This lecture introduces parametric approaches to supervised learning and linear models. Linear regression is expressed as a maximum likelihood estimation problem. The discussed concepts include: (a) parametric methods, (b) maximum likelihood estimates, (c) linear regression, and (d) logistic regression.
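    The maximum-likelihood view can be made concrete: under Gaussian noise, maximizing the likelihood of a linear model is equivalent to minimizing squared error, which for a single feature has a closed form. A minimal sketch in plain Python (function name and data are illustrative, not from the course labs):

```python
def fit_linear(xs, ys):
    """Closed-form least-squares fit of y = w*x + b (the MLE under Gaussian noise)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    w = cov / var        # slope
    b = my - w * mx      # intercept
    return w, b

w, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w, b)  # data lie exactly on y = 2x + 1, so the fit recovers w=2, b=1
```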

  3. Dimension reduction slides; lab 03: dimension reduction

    This lecture discusses how to tackle high-dimensional learning problems and how to reduce dimensionality with principal component analysis (PCA).
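    The core of PCA can be sketched in a few lines for 2-D data: centre the data, form the 2x2 covariance matrix, and take its leading eigenvector as the first principal component (a plain-Python sketch with illustrative names; real labs would use a library routine):

```python
import math

def first_pc(points):
    """Leading principal component of 2-D data via the 2x2 covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    xs = [p[0] - mx for p in points]
    ys = [p[1] - my for p in points]
    sxx = sum(x * x for x in xs) / n
    syy = sum(y * y for y in ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) / n
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]] via the quadratic formula.
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam = tr / 2 + math.sqrt(max(tr * tr / 4 - det, 0.0))
    # A corresponding eigenvector, normalised to unit length.
    vx, vy = (sxy, lam - sxx) if sxy else (1.0, 0.0)
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

pts = [(0, 0), (1, 1), (2, 2), (3, 3)]   # data spread along the line y = x
v = first_pc(pts)
print(v)  # first PC points along (1, 1), normalised
```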

  4. Model evaluation and selection slides; lab 04: model assessment

    This lecture discusses model assessment for supervised machine learning. The discussed topics include: (a) training and test sets, (b) cross-validation, (c) the bootstrap, (d) measures of model complexity, and (e) performance metrics for classification and regression.
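    The index bookkeeping behind k-fold cross-validation can be sketched directly (a plain-Python illustration; the function name is ours): each example lands in exactly one test fold and in the training set of every other fold.

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k (train, test) index pairs."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds

folds = kfold_indices(6, 3)
for train, test in folds:
    print(train, test)   # each index appears in exactly one test fold
```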

  5. Regularized linear regression and nearest-neighbors methods slides; lab 05: kNN

    This lecture introduces regularization as a means of controlling the complexity of the hypothesis space and applies it to linear models. Non-parametric methods are then illustrated with nearest-neighbors approaches. The discussed topics are: the lasso, ridge regression, structured regularization, non-parametric learning, and k-nearest neighbors.
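    As an example of a non-parametric method, k-nearest neighbors needs no training step at all: prediction is a majority vote among the k closest training points. A minimal sketch (data are illustrative):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs; majority vote of the k nearest."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
print(knn_predict(train, (0.0, 0.5)))  # → a
```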

  6. Tree-based methods slides; lab 06.1: decision trees; lab 06.2: tree-based methods

    This lecture discusses decision tree approaches and shows how to combine simple classifiers to yield state-of-the-art predictors.
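    The basic building block of a decision tree is choosing the split that most reduces impurity. A sketch of Gini impurity and an exhaustive threshold search for one numeric feature (plain Python, illustrative names):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Threshold on a single feature minimising the weighted Gini impurity."""
    best = (float("inf"), None)
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[0]:
            best = (score, t)
    return best

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_split(values, labels))  # a perfect split (impurity 0) at threshold 3.0
```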

  7. Support vector machines

    This lecture introduces support vector machines from their principles in the case of linearly separable data and shows how positive-definite kernels can be used to extend the approach to non-linear separating functions.
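    The kernel trick shows up directly in the SVM decision function, f(x) = sum_i alpha_i * y_i * k(x_i, x) + b, where the x_i are support vectors. A sketch with a Gaussian (RBF) kernel (the coefficients below are illustrative, not a trained model):

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision(x, support, alphas, bias=0.0):
    """Kernel SVM decision value; support: (x_i, y_i) pairs with y_i in {-1, +1}."""
    return sum(a * y * rbf_kernel(xi, x)
               for a, (xi, y) in zip(alphas, support)) + bias

support = [((0.0, 0.0), -1), ((2.0, 2.0), +1)]   # toy support vectors
alphas = [1.0, 1.0]
print(decision((2.0, 2.0), support, alphas) > 0)  # True: classified as +1
```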

  8. Clustering

    This lecture introduces clustering, a common unsupervised learning problem. Its concepts are illustrated with hierarchical clustering, k-means, and DBSCAN.
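    Of these, k-means (Lloyd's algorithm) is the simplest to sketch: alternate between assigning each point to its nearest centre and recomputing each centre as the mean of its cluster (plain Python, illustrative data; a fixed seed keeps the initialisation reproducible):

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)              # initialise centres from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                         # assignment step
            i = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        centers = [                              # update step (keep an old centre
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)]    # if its cluster went empty)
    return centers

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cs = sorted(kmeans(pts, 2))
print(cs)  # one centre per blob: near (1/3, 1/3) and (31/3, 31/3)
```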

  9. Practical exam Jupyter notebook