Contents for Stat542 vary from semester to semester and are subject to change/revision at the instructor’s discretion. The contents below are from Spring 2019. UIUC students can access lecture videos [Here]. Please send your comments to liangf AT illinois DOT edu.
Index
[Week 0: Prerequisite] [Week 1: Introduction]
[Week 2: Linear Regression] [Week 3: Variable Selection and Regularization]
[Week 4: Regression Trees] [Week 5: Nonlinear Regression]
[Week 6: Clustering Analysis] [Week 7: Latent Structure Models]
[Week 9: Discriminant Analysis] [Week 10: Logistic Regression]
[Week 11: Support Vector Machine] [Week 12: Classification Trees]
[Week 13: Recommender System] [Week 14: Brief Introduction to Deep Learning]
ESL = Elements of Statistical Learning; ISLR = An Introduction to Statistical Learning
-
- Week 0: Prerequisite [Back_to_Index]
- Notes: [W0_Prerequisite_Stat425.pdf]
- Some beginning chapters from the [Deep Learning Book]
-
- Week 1: Introduction [Back_to_Index]
- Reading: chap 1, 2, 5 (ISLR); chap 1, 2.1-2.5, 13.3 (ESL); [ML Cheatsheet]
- Notes:
[W1.1_Introduction_Statistical_Learning.pdf]
[W1.2_kNN_vs_LinearRegression.pdf]
[W1.2_kNN_vs_LinearRegression_figs.pdf]
[W1.3_Introduction_LearningTheory.pdf] (Optional) - Code:
- Contents:
- 1. Introduction to Statistical Learning
- 1.1 Types of statistical learning problems
- 1.2 Challenge of supervised learning
- 1.3 Curse of dimensionality [COD for Classification]
- 1.4 Bias and variance tradeoff
- A glimpse of learning theory (Optional)
- 2. Least squares vs. nearest neighbors
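- A minimal R sketch of item 2 (least squares vs. nearest neighbors) on simulated data; the data-generating model and k = 5 are illustrative choices, not the course's code:

    # Compare a linear classifier (LS on 0/1 labels) with 5-NN on a random split
    library(class)                        # provides knn()
    set.seed(542)
    n <- 200
    x <- matrix(rnorm(2 * n), n, 2)       # two predictors
    y <- as.numeric(x[, 1] + x[, 2] + rnorm(n) > 0)
    train <- sample(n, n / 2)

    fit.lm <- lm(y ~ x, subset = train)   # regress the 0/1 label, threshold at 0.5
    pred.lm <- as.numeric(cbind(1, x[-train, ]) %*% coef(fit.lm) > 0.5)

    pred.knn <- knn(x[train, ], x[-train, ], factor(y[train]), k = 5)

    mean(pred.lm != y[-train])            # test error: least squares
    mean(pred.knn != y[-train])           # test error: kNN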
-
- Week 2: Linear Regression [Back_to_Index]
- Reading: chap 3 (ISLR); chap 3.1-3.2 (ESL)
- Notes:
[W2.1_LinearRegression_MLR.pdf]
[W2.2_LinearRegression_Geometry.pdf]
[W2.3_LinearRegression_Practice.pdf] - Code: W2_LinearRegression [Rcode] [Python_1] [Python_2]
- Contents:
- 1. Multiple linear regression
- 1.1 LS setup
- 1.2 LS principle
- 1.3 LS estimate
- 1.4 LS output
- 2. Geometric interpretation
- 2.1 Basic concepts in vector spaces
- 2.2 LS and projection
- 2.3 Properties of LS regression: R-square
- 2.4 Properties of LS regression: linear transformation
- 2.5 Properties of LS regression: rank deficiency
- 3. Practical issues
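- A short R sketch connecting items 1 and 2: the LS estimate from the normal equations and the hat-matrix projection agree with lm(). The mtcars data and the two predictors are arbitrary illustrations:

    X <- model.matrix(~ wt + hp, data = mtcars)   # design matrix with intercept
    y <- mtcars$mpg

    b <- solve(t(X) %*% X, t(X) %*% y)            # LS estimate: solve (X'X) b = X'y

    H <- X %*% solve(t(X) %*% X) %*% t(X)         # hat matrix: projection onto col(X)
    yhat <- drop(H %*% y)                         # fitted values = projection of y

    all.equal(drop(b), coef(lm(mpg ~ wt + hp, data = mtcars)),
              check.attributes = FALSE)           # TRUE: identical estimates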
-
- Week 3: Variable Selection and Regularization [Back_to_Index]
- Reading:
- chap 6, 10.2 (ISLR); chap 3, 7 (ESL);
- chap 2 (SLS) [SLS book in pdf]
- CMU notes on sparsity
- Notes:
[W3_VariableSelection.pdf]
[W3_More_on_Ridge_Lasso.pdf]
[W3_VariableSelection_appendix.pdf] (Optional)
[W3_PCR.pdf] - Code:
[Rcode_W3_VarSel_SubsetSelection.html]
[Rcode_W3_VarSel_RidgeLasso.html]
[Rcode_W3_VarSel_ToyExample.html]
[Rcode_W3_VarSel_DiabetesData.html]
[Glmnet Vignette (html)] - Contents:
- 1. Subset Selection
- 1.1 Introduction to subset selection
- 1.2 Selection criteria
- 1.3 AIC vs BIC
- 1.4 Search algorithms
- 1.5 Subset selection in R
- 1.6 Variable screening
- 2. Introduction to Regularization
- 3. Ridge Regression
- 3.1 Ridge regression
- 3.2 The shrinkage effect
- 3.3 Why shrinkage
- 3.4 Degree-of-freedom of ridge regression
- 3.5 Ridge regression in R
- 4. Lasso Regression
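- A minimal glmnet sketch for items 3 and 4 (ridge vs. lasso); the mtcars data and lambda.min rule are illustrative, and the course's own examples are in the R code links above:

    library(glmnet)
    X <- model.matrix(mpg ~ ., data = mtcars)[, -1]   # predictor matrix, intercept dropped
    y <- mtcars$mpg

    cv.ridge <- cv.glmnet(X, y, alpha = 0)    # alpha = 0: ridge penalty
    cv.lasso <- cv.glmnet(X, y, alpha = 1)    # alpha = 1: lasso penalty

    coef(cv.ridge, s = "lambda.min")          # all coefficients shrunk, none exactly zero
    coef(cv.lasso, s = "lambda.min")          # sparse fit: some coefficients exactly zero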
-
- Week 4: Regression Trees [Back_to_Index]
- Reading: chap 8.1-8.2 (ISLR); chap 9.2, 15 (ESL)
[A visual introduction to tree models] - Notes:
[W4.1_RegressionTree.pdf]
[W4.2_Regression_EnsembleTrees.pdf]
[W4.3_Regression_EnsembleTrees_Keynote.pdf] - Code:
[Rcode_W4_RegressionTree.html]
[Rcode_W4_Regression_RandomForest.html]
[Rcode_W4_Regression_GBM.html] [XGBoost] - Contents:
- 1. Regression Trees
- 1.1 Introduction to tree models
- 1.2 How to build a tree
- 1.3 Prune a tree
- 1.3.1 Complexity cost
- 1.3.2 The weakest link algorithm
- 1.3.3 Cross-validation
- 1.4 Regression tree in R
- 1.5 Handle categorical predictors
- 2. Random Forest
- 2.1 Introduction
- 2.2 Random forest in R
- 2.3 Out-of-bag samples
- 2.4 Variable importance
- 3. GBM
- 3.1 Boosting
- 3.2 GBM in R [GBM_Regression]
- 3.3 Discussion
- GBM is no longer actively maintained [link]. XGBoost is another choice for boosted tree models.
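- A bare-bones R sketch of the three models above (tree, random forest, GBM) on mtcars; the tuning values are illustrative only:

    library(rpart); library(randomForest); library(gbm)
    set.seed(542)

    fit.tree <- rpart(mpg ~ ., data = mtcars)     # single regression tree (item 1)
    printcp(fit.tree)                             # cp table: complexity cost vs. CV error (item 1.3)

    fit.rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)
    fit.rf$mse[fit.rf$ntree]                      # OOB estimate of test MSE (item 2.3)
    importance(fit.rf)                            # variable importance (item 2.4)

    fit.gbm <- gbm(mpg ~ ., data = mtcars, distribution = "gaussian",
                   n.trees = 500, interaction.depth = 2, shrinkage = 0.05)  # item 3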
-
- Week 5: Nonlinear Regression [Back_to_Index]
- Reading:
- chap 7 (ISLR), chap 5.2-5.5, 6.1, 9.1 (ESL)
- For gam (Generalized Additive Model), read chap 7.7 (ISLR) and check [GAM: The Predictive Modeling Silver Bullet] and
[Analyzing seasonal time series with GAM] - For mgcv, check [mgcv: GAMs in R] [mgcv]
- May discuss [Gaussian Processes] if time allows.
- Notes: [W5_NonlinearRegression.pdf]
- Code:
[Rcode_W5_PolynomialRegression.html]
[Rcode_W5_RegressionSpline.html]
[Rcode_W5_SmoothingSpline.html]
[Rcode_W5_LocalSmoother.html] - Contents:
- 1. Polynomial Regression
- 2. Regression Splines
- 2.1 Cubic Splines
- 2.2 Regression Splines
- 2.3 Regression Splines in R
- 2.4 Choice of Knots
- 2.5 Summary
- 3. Smoothing Splines
- 3.1 Smoothing Splines
- 3.2 Fit a Smoothing Spline Model
- 3.3 Smoothing Splines in R
- 3.4 Choice of Lambda
- 3.5 Summary
- 4. Local Regression
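- The three smoothers in one R sketch (regression spline, smoothing spline, local regression); df = 5 and span = 0.75 are illustrative, not recommendations:

    library(splines)                      # bs() for the cubic spline basis
    x <- mtcars$hp; y <- mtcars$mpg

    fit.bs <- lm(y ~ bs(x, df = 5))       # regression spline (item 2)
    fit.ss <- smooth.spline(x, y)         # smoothing spline; lambda picked by GCV (item 3.4)
    fit.lo <- loess(y ~ x, span = 0.75)   # local regression (item 4)

    fit.ss$lambda                         # the selected smoothing parameter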
-
- Week 6: Clustering Analysis [Back_to_Index]
- Reading: chap 10 (ISLR); chap 14 (ESL)
- Notes: [W6_Clustering.pdf]
- Code and R Packages:
- [Rcode_W6_Clustering.html]
- [https://uc-r.github.io/kmeans_clustering]
- R packages: [Kernlab] [fpc]
- [Visualizing K-means Clustering]
- Spectral clustering: [Tutorial@ICML04] [A blog]
- Contents:
- 1. Choice of Distance Measures
- 2. Multidimensional Scaling
- 3. K-means and K-medoids
- 3.1 The K-means Algorithm
- 3.2 Dimension Reduction
- 3.3 Other Distance Measures
- 3.4 The K-medoids Algorithm
- 4. Choice of K
- 5. Hierarchical Clustering
- 6. Vector Quantization
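- A minimal R sketch of items 3 and 5 on the built-in USArrests data (K = 4 is an arbitrary illustration; item 4 discusses how to choose K):

    X <- scale(USArrests)                 # standardize before distance-based clustering (item 1)

    km <- kmeans(X, centers = 4, nstart = 25)   # K-means (item 3.1); nstart guards against bad starts
    table(km$cluster)

    hc <- hclust(dist(X), method = "complete")  # hierarchical clustering (item 5)
    cutree(hc, k = 4)                           # cut the dendrogram into 4 clusters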
-
- Week 7: Latent Structure Models [Back_to_Index]
- Notes: [W7_Mixture_EM.pdf] [W7_HMM.pdf]
- Code and R Packages:
- [Rcode_W7_MixtureModel.html] [Rcode_W7_LDA.html] [Rcode_W7_HMM.html]
- [mclust]
- HMM packages: [hmm.discnp] [HiddenMarkov] [HMM] [depmixS4]
- Reading:
- “A view of the EM algorithm that justifies incremental, sparse, and other variants” by R. M. Neal and G. E. Hinton [pdf]
- “A tutorial on HMM”, by Lawrence R. Rabiner (1989) [pdf]
- “A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models” by Jeff A. Bilmes [pdf]
- Contents:
- 1. Introduction to Model-based Clustering
- 2. Mixture Models
- 3. A Simple Two-component Gaussian Mixture
- 3.1 Gaussian Mixtures
- 3.2 KL Distance
- 3.3 An Iterative Algorithm
- 4. The EM Algorithm
- 4.1 The EM Algorithm
- 4.2 Why It Works
- 4.3 Connection with K-means
- 4.4 Variational EM
- 5. The Latent Dirichlet Allocation Model
[Your Guide to Latent Dirichlet Allocation] - 6. Hidden Markov Models [youtube]
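- A bare-bones EM loop for the two-component Gaussian mixture of items 3-4, on simulated data (the mclust package above does this properly; the starting values here are deliberately crude):

    set.seed(542)
    x <- c(rnorm(100, 0, 1), rnorm(100, 4, 1))   # simulated two-component data
    p <- 0.5; mu <- c(-1, 1); s <- c(1, 1)       # crude starting values
    for (it in 1:100) {
      # E-step: posterior probability that each point belongs to component 2
      d1 <- (1 - p) * dnorm(x, mu[1], s[1])
      d2 <- p * dnorm(x, mu[2], s[2])
      g <- d2 / (d1 + d2)
      # M-step: weighted updates of the mixing weight, means, and sds
      p  <- mean(g)
      mu <- c(sum((1 - g) * x) / sum(1 - g), sum(g * x) / sum(g))
      s  <- c(sqrt(sum((1 - g) * (x - mu[1])^2) / sum(1 - g)),
              sqrt(sum(g * (x - mu[2])^2) / sum(g)))
    }
    round(c(p, mu, s), 2)                 # should be close to the true (0.5, 0, 4, 1, 1)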
-
- Week 8: Break
-
- Week 9: Discriminant Analysis [Back_to_Index]
- Reading: chap 4 (ISLR); chap 4 (ESL)
- Notes:
[W9.1_DA] [W9.2_DA] [W9.3_DA] [W9.4_DA] - Code:
[Rcode_W9_LDA_QDA.html]
[Rcode_W9_NaiveBayes_disagreement.html] - Contents:
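- A minimal LDA/QDA sketch with MASS, in the spirit of the R code above (the iris data and the 100-point training split are illustrative):

    library(MASS)                         # lda() and qda()
    set.seed(542)
    train <- sample(nrow(iris), 100)

    fit.lda <- lda(Species ~ ., data = iris, subset = train)
    fit.qda <- qda(Species ~ ., data = iris, subset = train)

    mean(predict(fit.lda, iris[-train, ])$class != iris$Species[-train])  # LDA test error
    mean(predict(fit.qda, iris[-train, ])$class != iris$Species[-train])  # QDA test error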
-
- Week 10: Logistic Regression [Back_to_Index]
- Reading: chap 4.3 (ISLR); chap 4.4 (ESL)
[SGD for Logistic Regression] - Notes:
[W10.1_LogisticRegression.pdf]
[W10.2_LogisticRegression.pdf]
[W10.3_LogisticRegression.pdf] - Code:
[Rcode_W10_LogisticReg.html]
[Rcode_W10_LogisticReg_Nonlinear.html] - Contents:
- 1. Setup
- 2. MLE
- 3. Separable Data
- 4. Logistic Regression in R
- 5. Variable Selection
- 6. Retrospective Sampling Data
- 7. Nonlinear Logistic Regression
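- A minimal glm() sketch for items 1-4 (the mtcars data and the 0.5 cutoff are illustrative):

    fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)   # logistic MLE
    summary(fit)$coefficients                 # estimates, standard errors, Wald z-tests
    phat <- predict(fit, type = "response")   # fitted probabilities P(am = 1 | x)
    mean((phat > 0.5) != mtcars$am)           # training misclassification rate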
-
- Week 11: Support Vector Machine [Back_to_Index]
- Reading: chap 9 (ISLR); chap 12.2-12.3 (ESL)
- Notes:
[W11_SVM_Nontechnical-Intro.pdf]
[W11.1_SVM.pdf]
[W11.2_SVM.pdf]
[W11.3_SVM.pdf]
[W11.4_SVM.pdf]
[W11_RKHS.pdf] (Optional) - R packages:
[klaR] [kernlab] [e1071] [LibLinearR] [LiquidSVM] [SVM-Light]
Review paper: Support vector machines in R (2006) - Contents:
- 1. Linear SVM for separable case
- 1.1 The max margin problem
- 1.2 KKT conditions
- 1.3 Primal and dual relationship
- 1.4 Summary for the separable case
- 2. The non-separable case
- 2.1 How to handle non-separable data
- 2.2 Soft-margin SVM
- 3. Practical Issues
- 3.1 From binary decision to probabilities
- 3.2 Multi-class SVMs
- 4. Nonlinear SVMs
- 5. Summary
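- A minimal sketch with e1071 (one of the packages above) contrasting a linear soft-margin SVM (item 2.2) with an RBF-kernel SVM (item 4) on simulated data with a circular boundary; cost = 1 is an illustrative choice:

    library(e1071)
    set.seed(542)
    n <- 200
    x <- matrix(rnorm(2 * n), n, 2)
    y <- factor(ifelse(x[, 1]^2 + x[, 2]^2 > 1.5, 1, -1))   # circular class boundary
    dat <- data.frame(x = x, y = y)

    fit.lin <- svm(y ~ ., data = dat, kernel = "linear", cost = 1)
    fit.rbf <- svm(y ~ ., data = dat, kernel = "radial", cost = 1)

    mean(fitted(fit.lin) != dat$y)        # the linear SVM cannot capture the circle
    mean(fitted(fit.rbf) != dat$y)        # the kernel SVM does much better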
-
- Week 12: Classification Trees [Back_to_Index]
- Reading: chap 8 (ISLR); chap 9.2, 10, 15 (ESL)
- Notes:
[W12_ClassificationTree_Boosting.pdf]
[W12_LinearClassifiers_Comparison.pdf] - Code:
- [Rcode_W12_ClassificationTree.html]
- [Rcode_W12_Classification_RandomForest.html]
- [Rcode_W12_old_boost.html]: Due to a recent update of the GBM package, I cannot reproduce the results for Toy Example II. The old output is in [Rcode_W12_old_boost_CVouput.pdf]
- R packages: [XGBoost] [lightGBM] [CatBoost] [gbm3]
- Introduction on Boosted Trees [slide] by the original author of XGBoost
- Contents:
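- A minimal sketch of a classification tree plus boosted trees with rpart and XGBoost (both linked above); the binary iris target and tuning values are illustrative:

    library(rpart); library(xgboost)
    y <- as.numeric(iris$Species == "versicolor")   # a binary target for illustration
    X <- as.matrix(iris[, 1:4])
    df <- data.frame(X, y = factor(y))

    fit.tree <- rpart(y ~ ., data = df, method = "class")   # classification tree

    dtrain <- xgb.DMatrix(X, label = y)
    fit.xgb <- xgb.train(params = list(objective = "binary:logistic", max_depth = 2),
                         data = dtrain, nrounds = 50)       # boosted trees
    mean((predict(fit.xgb, X) > 0.5) != y)                  # training error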
-
- Week 13: Recommender System [Back_to_Index]
- Reading:
- Notes from Stanford’s Mining of Massive Datasets [pdf]
- The BellKor Solution to the Netflix Grand Prize [pdf]
- Winning the Netflix Prize: A Summary [blog]
- Amazon.com Recommendations Item-to-Item Collaborative Filtering [pdf]
- The Netflix Recommender System: Algorithms, Business Value, and Innovation [pdf]
- Factorization machines [link]
- Google’s wide and deep model [link]
- Deep Neural Networks for YouTube Recommendations [link]
- Notes: [W13_RecommenderSystem.pdf]
- Code and R Packages:
- [Rcode_W13_RS.html] [recommenderlab]
- Recommender systems 101 [link]
- Recommendation system in R [link]
- Building a movie recommendation engine with R [link]
- Using the R package recommenderlab for predicting ratings for MovieLens data: https://ashokharnal.wordpress.com/2014/12/18/using-recommenderlab-for-predicting-ratings-for-movielens-data/
- Explore movie ratings
https://www.kaggle.com/ekim01/d/grouplens/movielens-20m-dataset/explore-movie-ratings
- Contents:
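- A minimal recommenderlab sketch (the package linked above), using the MovieLense data that ships with it; the 800/10 user split and UBCF method are illustrative:

    library(recommenderlab)
    data(MovieLense)                      # MovieLens ratings bundled with the package

    rec <- Recommender(MovieLense[1:800], method = "UBCF")   # user-based collaborative filtering
    pred <- predict(rec, MovieLense[801:810], n = 5)         # top-5 items per held-out user
    as(pred, "list")[[1]]                 # recommendations for the first of these users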
-
- Week 14: Brief Introduction to Deep Learning [Back_to_Index]
- Reading:
- a web book on deep learning by Alex Smola and others at Amazon:
http://en.diveintodeeplearning.org/ - Deep Learning book by Goodfellow et al.
https://www.deeplearningbook.org/ - [https://keras.io] [https://keras.rstudio.com]
- An overview of gradient descent optimization algorithms
- Notes: [lec_W14_DeepLearning]
- Code:
- Use GPU on AWS [http://www.louisaslett.com/RStudio_AMI]
- [Rcode_W14_NN_FashionMnist.html] — FashionMnist image data
- [Rcode_W14_NN_MovieReview.html] — Sentiment Analysis on Movie Review using one-hot encoding and word embedding
- [Rcode_W14_NN_RNN_LSTM.html] — Sentiment Analysis on Movie Review using RNN and LSTM
- [Rcode_W14_NN_CNN.html] — CIFAR-10 image classification using CNN
- [Rcode_W14_NN_VAE.html] — VAE on MNIST data
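- A minimal keras-in-R sketch (see the keras links above); it assumes keras and a Python backend are installed, and the two-layer architecture is illustrative, not one of the course models:

    library(keras)
    mnist <- dataset_mnist()
    x <- array_reshape(mnist$train$x, c(60000, 784)) / 255   # flatten 28x28 images
    y <- to_categorical(mnist$train$y, 10)

    model <- keras_model_sequential() %>%
      layer_dense(units = 128, activation = "relu", input_shape = 784) %>%
      layer_dense(units = 10, activation = "softmax")

    model %>% compile(optimizer = "adam", loss = "categorical_crossentropy",
                      metrics = "accuracy")
    model %>% fit(x, y, epochs = 5, batch_size = 128, validation_split = 0.1)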
-