Final Exam Notes
General Supervised Model
* Learner
Contains the “model”
* Evaluation or cost
based on output of Learner
* Algorithm
based on output of cost function
tweaks parameters of the learner
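The learner / cost / algorithm loop above can be sketched minimally. This is an illustrative example (all names are made up), assuming a 1-D linear model fit by gradient descent:

```python
# Sketch of the supervised-learning loop: learner holds the "model",
# cost evaluates its output, and the algorithm tweaks the parameters.
def learner(x, w, b):          # the "model": y ≈ w*x + b
    return w * x + b

def cost(preds, ys):           # evaluation: mean squared error
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def train(xs, ys, lr=0.1, steps=2000):
    w, b = 0.0, 0.0
    for _ in range(steps):     # algorithm: move parameters downhill
        dw = sum(2 * (learner(x, w, b) - y) * x for x, y in zip(xs, ys)) / len(xs)
        db = sum(2 * (learner(x, w, b) - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * dw
        b -= lr * db
    return w, b

xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]   # underlying line is y = 2x + 1
w, b = train(xs, ys)
```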
Types of Models
* (Regression used throughout)
* Nearest Neighbor
* Decision Trees
* Probability / Bayes classifiers
* Linear Models / perceptrons
Perceptron algorithm
Logistic MSE
SVM w/ hinge loss
* Neural Networks
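The perceptron algorithm listed above can be sketched as follows, assuming labels in {-1, +1} and a toy linearly separable set (data and names are illustrative):

```python
# Mistake-driven perceptron updates: only change w, b when a point
# is misclassified (y * score <= 0), adding y * x to the weights.
def perceptron(data, epochs=20):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            score = w[0] * x1 + w[1] * x2 + b
            if y * score <= 0:          # mistake: update toward the point
                w[0] += y * x1
                w[1] += y * x2
                b += y
    return w, b

data = [((2, 1), 1), ((3, 2), 1), ((-1, -1), -1), ((-2, 0), -1)]
w, b = perceptron(data)
```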
* Over/Under-fitting & Complexity
Features
Creation*
Selection
“Kernel” methods
Data
Size of test/train sets
Other Methods to control complexity
Regularization
Early stopping
Model parameters
Bagging
Boosting
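Of the complexity controls above, regularization is easy to show in closed form. A sketch for one feature, assuming an L2 (ridge) penalty `lam` (illustrative names):

```python
# Ridge regression with a single feature: minimizing
#   sum (w*x - y)^2 + lam * w^2
# gives w = Σxy / (Σx² + lam), so a larger lam shrinks w toward 0.
def ridge_weight(xs, ys, lam):
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs, ys = [1, 2, 3], [2, 4, 6]          # exact fit is w = 2
w0 = ridge_weight(xs, ys, 0.0)         # → 2.0 (no penalty)
w1 = ridge_weight(xs, ys, 14.0)        # → 1.0 (penalty halves the weight)
```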
Measure complexity
VC-Dimension
Shattering
Measure Over/Under-fitting*
Use holdout data to perform validation tests (Cross-validation)
Gives independent estimate of the model performance
Allows selection/optimization of model parameters
Examine position in the test/train error curves
Gives hint about over- or under-fitting
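The holdout/cross-validation idea above can be sketched minimally. Here the "model" is just the training mean, scored by squared error; the splitting scheme and names are illustrative:

```python
# k-fold cross-validation sketch: each fold is held out once, the model
# is fit on the rest, and fold errors are averaged for an independent
# estimate of performance.
def k_fold_cv(xs, ys, k=3):
    n = len(xs)
    fold_errs = []
    for i in range(k):
        test_idx = set(range(i, n, k))            # every k-th point held out
        train_y = [y for j, y in enumerate(ys) if j not in test_idx]
        pred = sum(train_y) / len(train_y)        # "train" the mean model
        errs = [(ys[j] - pred) ** 2 for j in test_idx]
        fold_errs.append(sum(errs) / len(errs))
    return sum(fold_errs) / k                     # averaged holdout error

cv_err = k_fold_cv(list(range(6)), [1, 2, 3, 4, 5, 6])
```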
Unsupervised Learning
Used to “understand” the data. Creates a new simplified representation of the data.
Clustering
Output a map $Z^i$ that assigns a data point $X^i$ to one of $k$ clusters.
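The assignment map $Z$ is what k-means produces. A 1-D sketch under assumed toy data (initial centers and names are illustrative):

```python
# 1-D k-means: Z assigns each X[i] to the nearest of k centers,
# then each center moves to the mean of its assigned points.
def kmeans_1d(X, centers, iters=10):
    for _ in range(iters):
        Z = [min(range(len(centers)), key=lambda c: (x - centers[c]) ** 2)
             for x in X]                              # assignment map Z
        for c in range(len(centers)):                 # update step
            pts = [x for x, z in zip(X, Z) if z == c]
            if pts:
                centers[c] = sum(pts) / len(pts)
    return Z, centers

X = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
Z, centers = kmeans_1d(X, centers=[0.0, 10.0])
```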
Hypothesis Testing
Want to add a standard of rigor to testing.
$ x \to f(x) \to err(f) $
$ x \to g(x) \to err(g) $
What is the confidence in each error result?
Test whether $err(f) < err(g)$: take $H_0$ to be "the errors are equal" and $H_1$ to be "$err(f) < err(g)$".
Compute a “score” that is large if $H_1$ is true, but low if $H_0$ is true.
Perform the test repeatedly: is $err(f) < err(g)$? Then use a confidence test such as Student's t-test to estimate the expectation of these comparisons. The data used to perform these tests must not have been used in the training process for either model.
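A paired version of this test can be sketched as below. The per-split error values are made up for illustration; the statistic is the standard paired t on the differences $d_i = err(f)_i - err(g)_i$:

```python
# Paired t statistic on per-split error differences.
# Under H0 (mean difference 0), |t| should be small; a large |t|
# supports H1. Compare t against the t distribution with n-1 dof.
import math

def paired_t(errs_f, errs_g):
    d = [a - b for a, b in zip(errs_f, errs_g)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)   # sample variance
    return mean / math.sqrt(var / n)

errs_f = [0.10, 0.12, 0.09, 0.11, 0.10]   # hypothetical held-out errors of f
errs_g = [0.14, 0.15, 0.13, 0.16, 0.14]   # hypothetical held-out errors of g
t = paired_t(errs_f, errs_g)              # strongly negative → f looks better
```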
Eigendecomposition and SVD
$\Sigma = V \Lambda V'$
$X = USV'$
$X'X = VS'U'USV'$
$X'X = V (S^2) V'$, so the eigenvalues of $X'X$ are the squared singular values of $X$: $\Lambda = S^2$.
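The identity above checks out numerically. A small example using NumPy's SVD (the matrix is arbitrary):

```python
# Verify that with X = U S V', the Gram matrix X'X equals V S² V'.
import numpy as np

X = np.array([[2.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
U, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = U diag(s) V'
lhs = X.T @ X                                      # X'X
rhs = Vt.T @ np.diag(s ** 2) @ Vt                  # V S² V'
ok = np.allclose(lhs, rhs)                         # → True
```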
Ensemble Methods
Bias and Variance