# Michael's Wiki

#### General Supervised Model

* Learner
  * Contains the “model”
* Evaluation or cost
  * Based on the output of the learner
* Algorithm
  * Based on the output of the cost function
  * Tweaks the parameters of the learner

### Types of Models

* (Regression used throughout)
* Nearest Neighbor
* Decision Trees
* Probability / Bayes classifiers
* Linear Models / Perceptrons
  * Perceptron algorithm
  * Logistic MSE
  * SVM w/ hinge loss
* Neural Networks
* Over/Under-fitting & Complexity
  * Features
    * Creation
    * Selection
    * “Kernel” methods
  * Data
    * Size of test/train sets
  * Other methods to control complexity
    * Regularization
    * Early stopping
    * Model parameters
    * Bagging
    * Boosting
  * Measure complexity
    * VC-Dimension
      * Shattering
  * Measure over/under-fitting
    * Use holdout data to perform validation tests (cross-validation)
      * Gives an independent estimate of model performance
      * Allows selection/optimization of model parameters
    * Examine position on the test/train error curves
      * Gives a hint about over- or under-fitting
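The cross-validation idea above can be sketched as a k-fold index splitter. This is a minimal illustration, not from the original notes; the function name `k_fold_splits` is my own:

```python
import random

def k_fold_splits(n, k, seed=0):
    """Partition indices 0..n-1 into k disjoint folds.

    Each fold serves once as the held-out validation set while the
    remaining folds form the training set, giving an independent
    estimate of model performance.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val
```

Averaging the validation error over all k folds gives the performance estimate used to select model parameters.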

#### Unsupervised Learning

Unsupervised learning is used to “understand” the data: it creates a new, simplified representation of the data.

### Clustering

Clustering outputs a map $Z^i$ that assigns each data point $X^i$ to one of $k$ clusters.

##### Hypothesis Testing

We want to add a standard of rigor to testing. Given two models, $x \to f(x) \to err(f)$ and $x \to g(x) \to err(g)$: what is the confidence in each error result?

* Estimate the mean and variance of $err(x)$

Test $f < g$

* Null hypothesis $H_0$: $f = g$
* Alternative hypothesis $H_1$: $f < g$

Compute a “score” that is large if $H_1$ is true, but is low if $H_0$ is true.

* $p$-value: $\Pr(s > \text{observed score} \mid H_0)$
  * NOT the probability that $H_0$ or $H_1$ is true

Perform the test repeatedly: is $err(f) < err(g)$? Then use a confidence test such as Student's t-test to estimate the expectation of these results. The data used to perform these tests must not have been used in the training process for either model.
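The repeated comparison above can be sketched as a paired t statistic on per-test error differences. A minimal version using only the standard library; the function name `paired_t_statistic` is my own:

```python
import math
import statistics

def paired_t_statistic(errs_f, errs_g):
    """t statistic for H0: f = g vs H1: f < g, from paired error measurements.

    A large positive t (err_g - err_f consistently positive) favors H1;
    compare against a quantile of the t distribution with n-1 degrees
    of freedom to get a p-value.
    """
    diffs = [eg - ef for ef, eg in zip(errs_f, errs_g)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)        # sample std dev (n-1 denominator)
    return mean / (sd / math.sqrt(n))

# Hypothetical per-fold errors: f is consistently better than g.
t = paired_t_statistic([0.10, 0.12, 0.11, 0.09, 0.10],
                       [0.15, 0.16, 0.14, 0.15, 0.17])
```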

##### Eigendecomposition and SVD

$\Sigma = V \Lambda V'$ (eigendecomposition of the covariance matrix)

$X = USV'$ (SVD of the data matrix)

$X'X = VS'U'USV'$

$X'X = V (S^2) V'$ (since $U'U = I$; the eigenvalues of $X'X$ are the squared singular values of $X$)
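The relationship can be checked numerically. A quick sketch, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))

# SVD of X: S holds the singular values, rows of Vt the right singular vectors.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Eigenvalues of X'X (eigvalsh returns them in ascending order); these
# should match the squared singular values of X.
eigvals = np.linalg.eigvalsh(X.T @ X)
```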

##### Bias and Variance 