Michael Data

Unsupervised Learning

Unlike Supervised Learning, here we have data but no class labels.

Unsupervised learning is frequently used as pre-processing for later tasks, e.g. applying Dimensionality Reduction before visualizing the data or learning from it.
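
As one illustration of the pre-processing use, here is a minimal sketch of dimensionality reduction with PCA computed through an SVD. PCA is only one possible choice; the function name pca_reduce and the random data are hypothetical and used purely for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal directions (no labels needed)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Hypothetical data: 200 unlabeled points in 10 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X_2d = pca_reduce(X, 2)  # 2-D coordinates, e.g. for a scatter plot
```

The reduced array X_2d can then be plotted directly or passed on to a later learning step.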

Sparse Over-complete

A data representation that induces sparsity in node responses across features. A representation with as many units as there are important dimensions in the data would be “complete”; an over-complete representation uses more units than that. Training then enforces sparsity on the internal node activations over time. The resulting model may not appear sparse, but the

The model is similar to an autoencoder, but uses a “sparsifying logistic” as the nonlinear transfer function on the internal layer.
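
To make the transfer function concrete, here is a sketch of a sparsifying logistic as I understand it from the paper linked at the end of this section: each unit's output is its current exponentiated input divided by a running, exponentially weighted sum of its past exponentiated inputs. The exact form, the parameter names eta and beta, the initial value zeta0 and the default values are assumptions to be checked against the paper.

```python
import numpy as np

def sparsifying_logistic(Z, eta=0.02, beta=1.0, zeta0=1.0):
    """Apply a sparsifying logistic to a sequence of code vectors.

    Z has shape (num_samples, num_units); rows are processed in order,
    because each unit's output depends on its own history.
    Assumed form: zbar_i(k) = eta * exp(beta * z_i(k)) / zeta_i(k), with
    zeta_i(k) = eta * exp(beta * z_i(k)) + (1 - eta) * zeta_i(k - 1).
    """
    Z = np.asarray(Z, dtype=float)
    zeta = np.full(Z.shape[1], float(zeta0))  # running sum, one per unit
    out = np.empty_like(Z)
    for k, z in enumerate(Z):
        num = eta * np.exp(beta * z)
        zeta = num + (1.0 - eta) * zeta  # update the unit's history
        out[k] = num / zeta              # always strictly between 0 and 1
    return out

# Unit 0 gets one large input at step 3; every other input is zero.
Z = np.zeros((6, 2))
Z[3, 0] = 10.0
codes = sparsifying_logistic(Z)
```

In this sketch a unit responds strongly only when its input is large relative to its own recent history, and it is suppressed for a while after firing. That is one way the sparsity "over time" can arise.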

The difference from a standard neural network is that, with the sparsified internal layer, you can perform a “deterministic EM” that alternately optimizes the weights and the internal node activations.
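
The alternation can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: it assumes a linear encoder Wc, a linear decoder Wd applied to the sparsified code, and an energy made of two squared-error terms. The function names, learning rates, step counts and the step-halving rule are all hypothetical choices.

```python
import numpy as np

def squash(z, zeta_prev, eta, beta):
    """Sparsifying logistic for one sample, given each unit's running sum."""
    num = eta * np.exp(beta * z)
    return num / (num + (1.0 - eta) * zeta_prev)

def energy(y, z, Wc, Wd, zeta_prev, eta, beta):
    """Assumed energy: reconstruction error plus code-prediction error."""
    zbar = squash(z, zeta_prev, eta, beta)
    return 0.5 * np.sum((y - Wd @ zbar) ** 2) + 0.5 * np.sum((z - Wc @ y) ** 2)

def infer_code(y, Wc, Wd, zeta_prev, eta, beta, steps=50, lr=0.5):
    """Code step: with weights fixed, descend the energy in z."""
    z = Wc @ y  # start from the encoder's prediction
    e = energy(y, z, Wc, Wd, zeta_prev, eta, beta)
    for _ in range(steps):
        zbar = squash(z, zeta_prev, eta, beta)
        # Chain rule through the logistic: d zbar / d z = beta * zbar * (1 - zbar).
        grad = -(Wd.T @ (y - Wd @ zbar)) * beta * zbar * (1.0 - zbar) + (z - Wc @ y)
        z_new = z - lr * grad
        e_new = energy(y, z_new, Wc, Wd, zeta_prev, eta, beta)
        if e_new < e:
            z, e = z_new, e_new  # accept only steps that lower the energy
        else:
            lr *= 0.5            # otherwise shrink the step size
    return z

def train(Y, n_code, epochs=5, eta=0.02, beta=1.0, lr_w=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n_in = Y.shape[1]
    Wc = 0.1 * rng.normal(size=(n_code, n_in))  # encoder weights
    Wd = 0.1 * rng.normal(size=(n_in, n_code))  # decoder weights
    zeta = np.ones(n_code)                      # per-unit running sums
    for _ in range(epochs):
        for y in Y:
            z = infer_code(y, Wc, Wd, zeta, eta, beta)
            zbar = squash(z, zeta, eta, beta)
            # Weight step: with the code clamped, one gradient step on each matrix.
            Wd += lr_w * np.outer(y - Wd @ zbar, zbar)
            Wc += lr_w * np.outer(z - Wc @ y, y)
            zeta = eta * np.exp(beta * z) + (1.0 - eta) * zeta
    return Wc, Wd, zeta

# Hypothetical data: 8 input dimensions, 16 code units (over-complete).
Y = np.random.default_rng(1).normal(size=(40, 8))
Wc, Wd, zeta = train(Y, n_code=16, epochs=2)
```

Both steps are plain deterministic optimizations with no sampling, which is presumably what the "deterministic" in "deterministic EM" refers to. After training, the encoder alone gives a fast approximate code for a new input.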

The resulting weights were successfully used to initialize the first layer of a convolutional network, which achieved good performance on MNIST.

See http://books.nips.cc/papers/files/nips19/NIPS2006_0804.pdf