Deep Learning for Beginners

Notes for "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

Machine Learning

Generalization and Overfitting.

Feedforward Networks

Feedforward networks represent y = f(x) with a family of functions: u = f(x; θ)

Designing the Output Layer.

The most common output layer is: f(x;M,b)=g(Mx+b)
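A minimal sketch of this output layer, with g chosen as softmax for a 3-class classifier (the weight values here are illustrative placeholders, not from the text):

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def output_layer(x, M, b):
    # f(x; M, b) = g(Mx + b), with g = softmax
    return softmax(M @ x + b)

M = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 classes, 2 inputs (placeholder)
b = np.zeros(3)
x = np.array([0.5, -0.5])
p = output_layer(x, M, b)
print(p.sum())  # softmax output is a probability vector, so this is 1.0
```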

Finding θ

Find θ by solving the following optimization problem, where J is the cost function:

\min_\theta J(\theta)
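In practice this minimization is done iteratively. A minimal sketch using plain gradient descent on a toy quadratic cost (the cost and its gradient here are assumptions for illustration; in training, J is the network's loss and gradients come from backpropagation):

```python
import numpy as np

def J(theta):
    # toy cost with minimizer theta = [3, 3] (placeholder for a real loss)
    return np.sum((theta - 3.0) ** 2)

def grad_J(theta):
    return 2.0 * (theta - 3.0)

theta = np.zeros(2)
lr = 0.1                      # learning rate (assumed hyperparameter)
for _ in range(200):
    theta -= lr * grad_J(theta)
print(theta)  # converges toward the minimizer [3, 3]
```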

Choosing the Cost Function

Regularization

Deep Feedforward Networks

Deep feedforward networks instead use: u = f(x; \theta) = f^N(\ldots f^1(x; \theta^1) \ldots; \theta^N)
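The nesting f^N(… f^1(x) …) is just function composition. A minimal sketch, with two stand-in layer functions (the particular f^1, f^2 are assumptions for illustration):

```python
# Compose a list of per-layer functions [f1, ..., fN] into one network f.
def compose(fs):
    def f(x):
        for fn in fs:        # apply f^1 first, f^N last
            x = fn(x)
        return x
    return f

f1 = lambda x: 2 * x         # stand-in for f^1(x; theta^1)
f2 = lambda x: x + 1         # stand-in for f^2(x; theta^2)
f = compose([f1, f2])
print(f(3))  # f^2(f^1(3)) = 2*3 + 1 = 7
```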

Designing Hidden Layers.

The most common hidden layer is an affine map followed by a nonlinearity g: f^n(x) = g(M^n x + b^n)
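A sketch of one such hidden layer with g = ReLU, a common choice of nonlinearity (the weight values are placeholders):

```python
import numpy as np

def relu(z):
    # elementwise max(0, z)
    return np.maximum(0.0, z)

M = np.array([[1.0, -1.0], [0.5, 0.5]])  # placeholder weights
b = np.array([0.0, -2.0])                # placeholder biases
x = np.array([2.0, 1.0])
h = relu(M @ x + b)                      # f^n(x) = g(Mx + b)
print(h)  # negative pre-activations are clamped to 0
```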

Optimization Methods

Simplifying the Network

Convolutional Networks

A convolutional network simplifies some layers by using convolution instead of matrix multiplication (denoted with a star): f^n(x) = g(\theta^n \ast x)
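A minimal 1-D sketch of such a layer using NumPy's convolution, with g = ReLU (the filter values are placeholders standing in for learned parameters θⁿ):

```python
import numpy as np

theta = np.array([1.0, 0.0, -1.0])                 # small "learned" filter (placeholder)
x = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])  # 1-D input signal

z = np.convolve(x, theta, mode="valid")  # theta * x (np.convolve flips the kernel)
h = np.maximum(0.0, z)                   # g = ReLU
print(h)  # [2. 2. 0. 0. 0.]
```

Note that the same small filter slides across the whole input, which is exactly the parameter sharing that makes this cheaper than a full matrix multiplication.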

Pooling

A common layer used alongside convolutional layers is max pooling: f^n(x)_i = \max_{j \in G(i)} x_j
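A sketch of max pooling where each neighborhood G(i) is a non-overlapping window of size 2 (the window scheme is an assumed, typical choice):

```python
import numpy as np

def max_pool(x, size=2):
    # split x into non-overlapping windows of `size` and take the max of each
    return x.reshape(-1, size).max(axis=1)

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])
print(max_pool(x))  # [3. 5. 4.]
```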

Recurrent Networks

Recurrent networks use previous outputs as inputs, forming a recurrence: s^{(t)} = f(s^{(t-1)}, x^{(t)}; \theta)
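A minimal sketch of one such recurrence with a tanh cell, a common parameterization f(s, x; θ) = tanh(Ws + Ux + b) (the weight values here are placeholders):

```python
import numpy as np

def step(s, x, W, U, b):
    # s^(t) = f(s^(t-1), x^(t); theta) with a tanh cell
    return np.tanh(W @ s + U @ x + b)

W = np.eye(2) * 0.5      # state-to-state weights (placeholder)
U = np.ones((2, 1))      # input-to-state weights (placeholder)
b = np.zeros(2)

s = np.zeros(2)          # initial state s^(0)
for x_t in [np.array([1.0]), np.array([0.5]), np.array([-1.0])]:
    s = step(s, x_t, W, U, b)   # same parameters theta reused at every step
print(s.shape)
```

The key point the equation captures is that the same θ is applied at every time step, with the state s carrying information forward.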

Useful Data Sets

Autoencoders

An autoencoder is built from two functions: an encoder f from input space to representation space, and a decoder g back from representation space to input space. The objective is: J = L(x, g(f(x)))
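A minimal sketch of this objective with a linear encoder and decoder and a squared-error loss L (all of these concrete choices are illustrative assumptions):

```python
import numpy as np

def f(x, We):
    # encoder: input space -> 1-D code (representation space)
    return We @ x

def g(h, Wd):
    # decoder: representation space -> input space
    return Wd @ h

We = np.array([[0.5, 0.5]])      # placeholder encoder weights
Wd = np.array([[1.0], [1.0]])    # placeholder decoder weights
x = np.array([1.0, 1.0])

# J = L(x, g(f(x))), with L = squared error
J = np.sum((x - g(f(x, We), Wd)) ** 2)
print(J)  # 0.0: this particular x is reconstructed exactly
```

Training would adjust We and Wd to make J small on average over the data, forcing the code f(x) to retain the information needed to reconstruct x.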

Representation Learning

The idea is that instead of optimizing u = f(x; \theta), we optimize: u = r_o(f(r_i(x); \theta))
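A minimal sketch of this idea, where r_i standardizes the input and r_o maps the model's output back to the original scale (both representations, and the trivial model f, are illustrative assumptions):

```python
mu, sigma = 10.0, 2.0                  # assumed data statistics

r_i = lambda x: (x - mu) / sigma       # input representation: standardize
r_o = lambda u: u * sigma + mu         # output representation: undo the scaling
f = lambda z, theta: theta * z         # trivial stand-in model

x = 12.0
u = r_o(f(r_i(x), theta=1.0))
print(u)  # 12.0: an identity model round-trips through the representations
```

The model f now works in the (hopefully easier) representation space rather than the raw input space.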

Practical Advice

Appendix: Probability