Trask A.W. Grokking Deep Learning

  • djvu file
  • 2.91 MB in size
  • Added by a user
  • Description edited
Shelter Island: Manning, 2019. — 311 p.
About this Book
Welcome to Grokking Deep Learning
Why you should learn deep learning
Why you should read this book
What you need to get started
Fundamental Concepts
What is deep learning?
What is machine learning?
Supervised machine learning
Unsupervised machine learning
Parametric vs nonparametric learning
Supervised parametric learning
Unsupervised parametric learning
Nonparametric learning
Forward Propagation
Step 1: Predict
A simple neural network making a prediction
What is a neural network?
What does this neural network do?
Making a prediction with multiple inputs
Multiple inputs: What does this neural network do?
Multiple inputs: Complete runnable code
Making a prediction with multiple outputs
Predicting with multiple inputs and outputs
Multiple inputs and outputs: How does it work?
Predicting on predictions
A quick primer on NumPy
Gradient Descent
Compare
Learn
Compare: Does your network make good predictions?
Why measure error?
What’s the simplest form of neural learning?
Hot and cold learning
Characteristics of hot and cold learning
Calculating both direction and amount from error
One iteration of gradient descent
Learning is just reducing error
Let’s watch several steps of learning
Why does this work? What is weight_delta, really?
Tunnel vision on one concept
A box with rods poking out of it
Derivatives: Take two
What you don’t really need to know
How to use a derivative to learn
Look familiar?
Breaking gradient descent
Visualizing the overcorrections
Divergence
Introducing alpha
Alpha in code
Memorizing
Generalizing Gradient Descent
Gradient descent learning with multiple inputs
Gradient descent with multiple inputs explained
Let’s watch several steps of learning
Freezing one weight: What does it do?
Gradient descent learning with multiple outputs
Gradient descent with multiple inputs and outputs
What do these weights learn?
Visualizing weight values
Visualizing dot products (weighted sums)
Intro to Backpropagation
The streetlight problem
Preparing the data
Matrices and the matrix relationship
Creating a matrix or two in Python
Building a neural network
Learning the whole dataset
Full, batch, and stochastic gradient descent
Neural networks learn correlation
Up and down pressure
Edge case: Overfitting
Edge case: Conflicting pressure
Learning indirect correlation
Creating correlation
Stacking neural networks: A review
Backpropagation: Long-distance error attribution
Backpropagation: Why does this work?
Linear vs nonlinear
Why the neural network still doesn’t work
The secret to sometimes correlation
A quick break
Your first deep neural network
Backpropagation in code
One iteration of backpropagation
Putting it all together
Why do deep networks matter?
Picture NNs
It’s time to simplify
Correlation summarization
The previously overcomplicated visualization
The simplified visualization
Simplifying even further
Let’s see this network predict
Visualizing using letters instead of pictures
Linking the variables
Everything side by side
The importance of visualization tools
Intro to Regularization & Batching
Three-layer network on MNIST
Well, that was easy
Memorization vs generalization
Overfitting in neural networks
Where overfitting comes from
The simplest regularization: Early stopping
Industry standard regularization: Dropout
Why dropout works: Ensembling works
Dropout in code
Dropout evaluated on MNIST
Batch gradient descent
Activation Functions
What is an activation function?
Standard hidden-layer activation functions
Standard output layer activation functions
The core issue: Inputs have similarity
softmax computation
Activation installation instructions
Multiplying delta by the slope
Converting output to slope (derivative)
Upgrading the MNIST network
Intro to Convolutional NNs
Reusing weights in multiple places
The convolutional layer
A simple implementation in NumPy
NNs that understand Language
What does it mean to understand language?
Natural language processing (NLP)
Supervised NLP
IMDB movie reviews dataset
Capturing word correlation in input data
Predicting movie reviews
Intro to an embedding layer
Interpreting the output
Neural architecture
Comparing word embeddings
What is the meaning of a neuron?
Filling in the blank
Meaning is derived from loss
King – Man + Woman ~= Queen
Word analogies
Recurrent Layers for Variable-Length Data
The challenge of arbitrary length
Do comparisons really matter?
The surprising power of averaged word vectors
How is information stored in these embeddings?
How does a neural network use embeddings?
The limitations of bag-of-words vectors
Using identity vectors to sum word embeddings
Matrices that change absolutely nothing
Learning the transition matrices
Learning to create useful sentence vectors
Forward propagation in Python
How do you backpropagate into this?
Let’s train it!
Setting things up
Forward propagation with arbitrary length
Backpropagation with arbitrary length
Weight update with arbitrary length
Execution and output analysis
Deep Learning Framework
What is a deep learning framework?
Introduction to tensors
Introduction to automatic gradient computation (autograd)
A quick checkpoint
Tensors that are used multiple times
Upgrading autograd to support multiuse tensors
How does addition backpropagation work?
Adding support for negation
Adding support for additional functions
Using autograd to train a neural network
Adding automatic optimization
Adding support for layer types
Layers that contain layers
Loss-function layers
How to learn a framework
Nonlinearity layers
The embedding layer
Adding indexing to autograd
The embedding layer (revisited)
The cross-entropy layer
The recurrent neural network layer
Long Short-Term Memory
Character language modeling
The need for truncated backpropagation
Truncated backpropagation
A sample of the output
Vanishing and exploding gradients
A toy example of RNN backpropagation
Long short-term memory (LSTM) cells
Some intuition about LSTM gates
The long short-term memory layer
Upgrading the character language model
Training the LSTM character language model
Tuning the LSTM character language model
Federated Learning
The problem of privacy in deep learning
Federated learning
Learning to detect spam
Let’s make it federated
Hacking into federated learning
Secure aggregation
Homomorphic encryption
Homomorphically encrypted federated learning
Where to go from here