Neural Networks

Contents

Slide 2

Pachshenko Galina Nikolaevna
Associate Professor of Information System Department, Candidate of Technical Science

Slide 3

Week 4
Lecture 4

Slide 4

Topics

Single-layer neural networks
Multi-layer neural networks
Single perceptron
Multi-layer perceptron
Hebbian Learning Rule
Backpropagation
Delta rule
Weight adjustment
Cost Function
Classification
(Independent Work)

Slide 5

Single-layer neural networks

Slide 6

Multi-layer neural networks

Slide 7

Single perceptron

The perceptron computes a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then, possibly, passing the result through an activation function.

Slide 8

Single perceptron. Mathematically this can be written as
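
The formula itself appears on the slide as an image; a standard form, assuming inputs x_1, ..., x_n, weights w_1, ..., w_n, bias b, and activation function f, is:

y = f(w_1*x_1 + w_2*x_2 + ... + w_n*x_n + b)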

Slide 9

Single perceptron.

Slide 10

Task 1:
Write a program that finds the output of a single perceptron.
Note: use bias.
The bias shifts the decision boundary away from the origin and does not depend on any input value.
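
A minimal sketch for Task 1 in Python, assuming a Heaviside step activation; the weights, bias, and inputs are illustrative values only:

def step(x):
    # Heaviside step activation: 1 if the weighted sum is non-negative, else 0
    return 1 if x >= 0 else 0

def perceptron_output(inputs, weights, bias, activation=step):
    # Linear combination of inputs and weights, shifted by the bias, then activated
    u = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(u)

# Illustrative values
inputs = [1.0, 0.5]
weights = [0.4, -0.7]
bias = 0.1
print(perceptron_output(inputs, weights, bias))  # 1, since 0.4*1.0 - 0.7*0.5 + 0.1 = 0.15 >= 0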

Slide 11

Multilayer perceptron

A multilayer perceptron (MLP) is a class of feedforward artificial neural network.

Slide 12

Multilayer perceptron

Slide 13

Structure

• Nodes that are not the target of any connection are called input neurons.

Slide 14

• Nodes that are not the source of any connection are called output neurons.
An MLP can have more than one output neuron.
The number of output neurons depends on how the target values (desired values) of the training patterns are described.

Slide 15

• All nodes that are neither input neurons nor output neurons are called hidden neurons.
• All neurons can be organized in layers, with the set of input neurons forming the first layer.

Slide 16

Rosenblatt's original perceptron used a Heaviside step function as the activation function.
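
For reference (the function is not written out on the slide), the Heaviside step function can be taken as:

H(x) = 1 if x >= 0, and H(x) = 0 if x < 0

(conventions differ on the value assigned at x = 0).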

Slide 17

Nowadays, in multilayer networks, the activation function is often chosen to be the sigmoid function
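
The formula is shown on the slide as an image; the standard logistic sigmoid is:

σ(x) = 1 / (1 + e^(-x))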

Slide 18

or the hyperbolic tangent
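
Again the formula appears on the slide as an image; the hyperbolic tangent is:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))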

Slide 19

They are related by
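
The relation itself is an image on the slide; the usual identity connecting the two functions is:

tanh(x) = 2*σ(2x) - 1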

Slide 20

These functions are used because they are mathematically convenient.

Slide 21

An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function.

Slide 22

An MLP utilizes a supervised learning technique called backpropagation for training.

Slide 23

Hebbian Learning Rule
Delta rule
Backpropagation algorithm

Slide 24

Hebbian Learning Rule (Hebb's rule)

The Hebbian Learning Rule (1949) is a learning rule that specifies how much the weight of the connection between two units should be increased or decreased in proportion to the product of their activations.
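
In formula form (not given on the slide; assuming pre-synaptic activation x_i, post-synaptic activation y_j, and learning rate η), Hebb's rule can be written as:

Δw_ij = η * x_i * y_j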

Slide 25

Hebbian Learning Rule (Hebb's rule)

Slide 27

Delta rule (proposed in 1960)

Slide 28

The backpropagation algorithm was originally introduced in the 1970s, but its importance wasn't fully appreciated until a famous 1986 paper by David Rumelhart, Geoffrey Hinton, and Ronald Williams.

Slide 29

That paper describes several neural networks where backpropagation works far faster than earlier approaches to learning, making it possible to use neural nets to solve problems which had previously been insoluble.

Slide 30

Supervised Backpropagation – The mechanism of backward error transmission (delta learning rule) is used to modify the weights of the internal (hidden) and output layers.

Slide 31

Back propagation

The back propagation learning algorithm uses the delta-rule. What this does is compute the deltas (local gradients) of each neuron, starting from the output neurons and going backwards until it reaches the input layer.

Slide 32

The delta rule is derived by attempting to minimize the error in the output of the neural network through gradient descent.
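
For a single weight w_ji this gives the update below (the slide contains no formula here; d_j is the desired output, o_j the actual output, u_j the weighted input of neuron j, x_i the ith input, and η the learning rate):

Δw_ji = η * (d_j - o_j) * f'(u_j) * x_i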

Slide 33

To compute the deltas of the output neurons, though, we first have to get the error of each output neuron.

Slide 34

That's pretty simple: since the multi-layer perceptron is a supervised training network, the error is the difference between the network's output and the desired output.

e_j(n) = d_j(n) - o_j(n)

where e(n) is the error vector, d(n) is the desired output vector, and o(n) is the actual output vector.

Slide 35

Now to compute the deltas:

delta_j^(L)(n) = e_j^(L)(n) * f'(u_j^(L)(n))

for neuron j in the output layer L, where f'(u_j^(L)(n)) is the derivative of the activation function evaluated at the weighted input u_j^(L)(n) of the jth neuron of layer L.

Slide 36

The same formula:

Slide 37

Weight adjustment

Having calculated the deltas for all the neurons, we are now ready for the third and final pass of the network, this time to adjust the weights according to the generalized delta rule:
Slide 38

Weight adjustment

Slide 40

Note: For the sigmoid activation function, the derivative of the function is:

S'(x) = S(x) * (1 - S(x))
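
A quick check of this identity, writing S(x) = 1 / (1 + e^(-x)):

S'(x) = e^(-x) / (1 + e^(-x))^2 = [1 / (1 + e^(-x))] * [e^(-x) / (1 + e^(-x))] = S(x) * (1 - S(x))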

Slide 42

Cost Function

We need a function of the parameters that we can minimize over our dataset. One common function that is often used is the mean squared error.
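
One common form, assuming N training samples with desired outputs d_i and network outputs o_i (some texts include an extra factor of 1/2 to simplify the derivative), is:

MSE = (1/N) * Σ_{i=1..N} (d_i - o_i)^2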

Slide 43

Squared Error: which we can minimize using gradient descent.

A cost function is something you want to minimize. For example, your cost function might be the sum of squared errors over your training set. Gradient descent is a method for finding the minimum of a function of multiple variables, so you can use gradient descent to minimize your cost function.
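
The basic gradient descent step for each weight w, assuming cost function E and learning rate η, is:

w := w - η * ∂E/∂w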

Slide 44

Back-propagation is a gradient descent over the entire network's weight vectors. In practice, it often works well and can be run multiple times. It minimizes the error over all training samples.

Slide 45

Task 2:
Write a program that can update the weights of a neural network using backpropagation.
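
A minimal sketch for Task 2 in Python, assuming one hidden layer, sigmoid activations in every neuron, the error e_j = d_j - o_j and the deltas from the earlier slides, and the generalized delta rule for the weight update; the layer sizes, learning rate, and training pair are illustrative only:

import random
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights, biases):
    # Activations of every layer, the input layer included
    activations = [list(inputs)]
    for W, b in zip(weights, biases):
        prev = activations[-1]
        layer = [sigmoid(sum(w * a for w, a in zip(row, prev)) + bi)
                 for row, bi in zip(W, b)]
        activations.append(layer)
    return activations

def backprop_update(inputs, targets, weights, biases, eta=0.5):
    activations = forward(inputs, weights, biases)
    outputs = activations[-1]
    # Output-layer deltas: delta_j = e_j * f'(u_j), with f'(u) = o * (1 - o) for the sigmoid
    deltas = [(d - o) * o * (1 - o) for d, o in zip(targets, outputs)]
    # Walk backwards through the layers (backward error transmission)
    for l in reversed(range(len(weights))):
        prev = activations[l]
        if l > 0:
            # Deltas of the previous layer must be computed with the weights *before* they change
            new_deltas = [sum(weights[l][j][i] * deltas[j] for j in range(len(deltas)))
                          * prev[i] * (1 - prev[i])
                          for i in range(len(prev))]
        # Generalized delta rule: w_ji <- w_ji + eta * delta_j * y_i (bias treated as an extra weight)
        for j, delta in enumerate(deltas):
            for i, y in enumerate(prev):
                weights[l][j][i] += eta * delta * y
            biases[l][j] += eta * delta
        if l > 0:
            deltas = new_deltas

# Illustrative 2-2-1 network; sizes, learning rate, and training pair are made up for the example
random.seed(0)
weights = [[[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)],  # hidden layer (2 neurons)
           [[random.uniform(-1, 1) for _ in range(2)] for _ in range(1)]]  # output layer (1 neuron)
biases = [[0.0, 0.0], [0.0]]
print(forward([0.0, 1.0], weights, biases)[-1])   # output before the update
backprop_update([0.0, 1.0], [1.0], weights, biases)
print(forward([0.0, 1.0], weights, biases)[-1])   # output after one backpropagation step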