Sigmoid (prime) unclear (Beginner): neural networks

Lumor (New member, joined Oct 2, 2018, 3 messages)
First of all, I'm not really good at math, but I have a programming background. (Also, because of my ignorance, I'm not sure if I put this in the right category.)
I have an interest in neural networks and I'm trying to learn about them, but that requires some math. I'm a bit stuck at the guide I'm following.

The guide in question:
https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/


So i figured out how to calculate a sigmoid function.

1 / (1+e ^x)
1 / (1+2.718 ^x)

(I'm not sure if ^ is the right notation, srry)

For x=1.3 the answer would be 0.78583498304. So that's great... one step closer.


Where i'm stuck is at the Back Propagation section (Halfway down the page)
He gives this as the math :
Delta output sum = S'(sum) * (output sum margin of error)
Delta output sum = S'(1.235) * (-0.77)
Delta output sum = -0.13439890643886018


1.235 * -0.77 is not -0.13439890643886018, so clearly the S' has some kind of hidden meaning, but what?
Did he leave out part of the working, or am I just stupid/missing something?
Searching on Google for "Sigmoid Prime" just overwhelms me with all kinds of maths.


Like i said, i'm really not good at maths but i want to learn more about it. So if anyone wants to help, use simple explanations please :p Thanks in advance!
 
1 / (1+e ^x) <-- No, it's 1/(1 + e^-x)
(I'm not sure if ^ is the right notation, srry) <-- Notation is fine

For x=1.3 the answer would be 0.78583498304. So that's great... one step closer. <-- Here you did use e^-x.
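Since you have a programming background, here is a quick way to check this yourself. It is only a minimal Python sketch; the function names `sigmoid` and `wrong_sigmoid` are made up for illustration and do not come from the guide.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def wrong_sigmoid(x):
    """The formula as first typed, with e^x instead of e^-x."""
    return 1.0 / (1.0 + math.exp(x))

print(sigmoid(1.3))        # about 0.7858, matching the value in the post
print(wrong_sigmoid(1.3))  # about 0.2142, so the posted number came from e^-x
```

The two results add up to 1, which is one way to spot a sign mistake in the exponent.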


Properly speaking, this should go under calculus; they mention it near the top: "The only prerequisites are having a basic understanding of JavaScript, high-school Calculus, and simple matrix operations. Other than that, you don’t need to know anything."

When they write S', they are referring to the derivative of the function S. Do you recall what that means? They also call it "dsum/dresult", which is the other notation for a derivative. It does seem odd that they never actually mentioned taking the derivative of S in particular, or showed what it is.

Do you need help finding the derivative?
 
Ah i see, looks like i skimmed that part ^^

Yes, if you could show me what he did to get to -0.1344 that would be appreciated.


So if i'm getting this correctly: sigmoid prime = S' = derivative of sigmoid?
 
Yeah that's right.

\(\displaystyle S=\frac{1}{1+e^{-x}} = (1 + e^{-x})^{-1}\)

So \(\displaystyle S' = -1*(1+e^{-x})^{-2}*(-1)e^{-x}\) ….. using the Chain Rule

So \(\displaystyle S' = \frac{1}{e^x * (1+e^{-x})^2}\)

Substitute x = 1.235 into that, and then multiply by -0.77 to get the required result.
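If you would rather check it in code, here is a rough Python sketch of that substitution. It uses x = 1.235 and -0.77 from the guide's line "S'(1.235) * (-0.77)"; the names `sigmoid_prime`, `output_sum` and `margin_of_error` are just labels chosen for this example, not the guide's own code.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Derivative of the sigmoid: 1 / (e^x * (1 + e^-x)^2)."""
    return 1.0 / (math.exp(x) * (1.0 + math.exp(-x)) ** 2)

output_sum = 1.235       # argument of S' in the guide's example
margin_of_error = -0.77  # "output sum margin of error" in the guide's example

delta_output_sum = sigmoid_prime(output_sum) * margin_of_error
print(delta_output_sum)  # about -0.1344
```

The same derivative can also be written as S(x) * (1 - S(x)), which is the form you will often see in neural network code because it reuses the sigmoid value that was already computed.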
 

Thank you, that works perfectly. I should probably read about that chain rule to understand what's actually happening.
 
The chain rule concerns differentiating composite functions (i.e., functions whose input is actually another function's output).

When two functions are composed, one function is "inside" the other. The function you posted (rewritten by Harrycat) has 1+e^(-x) as the inner function and (…)^(-1) as the outer function. In other words:

f(x) = 1 + e^(-x)

S(x) = [ f(x) ]^(-1)

The derivative of function S is the product of the outer derivative times the inner derivative. In the image below, the chain rule is shown symbolically, in two ways. The first way shows an example where the outer function is a power function x^n. That matches your situation with S(x) above, where n=-1. :cool:

[Attached image: chain.JPG]
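For a programmer, it may help to see the "outer derivative times inner derivative" recipe as code. This is only an illustrative Python sketch: it takes the inner function as 1 + e^(-x), which is what the sigmoid formula 1/(1 + e^-x) gives, and all function names are invented for the example. The finite-difference function is not part of the chain rule; it is there only as an independent sanity check.

```python
import math

def inner(x):
    """Inner function f(x) = 1 + e^-x."""
    return 1.0 + math.exp(-x)

def inner_prime(x):
    """Derivative of the inner function: f'(x) = -e^-x."""
    return -math.exp(-x)

def outer_prime(u, n=-1):
    """Derivative of the power function u^n with respect to u: n * u^(n-1)."""
    return n * u ** (n - 1)

def sigmoid_prime_chain(x):
    """Chain rule: outer derivative evaluated at f(x), times inner derivative."""
    return outer_prime(inner(x)) * inner_prime(x)

def sigmoid_prime_numeric(x, h=1e-6):
    """Central finite difference of S(x) = 1 / f(x), used only as a check."""
    return (1.0 / inner(x + h) - 1.0 / inner(x - h)) / (2.0 * h)

x = 1.235
print(sigmoid_prime_chain(x))    # chain-rule result
print(sigmoid_prime_numeric(x))  # should agree to several decimal places
```

The two minus signs (one from n = -1, one from the derivative of e^-x) cancel, which is why the final derivative is positive.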
 