Sigmoid (prime) unclear (Beginner): neural networks

Lumor · Oct 2, 2018

First of all, I'm not really good at math, but I have a programming background. (Also, because of my ignorance, I'm not sure if I put this in the right category.)
I'm interested in neural networks and am trying to learn about them, but that requires some math. I'm a bit stuck on the guide I'm following.

The guide in question :
https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

So I figured out how to calculate a sigmoid function.

1 / (1+e ^x)
1 / (1+2.718 ^x)

(I'm not sure if ^ is the right notation, sorry)

For x=1.3 the answer would be 0.78583498304. So that's great... one step closer.

Where I'm stuck is the Back Propagation section (halfway down the page).
He gives this as the math:
Delta output sum = S'(sum) * (output sum margin of error)
Delta output sum = S'(1.235) * (-0.77)
Delta output sum = -0.13439890643886018

1.235 * -0.77 is not -0.13439890643886018, so clearly the S' has some kind of hidden meaning, but what?
Did he write an incomplete working of the problem, or am I just stupid/missing something?
Searching on Google for Sigmoid Prime just overwhelms me with all kinds of maths.

Like I said, I'm really not good at maths, but I want to learn more about it. So if anyone wants to help, use simple explanations please.

Thanks in advance!

Dr.Peterson · Oct 2, 2018

Lumor said:
First of all, I'm not really good at math, but I have a programming background. (Also, because of my ignorance, I'm not sure if I put this in the right category.)
I'm interested in neural networks and am trying to learn about them, but that requires some math. I'm a bit stuck on the guide I'm following.

The guide in question :
https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

So I figured out how to calculate a sigmoid function.

1 / (1+e ^x) <-- No, it's 1/(1 + e^-x)
1 / (1+2.718 ^x)

(I'm not sure if ^ is the right notation, sorry) <-- Notation is fine

For x=1.3 the answer would be 0.78583498304. So that's great... one step closer. <-- Here you did use e^-x.

Where I'm stuck is the Back Propagation section (halfway down the page).
He gives this as the math:
Delta output sum = S'(sum) * (output sum margin of error)
Delta output sum = S'(1.235) * (-0.77)
Delta output sum = -0.13439890643886018

1.235 * -0.77 is not -0.13439890643886018, so clearly the S' has some kind of hidden meaning, but what?
Did he write an incomplete working of the problem, or am I just stupid/missing something?
Searching on Google for Sigmoid Prime just overwhelms me with all kinds of maths.

Like I said, I'm really not good at maths, but I want to learn more about it. So if anyone wants to help, use simple explanations please. Thanks in advance!

Properly speaking, this should go under calculus; they mention it near the top: "The only prerequisites are having a basic understanding of JavaScript, high-school Calculus, and simple matrix operations. Other than that, you don’t need to know anything."

When they write S', they are referring to the derivative of the function S. Do you recall what that means? They also call it "dsum/dresult", which is the other notation for a derivative. It does seem odd that they never actually mentioned taking the derivative of S in particular, or showed what it is.

Do you need help finding the derivative?
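
For readers coming from programming, here is a minimal Python sketch of the two ideas so far: the sigmoid with e^-x in the denominator, and the derivative S' as the slope of S at a point. The function names and the step size h are illustrative choices, not anything taken from the guide.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x (h is an arbitrary small step)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

print(sigmoid(1.3))                          # about 0.7858, the value quoted in the first post
print(1.0 / (1.0 + math.exp(1.3)))           # about 0.2142, what 1/(1+e^x) would give instead
print(numerical_derivative(sigmoid, 1.235))  # about 0.1745, an estimate of S'(1.235)
```

The slope estimate is only an approximation; an exact formula for S' needs the derivative worked out by hand.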

Lumor · Oct 2, 2018

Dr.Peterson said:
Properly speaking, this should go under calculus; they mention it near the top: "The only prerequisites are having a basic understanding of JavaScript, high-school Calculus, and simple matrix operations. Other than that, you don’t need to know anything."

Ah, I see. Looks like I skimmed that part ^^

Dr.Peterson said:
When they write S', they are referring to the derivative of the function S. Do you recall what that means? They also call it "dsum/dresult", which is the other notation for a derivative. It does seem odd that they never actually mentioned taking the derivative of S in particular, or showed what it is.

Do you need help finding the derivative?

Yes, if you could show me what he did to get to -0.1344, that would be appreciated.

So if I'm getting this correctly: sigmoid prime = S' = derivative of sigmoid?

Harry_the_cat · Oct 3, 2018

Lumor said:
Ah, I see. Looks like I skimmed that part ^^

Yes, if you could show me what he did to get to -0.1344, that would be appreciated.

So if I'm getting this correctly: sigmoid prime = S' = derivative of sigmoid?

Yeah that's right.

$\displaystyle S=\frac{1}{1+e^{-x}} = (1 + e^{-x})^{-1}$

So $\displaystyle S' = -1*(1+e^{-x})^{-2}*(-1)e^{-x}$, using the Chain Rule.

So $\displaystyle S' = \frac{1}{e^x * (1+e^{-x})^2}$

Substitute x = 1.235 into that, and then multiply by -0.77 to get the required result.
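
As a rough check of that substitution, the following Python sketch evaluates the closed form for S' at the output sum 1.235 from the guide's line S'(1.235) * (-0.77). The variable names are made up for illustration; only the two numbers come from the guide.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    """Derivative of the sigmoid in the form 1 / (e^x * (1 + e^(-x))^2)."""
    return 1.0 / (math.exp(x) * (1.0 + math.exp(-x)) ** 2)

output_sum = 1.235       # the "sum" value from the guide
margin_of_error = -0.77  # the "output sum margin of error" from the guide

delta_output_sum = sigmoid_prime(output_sum) * margin_of_error
print(delta_output_sum)  # about -0.1344

# The same derivative is often written as S(x) * (1 - S(x)); both forms agree.
print(sigmoid(output_sum) * (1.0 - sigmoid(output_sum)))  # about 0.1745
```

The S(x) * (1 - S(x)) form is algebraically identical to the fraction above and is convenient in code because it reuses a sigmoid value that has usually been computed already.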

stapel · Oct 3, 2018

Dr.Peterson said:
Properly speaking, this should go under calculus...

Moved.

Lumor · Oct 4, 2018

Harry_the_cat said:
Yeah that's right.

$\displaystyle S=\frac{1}{1+e^{-x}} = (1 + e^{-x})^{-1}$

So $\displaystyle S' = -1*(1+e^{-x})^{-2}*(-1)e^{-x}$, using the Chain Rule.

So $\displaystyle S' = \frac{1}{e^x * (1+e^{-x})^2}$

Substitute x = 1.235 into that, and then multiply by -0.77 to get the required result.

Thank you, that works perfectly. I should probably read about that chain rule to understand what's actually happening.

mmm4444bot · Oct 4, 2018

Lumor said:
… I should probably read about that chain rule to understand what's actually happening.

The chain rule concerns differentiating composite functions (i.e., functions whose input is actually another function's output).

When two functions are composed, one function is "inside" the other. The function you posted (rewritten by Harry_the_cat) has 1+e^(-x) as the inner function and (…)^(-1) as the outer function. In other words:

f(x) = 1 + e^(-x)

S(x) = [ f(x) ]^(-1)

The derivative of function S is the product of the outer derivative times the inner derivative.

[Image: the chain rule shown symbolically, in two ways.]

The first way shows an example where the outer function is a power function x^n. That matches your situation with S(x) above, where n = -1.
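
Written out as a worked sketch, taking the inner function to be the denominator of S from Harry_the_cat's post, $f(x) = 1 + e^{-x}$, the power-function case of the chain rule gives:

$\displaystyle \frac{d}{dx}\,[f(x)]^{n} = n\,[f(x)]^{n-1} \cdot f'(x)$

$\displaystyle f'(x) = -e^{-x}$

$\displaystyle S'(x) = (-1)\,(1+e^{-x})^{-2} \cdot (-e^{-x}) = \frac{e^{-x}}{(1+e^{-x})^{2}} = \frac{1}{e^{x}\,(1+e^{-x})^{2}}$

The two minus signs cancel, which is why the derivative is positive everywhere. The last form is the one given earlier in the thread.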
