Activation function in CNN

Jeterion85 · Oct 5, 2021

I know that using activation functions in a CNN helps to solve the linearity problem, and while I can understand this for simple perceptrons, I can't quite see how it solves the linearity problem in a CNN (especially when we apply max pooling). Can anyone please give me a mathematical example of how linearity carries over to the next convolutional layer if I don't apply an activation function?

Thank you in advance!

blamocur · Nov 2, 2021

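A convolution layer in a CNN is just a special case of a regular "fully connected" linear layer: one whose weight matrix is sparse and shares the same weights across positions. This means that using convolution layers does not eliminate the need for non-linearities, a.k.a. activation layers. Without an activation in between, two stacked convolution layers compose into a single linear map, exactly as two stacked fully connected layers do.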
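To make this concrete, here is a minimal sketch in plain Python, assuming 1D signals, stride 1, no padding and no bias. The helper names (`conv1d`, `combine_kernels`, `relu`) and the numbers are made up for illustration. It shows that two convolution layers with nothing in between give the same output as one convolution with a merged kernel, and that putting a ReLU between them breaks this equivalence.

```python
def conv1d(signal, kernel):
    """'Valid' 1D cross-correlation (what CNN layers compute): no padding, stride 1."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def combine_kernels(k1, k2):
    """Merge two kernels into one by full convolution (polynomial multiplication)."""
    out = [0] * (len(k1) + len(k2) - 1)
    for i, a in enumerate(k1):
        for j, b in enumerate(k2):
            out[i + j] += a * b
    return out


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0, v) for v in values]


# Illustrative (hypothetical) input and kernels; integers keep the comparison exact.
x = [1, 2, -1, 0, 3, 5, -2]
k1 = [1, -1, 2]
k2 = [2, 0, 1]

# Two conv layers with no activation in between ...
two_layers = conv1d(conv1d(x, k1), k2)

# ... equal ONE conv layer with the merged 5-tap kernel.
one_layer = conv1d(x, combine_kernels(k1, k2))

# With a ReLU in between, the result no longer matches that single layer.
with_relu = conv1d(relu(conv1d(x, k1)), k2)

print(two_layers, one_layer, with_relu)
```

So without an activation the second layer adds depth but no expressive power: the stack is still one linear filter, only with a larger receptive field.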

While max pooling is not a linear operation, it is "kind of linear": sums are not mapped to sums, but multiplication by a non-negative scalar produces a correspondingly multiplied output ($P(\alpha x) = \alpha P(x)$, where $P$ is the max pooling function, $x$ is its input and $\alpha \ge 0$ is a scalar; for negative $\alpha$ the maximum turns into a minimum, so the identity fails). You can say that max pooling "isn't non-linear enough" to warrant elimination of standard non-linearities.
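A small numeric check of these properties, again as a sketch in plain Python. It assumes non-overlapping 1D max pooling with window size 2; the helper name `max_pool1d` and the sample vectors are hypothetical.

```python
def max_pool1d(x, size=2):
    """Non-overlapping 1D max pooling: window and stride are both `size`."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


x = [1, -3, 2, 5]
y = [-4, 2, 3, -6]

# Additivity fails: P(x + y) differs from P(x) + P(y).
pool_of_sum = max_pool1d([a + b for a, b in zip(x, y)])
sum_of_pools = [a + b for a, b in zip(max_pool1d(x), max_pool1d(y))]

# Scaling by a non-negative factor passes through: P(3x) == 3 P(x).
pool_of_scaled = max_pool1d([3 * a for a in x])
scaled_pool = [3 * a for a in max_pool1d(x)]

# Scaling by a negative factor does not: P(-x) differs from -P(x).
pool_of_negated = max_pool1d([-a for a in x])
negated_pool = [-a for a in max_pool1d(x)]

print(pool_of_sum, sum_of_pools)
print(pool_of_scaled, scaled_pool)
print(pool_of_negated, negated_pool)
```

Max pooling is therefore non-linear, but only mildly so: it is piecewise linear and commutes with non-negative scaling, which is why it is normally used alongside an activation such as ReLU rather than instead of one.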

