Activation function in CNN

Jeterion85:
I know that using activation functions in a CNN helps solve the linearity problem, and while I can understand it in simple perceptrons, I can't quite get how it solves the linearity problem in a CNN (especially when we apply max pooling). Can anyone please give me a mathematical example of how linearity is preserved into the next convolutional layer if I don't apply an activation function?

Thank you in advance!
 
A convolution layer in a CNN is just a special case of a regular "fully connected" linear layer. That means having convolution layers does not eliminate the need for non-linearities, a.k.a. activation layers: a stack of convolutions with no activations in between is still a single linear map of the input.
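As a quick sanity check (not from the original reply, just a minimal numpy sketch), here is a demonstration that two convolutions applied back to back with no activation in between are equivalent to a single convolution with a combined kernel, i.e. the stack stays linear:

```python
import numpy as np

# Two stacked convolution "layers" with no activation in between.
x = np.array([1.0, 2.0, -1.0, 3.0])   # input signal
k1 = np.array([0.5, -1.0])            # kernel of layer 1
k2 = np.array([2.0, 1.0, 0.5])        # kernel of layer 2

# Applying the layers one after another...
stacked = np.convolve(np.convolve(x, k1), k2)

# ...gives the same result as one convolution with the combined
# kernel k1 * k2, because convolution is associative:
combined = np.convolve(x, np.convolve(k1, k2))

print(np.allclose(stacked, combined))  # True
```

Ignoring padding and stride details, the same collapse happens with 2-D convolutions in a real CNN: without activations, any depth of convolutional layers reduces to one equivalent linear layer.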

While max pooling is not a linear operation, it is "kind of linear": sums are not mapped to sums, but multiplication by a non-negative scalar produces a multiplied output (P(αx) = αP(x), where P is the max pooling function, x is its input, and α ≥ 0 is a scalar; for negative α the identity breaks, since the max flips into a min). You can say that max pooling "isn't non-linear enough" to warrant eliminating the standard non-linearities.
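To make the "kind of linear" point concrete, here is a small sketch (again not from the original post; max_pool is a hypothetical helper for non-overlapping 1-D pooling) showing that scaling commutes with max pooling while addition does not:

```python
import numpy as np

def max_pool(x, size=2):
    """Non-overlapping 1-D max pooling (hypothetical helper for illustration)."""
    return x.reshape(-1, size).max(axis=1)

x = np.array([1.0, 4.0, -2.0, 3.0])
y = np.array([5.0, 0.0, 1.0, -1.0])
a = 3.0  # non-negative scalar

# Homogeneity holds: P(a*x) == a*P(x)
print(np.allclose(max_pool(a * x), a * max_pool(x)))                # True

# Additivity fails: P(x + y) != P(x) + P(y) in general
print(np.allclose(max_pool(x + y), max_pool(x) + max_pool(y)))      # False
```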
 