
Simple RNN and activation function  


meenu
(@meenu)
New Member
Joined: 3 months ago
Posts: 1
17/12/2018 10:34 pm  

Why is tanh generally used as the activation function in a simple RNN?

Is the reason related to handling the vanishing gradient problem?


cl4ptp
(@cl4ptp)
New Member
Joined: 5 months ago
Posts: 4
09/01/2019 7:10 pm  

The shape of tanh is not very different from that of the sigmoid, and in the large-|x| regime its gradient becomes tiny compared with ReLU.

In the context of an RNN, I think the fact that tanh is an odd, zero-centered function could be a desirable property. A quick numerical sketch of this is below.
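Here is a minimal NumPy sketch (not from the original post, just an illustration) that evaluates the derivatives of tanh, sigmoid, and ReLU at a few input values, to show how the first two saturate at large |x| while ReLU's gradient stays at 1:

```python
import numpy as np

x = np.array([0.0, 2.0, 5.0, 10.0])

# Derivatives of each activation, evaluated element-wise.
tanh_grad = 1.0 - np.tanh(x) ** 2           # d/dx tanh(x)
sigmoid = 1.0 / (1.0 + np.exp(-x))
sigmoid_grad = sigmoid * (1.0 - sigmoid)    # d/dx sigmoid(x)
relu_grad = (x > 0).astype(float)           # d/dx ReLU(x): 1 for x > 0, else 0

for xi, tg, sg, rg in zip(x, tanh_grad, sigmoid_grad, relu_grad):
    print(f"x={xi:5.1f}  tanh'={tg:.5f}  sigmoid'={sg:.5f}  relu'={rg:.1f}")
```

So tanh does not avoid saturation, but its outputs lie in (-1, 1) and are centered at zero, which keeps the hidden state bounded and zero-centered across time steps, unlike the sigmoid's (0, 1) range.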

