Exploring why neurons go dead
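Dead neurons are basically neurons whose output is stuck at 0 for every input, so they stop responding to gradient updates. For the unaware, the updates in question are those applied to the learnable parameters (weights and biases) of a neural network during training. The problem arises when a neuron's gradient saturates at zero, as happens with a ReLU unit whose pre-activation is negative for all inputs: every update to its parameters is then zero, so the neuron never changes and cannot recover on its own.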
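To make this concrete, here is a minimal sketch of a single ReLU neuron trained with a squared-error loss. The toy data, the function names (`neuron_gradients`, `sgd_step`) and the chosen weights and biases are all assumptions made for illustration. It only aims to show that when the pre-activation is negative for every input, both the weight and bias gradients come out as zero, so a gradient step changes nothing.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def relu_grad(z):
    # Common convention: the derivative of ReLU is taken as 0 for z <= 0.
    return 1.0 if z > 0.0 else 0.0


def neuron_gradients(w, b, inputs, targets):
    """Mean gradients of L = 0.5 * (y - t)^2 for one neuron y = relu(w.x + b)."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    n = len(inputs)
    for x, t in zip(inputs, targets):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        y = relu(z)
        # Chain rule: dL/dz = dL/dy * dy/dz
        dz = (y - t) * relu_grad(z)
        for i, xi in enumerate(x):
            grad_w[i] += dz * xi / n
        grad_b += dz / n
    return grad_w, grad_b


def sgd_step(w, b, grad_w, grad_b, lr=0.1):
    new_w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
    return new_w, b - lr * grad_b


# Hypothetical toy data: inputs in [0, 1], positive targets.
inputs = [[0.2, 0.9], [0.5, 0.1], [1.0, 1.0], [0.0, 0.3]]
targets = [1.0, 0.5, 2.0, 0.3]

# A dead neuron: the bias is so negative that z < 0 for every input.
dead_w, dead_b = [0.5, 0.5], -5.0
dead_grads = neuron_gradients(dead_w, dead_b, inputs, targets)

# A live neuron: same weights, but a small positive bias.
live_w, live_b = [0.5, 0.5], 0.1
live_grads = neuron_gradients(live_w, live_b, inputs, targets)

print("dead neuron gradients:", dead_grads)
print("live neuron gradients:", live_grads)
print("dead neuron after one step:", sgd_step(dead_w, dead_b, *dead_grads))
```

The dead neuron still has a large error on every example, yet none of that error reaches its parameters, because the ReLU derivative multiplies it by zero.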
To mention: gradient overflow, gradient underflow, and incorrect initialization of weights.
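As a rough illustration of the weight-initialization point, the sketch below counts how many ReLU neurons in a layer are dead right after initialization, that is, have a non-positive pre-activation for every sample in a batch. Everything here is assumed for the example: the `dead_fraction` helper, the layer sizes, the non-negative random inputs (as would come out of a previous ReLU layer), the absence of a bias term, and the two initialization schemes being compared.

```python
import random


def dead_fraction(weight_mean, weight_std, n_neurons=100, n_inputs=4,
                  n_samples=50, seed=0):
    """Fraction of ReLU neurons whose pre-activation is <= 0 on every sample."""
    rng = random.Random(seed)
    # Hypothetical batch of non-negative inputs in [0, 1).
    samples = [[rng.random() for _ in range(n_inputs)]
               for _ in range(n_samples)]
    dead = 0
    for _ in range(n_neurons):
        # Gaussian weight initialization; no bias term in this sketch.
        w = [rng.gauss(weight_mean, weight_std) for _ in range(n_inputs)]
        if all(sum(wi * xi for wi, xi in zip(w, x)) <= 0.0 for x in samples):
            dead += 1
    return dead / n_neurons


# Weights centred well below zero: with non-negative inputs, almost every
# pre-activation is negative, so the neurons start out dead.
bad_init = dead_fraction(weight_mean=-1.0, weight_std=0.1)

# Zero-centred weights: most neurons fire for at least some inputs.
zero_centred = dead_fraction(weight_mean=0.0, weight_std=0.1)

print("dead fraction, negative-mean init:", bad_init)
print("dead fraction, zero-centred init:", zero_centred)
```

A neuron that starts out dead in this sense receives zero gradient from the first step, so training alone will not bring it back.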
How can we mitigate neurons going dead?
to be updated