What one thing helped ResNet architectures avoid the vanishing gradient problem?
Anonymous
As an engineer, I don't often deal with problems at that level; I can figure it out if it becomes an issue. Generally, when I retrain a neural network, I run into problems other than vanishing gradients.