Hello,

I was wondering why we don't use a convex activation function (whose compositions would also be convex). Wouldn't we then have a unique minimizer and then, boom, profit?

Thanks for clarifying :)

This might help: https://stats.stackexchange.com/questions/106334/cost-function-of-neural-network-is-non-convex
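
To make the point concrete, here is a minimal sketch (my own toy example, not taken from the linked answer). What matters for training is convexity of the loss in the *weights*, not of the activation in its input. The network `f(x) = w2 * relu(w1 * x)`, the single data point `x = 1, y = 1`, and the two weight settings `a` and `b` are all assumptions picked for illustration. ReLU is convex, yet the squared-error loss at the midpoint of two zero-loss weight settings is strictly positive, which a convex function would not allow.

```python
def relu(z):
    # ReLU is a convex function of its input z.
    return max(0.0, z)


def loss(w1, w2, x=1.0, y=1.0):
    """Squared error of the toy network f(x) = w2 * relu(w1 * x)."""
    return (w2 * relu(w1 * x) - y) ** 2


# Two hypothetical weight settings that both fit the data point exactly.
a = (1.0, 1.0)
b = (2.0, 0.5)

# Midpoint of a and b in weight space.
mid = tuple((p + q) / 2 for p, q in zip(a, b))

loss_a, loss_b, loss_mid = loss(*a), loss(*b), loss(*mid)

# For a convex loss, loss_mid could never exceed this average.
chord = (loss_a + loss_b) / 2

print("loss at a:  ", loss_a)
print("loss at b:  ", loss_b)
print("loss at mid:", loss_mid)
print("violates convexity:", loss_mid > chord)
```

Both weights here stay positive, so ReLU acts like the identity and the non-convexity comes purely from the product `w1 * w2`. Also note that a composition of convex functions is not convex in general: the outer function has to be convex and nondecreasing for that to hold.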

## Convexity and NN

