
Theorem from Barron's paper

Hello,
In the lesson on neural nets and representation power, I don't understand why the lemma on page 3 applies only to activation functions that are "sigmoid-like" and not to others.
Thank you for your answer.

Hi,

Barron's approximation result says that certain functions can be approximated by a one-hidden-layer NN with sigmoid-like activation functions (see the paper: http://www.stat.yale.edu/~arb4/publications_files/UniversalApproximationBoundsForSuperpositionsOfASigmoidalFunction.pdf).
The reason it requires sigmoid-like functions comes from the proof:

  • first, you approximate your function by a sum of rectangles (scaled indicator functions);
  • then you approximate each rectangle by the difference of two sharp sigmoids.

You can see that the particular shape of the sigmoid is important here: you need your activation function to be able to approximate a step function.
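To make the rectangle step concrete, here is a minimal NumPy sketch (the function names and the sharpness parameter `k` are illustrative, not from Barron's paper): subtracting two sigmoids shifted to the rectangle's edges, with a large slope `k`, approximates the indicator of an interval.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rectangle(x, a, b, k):
    # Two sharp sigmoids with opposite signs: close to 1 on [a, b],
    # close to 0 outside, with a transition region of width ~1/k.
    return sigmoid(k * (x - a)) - sigmoid(k * (x - b))

x = np.linspace(-2.0, 2.0, 401)
approx = rectangle(x, a=-0.5, b=0.5, k=50.0)
target = ((x >= -0.5) & (x <= 0.5)).astype(float)

# Away from the two jumps, the approximation error is tiny
away_from_jumps = np.abs(np.abs(x) - 0.5) > 0.1
print(np.max(np.abs(approx - target)[away_from_jumps]))
```

Increasing `k` shrinks the transition region, which is why the sigmoid's step-like shape is exactly what the proof needs.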

Note, however, that other approximation results exist for NNs with other activation functions. For example, we saw one about pointwise approximation using ReLU NNs at the end of the lecture.
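As a small illustration of the ReLU case (a sketch of the general idea, not necessarily the lecture's construction): a one-hidden-layer ReLU network can represent any piecewise-linear interpolant, so interpolating a smooth function like x² on a grid already gives a good approximation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

f = lambda x: x ** 2
knots = np.linspace(0.0, 1.0, 11)          # interpolation nodes
vals = f(knots)
slopes = np.diff(vals) / np.diff(knots)    # slope on each segment
# The ReLU weights are the slope of the first segment, then the
# slope *changes* at each interior knot.
weights = np.concatenate([[slopes[0]], np.diff(slopes)])

def relu_net(x):
    # One-hidden-layer ReLU net == piecewise-linear interpolant of f
    return vals[0] + sum(w * relu(x - k) for w, k in zip(weights, knots[:-1]))

x = np.linspace(0.0, 1.0, 1001)
err = np.max(np.abs(relu_net(x) - f(x)))
print(err)
```

With 10 segments the worst-case error for x² is h²/8 = 0.00125, and refining the grid shrinks it quadratically; this is the pointwise flavor of approximation, as opposed to Barron's dimension-free L² bound.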

Best,
Nicolas

