## Question Expectation

Hello,

I have trouble understanding the difference between the uses of conditional expectation and plain expectation in certain proofs.

For instance, in the 2019 exam we have:

Q 22 \( \mathbb{E}[g_t|x_t] = \sum^n_{i=1} p_i g_t \)

Whereas in Q 24 we have \( \mathbb{E}[\| g_t \|^2] = \sum^n_{i=1} p_i \| g_t \|^2 \)

It looks to me as though we use them interchangeably, since the conditional result of Q 22 is used in the unconditional one of Q 24.

Is that correct, or is there a specific reason that makes conditioning on \( x_t \) necessary?

Thanks in advance for any help!

Best

Yann

Hi Yann,

You are right. Those are probably interchangeable; it is a matter of notational accuracy. For any expectation, you can always specify which source of randomness it is taken over. In SGD, this could be the randomness of the last stochastic gradient only, or the randomness of all stochastic gradients observed so far. Conditioning on \( x_t \) makes it clear that we only care about the randomness of the current stochastic gradient. In Q 24 this isn't specified in the formula; maybe the writer expects it to be clear from the context.
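To make the distinction concrete, here is a minimal sketch on a made-up toy problem (not the exam's setting): the functions \( f_i(x) = \tfrac{1}{2}(x - a_i)^2 \), the values in `a`, the probabilities `p`, the step size `lr`, and the start point `x0` are all assumptions chosen for illustration. It computes the conditional second moment given \( x_t \), where only the current index is random, and the total second moment over all indices sampled so far, and checks that the two are linked by the tower property \( \mathbb{E}[\|g_t\|^2] = \mathbb{E}\big[\mathbb{E}[\|g_t\|^2 \mid x_t]\big] \).

```python
from itertools import product

# Hypothetical toy problem: f_i(x) = 0.5 * (x - a[i])**2, so grad f_i(x) = x - a[i].
a = [1.0, -2.0, 4.0]
p = [0.5, 0.25, 0.25]  # sampling probabilities p_i (assumed values)
lr = 0.1               # step size (assumed value)
x0 = 0.0               # starting point (assumed value)


def grad(i, x):
    """Stochastic gradient g_t when index i is sampled at point x."""
    return x - a[i]


def cond_second_moment(x):
    """E[g_t^2 | x_t = x]: only the current index i is random."""
    return sum(p[i] * grad(i, x) ** 2 for i in range(len(a)))


def iterates_after(t):
    """Yield (probability, x_t) for every index sequence i_0, ..., i_{t-1}."""
    for seq in product(range(len(a)), repeat=t):
        prob, x = 1.0, x0
        for i in seq:
            prob *= p[i]
            x -= lr * grad(i, x)
        yield prob, x


def total_second_moment(t):
    """E[g_t^2] over all randomness, i.e. the indices i_0, ..., i_t."""
    return sum(
        prob * p[i] * grad(i, x) ** 2
        for prob, x in iterates_after(t)
        for i in range(len(a))
    )


def tower_second_moment(t):
    """E[E[g_t^2 | x_t]]: average the conditional value over x_t."""
    return sum(prob * cond_second_moment(x) for prob, x in iterates_after(t))


if __name__ == "__main__":
    t = 3
    print("total:", total_second_moment(t))
    print("tower:", tower_second_moment(t))
```

The conditional quantity is a function of \( x_t \) (it changes when `x` changes), whereas the total expectation is a single number; for a fixed \( x_t \) the formula \( \sum_i p_i \|\cdot\|^2 \) is the same in both cases, which is why the two notations are easy to mix up.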

Does that help?

Thank you, that's way clearer!
