When using Hoeffding's inequality in Lecture 4a to estimate how the generalization-error estimate improves with larger test datasets, we need to assume that the loss at each point is a random variable bounded in [a, b]. But a loss such as the MSE can, in theory, take any value in [0, infty). Does this part refer to the actual realizations rather than to the random variables themselves?
Indeed, some losses used in practice are unbounded (mostly in regression). There is a more general form of Hoeffding's inequality that applies to sub-Gaussian random variables, but these are well-behaved distributions that are not necessarily representative of real data distributions. You can also look at other concentration inequalities (e.g. Azuma for martingales, McDiarmid for functions that are not simple sums). Statistical learning theory mostly focuses on binary classification, where the loss is bounded and this problem does not arise.
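To make the bounded case concrete, here is a minimal sketch of the two-sided Hoeffding bound for i.i.d. losses in [a, b], P(|mean − E| ≥ t) ≤ 2 exp(−2 n t² / (b − a)²), and of the test-set size it implies for a given accuracy t and confidence level. The function names are my own for illustration, not from the lecture:

```python
import math

def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Two-sided Hoeffding bound P(|empirical mean - expectation| >= t)
    for n i.i.d. losses bounded in [a, b]."""
    return 2.0 * math.exp(-2.0 * n * t**2 / (b - a) ** 2)

def n_required(t, delta, a=0.0, b=1.0):
    """Smallest test-set size n so that the Hoeffding bound is <= delta."""
    return math.ceil((b - a) ** 2 * math.log(2.0 / delta) / (2.0 * t**2))
```

For a 0-1 loss (a = 0, b = 1), asking for accuracy t = 0.05 with confidence 95% (delta = 0.05) gives n_required(0.05, 0.05) = 738 test points; note this breaks down immediately if b − a is unbounded, which is exactly the issue raised in the question.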