Could you please give more details on the complexity of the K cross validation ?
We are training a dataset on K times on the union of K-1 groups of N/K samples. It's not very clear whether the complexity should be O(K) or O(K(K-1)).
Take N=K, we train on K-1 samples and test on 1 sample. We do it K times. Here we obviously have O(K(K-1)).
In the solution it says that we are training on (K-1)/K fraction of the data. I don't understand why we use the fraction of the data instead of the number of groups.
per k-fold, you train on n(k-1)/k of the data. then repeat k times, so you are training basically \(O( \frac{n(k-1)}{k}k) = O(nk) = O(k)\), ie the dependency on k is linear. The training size \(n\) does not change. This says that if we double the number of folds, then training time doubles. The number of groups is k, and each group represents k-1/k fraction of the original data. Also, there was a very similar thread earlier, so maybe check that out: https://www.oknoname.com/EPFL/CS433/topic/2944/a-few-solutions/
Cross validation complexity
Hi,
Could you please give more details on the complexity of the K cross validation ?
We are training a dataset on K times on the union of K-1 groups of N/K samples. It's not very clear whether the complexity should be O(K) or O(K(K-1)).
Take N=K, we train on K-1 samples and test on 1 sample. We do it K times. Here we obviously have O(K(K-1)).
In the solution it says that we are training on (K-1)/K fraction of the data. I don't understand why we use the fraction of the data instead of the number of groups.
Thank you
per k-fold, you train on n(k-1)/k of the data. then repeat k times, so you are training basically \(O( \frac{n(k-1)}{k}k) = O(nk) = O(k)\), ie the dependency on k is linear. The training size \(n\) does not change. This says that if we double the number of folds, then training time doubles. The number of groups is k, and each group represents k-1/k fraction of the original data. Also, there was a very similar thread earlier, so maybe check that out: https://www.oknoname.com/EPFL/CS433/topic/2944/a-few-solutions/
2
Add comment