task C multi-variance likelihood
According to the given information, the log likelihood should be proportional to (data.T×covariance×data), which cannot be computed in our case due to dimension issues. But the solution says data*(data×covariance) will work and give an n×d matrix. I don't understand why we can make such an alteration. Can someone be so kind as to explain this to me? Thanks. (× stands for matrix multiplication; * is the elementwise product of the corresponding elements.)
Please note that this formula is specified with respect to a single example of size (d,1). When batching multiple examples together in a matrix of size (n,d), the formulas need to be adapted to accommodate this organization of the datapoints.
A similar question (& solution) can be found here: http://oknoname.herokuapp.com/forum/topic/46/exercise-3-formula/
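To make the adaptation concrete, here is a minimal sketch (the names `data`, `cov`, and `inv_cov` are illustrative assumptions, not the exercise's actual variable names): applied naively to the (n,d) batch, the single-sample formula produces the wrong shape, whereas repeating it per row gives one scalar per sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3
data = rng.normal(size=(n, d))              # n samples stacked as rows, d features each
cov = np.eye(d) + 0.1 * np.ones((d, d))     # an arbitrary (d, d) SPD covariance
inv_cov = np.linalg.inv(cov)

# Single-sample formula: for one (d,) vector a, a.T @ inv_cov @ a is a scalar.
# Batched naively, data.T @ inv_cov @ data would be (d, d) -- not n per-sample values.
# Repeating the single-sample formula once per row gives one scalar per sample:
quad_loop = np.array([a @ inv_cov @ a for a in data])   # shape (n,)
print(quad_loop.shape)  # (5,)
```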
@arnout_jan_J, thanks for the explanation. Let me rephrase it: an n×d matrix means there are n samples, each with d features, assembled together. For a single sample A, its log-likelihood is proportional to A.T×cov^{-1}×A, where A's dimension is (d,1) and cov's dimension is (d,d); the final result is a single number. Then for n samples, that calculation is repeated n times, giving a log-likelihood array of dimension (n,). And in Python we can achieve this n-fold calculation by using matrix multiplication with np.dot(residual, inverse covariance) and the elementwise product with residual * np.dot(XX). Is that correct?
@jiaxi,

For a single sample A, its log-likelihood is proportional to A.T×cov^{-1}×A, where A's dimension is (d,1) and cov's dimension is (d,d); the final result is a single number.

Correct. Note that for a single (column vector) sample the lowercase notation a (instead of A) is preferred.

Then for n samples, that calculation is repeated n times, giving a log-likelihood array of dimension (n,).

Correct.

And in Python we can achieve this n-fold calculation by using matrix multiplication with np.dot(residual, inverse covariance) and the elementwise product with residual * np.dot(XX). Is that correct?

Z = np.dot(residual, inverse covariance) --> Yes
residual * np.dot(XX) --> No; looking at the solution, it should be np.sum(dxm * Z, axis=1) (where * is elementwise multiplication!)
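As a sanity check of that last expression, here is a small sketch (the names `dxm` for the residual matrix and `inv_cov` for the inverse covariance follow the thread; the data values are made up): the vectorized np.sum(dxm * Z, axis=1) reproduces the n per-sample quadratic forms a.T×cov^{-1}×a exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 2
dxm = rng.normal(size=(n, d))               # residuals (data minus mean), one row per sample
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])                # a (d, d) SPD covariance
inv_cov = np.linalg.inv(cov)

Z = np.dot(dxm, inv_cov)                    # (n, d): row i is a_i.T @ inv_cov
quad = np.sum(dxm * Z, axis=1)              # (n,): elementwise product, then row sums

# Equivalent per-sample loop over the quadratic forms a.T @ inv_cov @ a:
quad_loop = np.array([a @ inv_cov @ a for a in dxm])
print(np.allclose(quad, quad_loop))         # True
```

Row i of Z is a_i.T×cov^{-1}; multiplying it elementwise by a_i and summing over the d columns yields a_i.T×cov^{-1}×a_i, which is why the elementwise product plus row-sum replaces the per-sample matrix product.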