Hello,
I have one more question about LDA. I understand that the goal is to extract a number of topics from a corpus of documents without knowing the topics in advance. LDA will "produce" the \(\alpha \) vectors for each document (the distribution of topics in that document) and the \(\beta \) vectors for each topic (the distribution of words in that topic). Now, my questions are:

Since LDA is a probabilistic model, can we use it to generate new documents? If so, how should we proceed?

Can we use LDA directly to predict the topics of a new document? Or is that not the goal of the model, and should we use some alternative processing/modelling instead?

Finally, in homework 6, exercise 2 part b, we have to check whether \(P(G = 1 \mid H = 1) > P(G = 1 \mid H = 1, B = 1)\). This problem can be solved numerically, using the probabilities from the previous point. But can we also solve it intuitively? I would say yes: if we know that \(H=1\), this increases the chance of \(C=1\) and therefore the chance of \(G=1\). Now, if we also know that \(B=1\), then \(C=1\) is less likely, as we already have a way to explain why \(H=1\) (namely B). Therefore the second probability is smaller than the first. Is that reasoning correct?

I think this is correct only if the variables are "positively correlated", i.e. if, for instance, \(B=1\) increases the chance of \(H=1\). Is it possible, in Bayesian networks, to have variables that are "negatively correlated", where for instance knowing \(B=1\) increases the chance of \(H=0\)? If so, we cannot use the "intuitive" way of comparing the probabilities, right?

@alessio5 said:
Hello,
I have one more question about LDA. I understand that the goal is to extract a number of topics from a corpus of documents without knowing the topics in advance. LDA will "produce" the \(\alpha \) vectors for each document (the distribution of topics in that document) and the \(\beta \) vectors for each topic (the distribution of words in that topic). Now, my questions are:

Since LDA is a probabilistic model, can we use it to generate new documents? If so, how should we proceed?

We can generate a new document. Note that the \(\alpha \) parameter is not per document; it is a single vector for the whole model. To sample a new document, we first sample the document's topic distribution from the Dirichlet prior with parameter \(\alpha \). Then, to generate each word of the document, we first sample that word's topic from the document's topic distribution, and then sample a word from that topic's word distribution.
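The three sampling steps above can be sketched in a few lines. The vocabulary and the \(\alpha \) and \(\beta \) values below are made-up toy numbers for illustration, not parameters from any fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model parameters (toy values, two topics over six words):
vocab = ["gene", "dna", "cell", "ball", "game", "team"]
K = 2                              # number of topics
alpha = np.array([0.5, 0.5])       # Dirichlet prior: one vector for the whole model
beta = np.array([                  # per-topic word distributions, rows sum to 1
    [0.40, 0.30, 0.25, 0.02, 0.02, 0.01],
    [0.01, 0.02, 0.02, 0.30, 0.30, 0.35],
])

def sample_document(n_words):
    # 1. Draw this document's topic distribution from the Dirichlet prior.
    theta = rng.dirichlet(alpha)
    words = []
    for _ in range(n_words):
        # 2. Draw a topic for this word position from theta.
        z = rng.choice(K, p=theta)
        # 3. Draw the word itself from that topic's word distribution.
        w = rng.choice(len(vocab), p=beta[z])
        words.append(vocab[w])
    return words

print(sample_document(8))
```

Because \(\theta \) is drawn once per document, a generated document tends to be dominated by a few topics rather than mixing all of them uniformly.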

Can we use LDA directly to predict the topics of a new document? Or is that not the goal of the model, and should we use some alternative processing/modelling instead?

I assume you mean obtaining the topic distribution for a new document. Obviously, one way is to include the new document in the corpus and rerun the inference. There are ways to make this more efficient and to update the model in an online manner (see "Online Learning for Latent Dirichlet Allocation"). You could also fix the topic-word distributions and infer the document's topic distribution for new documents by running just the E-step of the EM algorithm.
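The fixed-topics ("fold-in") variant can be sketched as a small iterative procedure: keep \(\beta \) frozen and alternate between computing each word's topic responsibilities and re-estimating \(\theta \). The \(\beta \) matrix below and the simple maximum-likelihood update are illustrative assumptions, not the course's exact E-step:

```python
import numpy as np

# Hypothetical fixed topic-word distributions (rows: topics, cols: vocabulary).
beta = np.array([
    [0.40, 0.30, 0.25, 0.02, 0.02, 0.01],
    [0.01, 0.02, 0.02, 0.30, 0.30, 0.35],
])

def infer_theta(word_ids, n_iter=50):
    """Estimate a new document's topic distribution theta
    while keeping the topic-word distributions beta fixed."""
    K = beta.shape[0]
    theta = np.full(K, 1.0 / K)                  # start from a uniform topic mix
    for _ in range(n_iter):
        # Responsibility of topic k for each word: q(z=k | w) ∝ theta_k * beta[k, w]
        q = theta[:, None] * beta[:, word_ids]
        q /= q.sum(axis=0, keepdims=True)
        # Re-estimate theta as the average topic responsibility over the words.
        theta = q.sum(axis=1) / q.sum()
    return theta

# A document dominated by "sports" words (vocabulary indices 3, 4, 5):
theta = infer_theta(np.array([3, 4, 5, 5, 0]))
print(theta)   # most of the mass lands on topic 1
```

A MAP variant would add the Dirichlet pseudo-counts from \(\alpha \) to the theta update; the structure of the loop stays the same.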

Finally, in homework 6, exercise 2 part b, we have to check whether \(P(G = 1 \mid H = 1) > P(G = 1 \mid H = 1, B = 1)\). This problem can be solved numerically, using the probabilities from the previous point. But can we also solve it intuitively? I would say yes: if we know that \(H=1\), this increases the chance of \(C=1\) and therefore the chance of \(G=1\). Now, if we also know that \(B=1\), then \(C=1\) is less likely, as we already have a way to explain why \(H=1\) (namely B). Therefore the second probability is smaller than the first. Is that reasoning correct?

I think this is correct only if the variables are "positively correlated", i.e. if, for instance, \(B=1\) increases the chance of \(H=1\). Is it possible, in Bayesian networks, to have variables that are "negatively correlated", where for instance knowing \(B=1\) increases the chance of \(H=0\)? If so, we cannot use the "intuitive" way of comparing the probabilities, right?

The variables in a Bayesian network can be negatively correlated, positively correlated, or uncorrelated; it depends on the particular distribution. You may be able to apply the same intuition even in the case of negative correlation.
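As a sanity check of the explaining-away reasoning from the question, here is a toy network with the same structure (B and C are parents of H, and C is a parent of G) evaluated by brute-force enumeration; all CPT numbers are invented for illustration and are not the homework's values:

```python
import itertools

# Invented CPTs for a network B -> H <- C, C -> G (binary variables).
P_B1 = 0.3
P_C1 = 0.4
P_H1 = {(1, 1): 0.95, (1, 0): 0.80, (0, 1): 0.70, (0, 0): 0.05}  # P(H=1 | B=b, C=c)
P_G1 = {1: 0.9, 0: 0.2}                                          # P(G=1 | C=c)

def joint(b, c, h, g):
    """Joint probability of one full assignment, factored along the network."""
    p = (P_B1 if b else 1 - P_B1) * (P_C1 if c else 1 - P_C1)
    p *= P_H1[(b, c)] if h else 1 - P_H1[(b, c)]
    p *= P_G1[c] if g else 1 - P_G1[c]
    return p

def cond(query, evidence):
    """P(query | evidence) by summing the joint over all 2^4 assignments."""
    num = den = 0.0
    for b, c, h, g in itertools.product([0, 1], repeat=4):
        world = {"B": b, "C": c, "H": h, "G": g}
        if all(world[v] == x for v, x in evidence.items()):
            p = joint(b, c, h, g)
            den += p
            if all(world[v] == x for v, x in query.items()):
                num += p
    return num / den

p1 = cond({"G": 1}, {"H": 1})
p2 = cond({"G": 1}, {"H": 1, "B": 1})
print(p1, p2)   # observing B=1 "explains away" H=1, so p1 > p2
```

A negatively correlated variant would simply use a CPT with, e.g., \(P(H=1 \mid B=1, C=0) < P(H=1 \mid B=0, C=0)\); the same enumeration then shows directly whether the inequality still holds for those numbers.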

## LDA and Bayesian networks

