Probabilistic Latent Semantic Analysis (PLSA)
Paper note
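We have:
- Document set $D = \{d_1, \dots, d_N\}$
- Vocabulary set $V$
- Topics $\theta_1, \dots, \theta_K$
We want to find:
- Topic word distribution: the word distribution of each topic, $p(w \mid \theta_j)$, for all $w \in V$ and $j = 1, \dots, K$, with the constraint $\sum_{w \in V} p(w \mid \theta_j) = 1$.
- Document topic coverage: the coverage of each topic in each document, $\pi_{d,j} = p(\theta_j \mid d)$, for all $d \in D$ and $j = 1, \dots, K$, with the constraint $\sum_{j=1}^{K} \pi_{d,j} = 1$.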
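For the rest of the note, I skip the background distribution to keep the notation simple. We define the parameter set $\Lambda = \{p(w \mid \theta_j), \pi_{d,j}\}$ for $w \in V$, $d \in D$, $j = 1, \dots, K$. Writing $c(w, d)$ for the number of times word $w$ occurs in document $d$, the marginal log-likelihood of this problem can be written as follows:
$$\log p(D \mid \Lambda) = \sum_{d \in D} \sum_{w \in V} c(w, d) \log \sum_{j=1}^{K} \pi_{d,j} \, p(w \mid \theta_j)$$
Here $p(w \mid \theta_j, d) = p(w \mid \theta_j)$, since the word generation process depends on the topic rather than on the document.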
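To make the likelihood concrete, here is a minimal sketch in plain Python. The function name `log_likelihood` and the nested-list layout (`counts[d][w]`, `pi[d][j]`, `p_w[j][w]`) are my own assumptions for illustration, not notation from the paper.

```python
import math

def log_likelihood(counts, pi, p_w):
    """Log-likelihood of the PLSA mixture without a background model.

    counts[d][w]: count of word w in document d (hypothetical layout)
    pi[d][j]:     coverage of topic j in document d
    p_w[j][w]:    probability of word w under topic j
    """
    total = 0.0
    for d, row in enumerate(counts):
        for w, c in enumerate(row):
            if c == 0:
                continue  # words that do not occur contribute nothing
            # Mixture probability of word w in document d.
            mix = sum(pi[d][j] * p_w[j][w] for j in range(len(p_w)))
            total += c * math.log(mix)
    return total

# Toy data: 2 documents, 3 words, 2 topics.
counts = [[2, 1, 0], [0, 1, 3]]
pi = [[0.5, 0.5], [0.5, 0.5]]
p_w = [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]
ll = log_likelihood(counts, pi, p_w)
```

The sketch assumes every word with a nonzero count has nonzero probability under at least one topic, otherwise the logarithm is undefined.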
E-step
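Using Bayes' rule, the posterior probability that word $w$ in document $d$ was generated by topic $\theta_j$ (with $z_{d,w}$ denoting the hidden topic assignment) is:
$$p(z_{d,w} = j) = \frac{\pi_{d,j} \, p(w \mid \theta_j)}{\sum_{j'=1}^{K} \pi_{d,j'} \, p(w \mid \theta_{j'})}$$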
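The E-step can be sketched as follows. This is an illustrative plain-Python version; the name `e_step` and the list layout (`pi[d][j]`, `p_w[j][w]`, `post[d][w][j]`) are assumptions of mine.

```python
def e_step(pi, p_w):
    """Return post[d][w][j], the posterior probability of topic j
    for word w in document d, computed by Bayes' rule.

    Assumes all parameters are strictly positive, so the
    normalizer is never zero.
    """
    n_topics = len(p_w)
    n_words = len(p_w[0])
    post = []
    for pi_d in pi:
        doc_post = []
        for w in range(n_words):
            # Unnormalized joint: coverage of topic j times p(w | topic j).
            joint = [pi_d[j] * p_w[j][w] for j in range(n_topics)]
            norm = sum(joint)
            doc_post.append([x / norm for x in joint])
        post.append(doc_post)
    return post

# Toy parameters: 2 documents, 3 words, 2 topics.
pi = [[0.5, 0.5], [0.9, 0.1]]
p_w = [[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]
post = e_step(pi, p_w)
```

Each `post[d][w]` is a distribution over topics, so its entries sum to one.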
M-step
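Using Lagrange multipliers $\alpha_d$ and $\beta_j$ to enforce the two normalization constraints, we maximize the expected complete-data log-likelihood, weighted by the word counts $c(w, d)$ and the E-step posteriors $p(z_{d,w} = j)$:
$$\sum_{d \in D} \sum_{w \in V} c(w, d) \sum_{j=1}^{K} p(z_{d,w} = j) \log \left[ \pi_{d,j} \, p(w \mid \theta_j) \right] + \sum_{d \in D} \alpha_d \left( 1 - \sum_{j=1}^{K} \pi_{d,j} \right) + \sum_{j=1}^{K} \beta_j \left( 1 - \sum_{w \in V} p(w \mid \theta_j) \right)$$
Setting the partial derivatives with respect to $\pi_{d,j}$ and $p(w \mid \theta_j)$ to zero, we get:
$$\pi_{d,j} = \frac{\sum_{w \in V} c(w, d) \, p(z_{d,w} = j)}{\sum_{j'=1}^{K} \sum_{w \in V} c(w, d) \, p(z_{d,w} = j')}$$
$$p(w \mid \theta_j) = \frac{\sum_{d \in D} c(w, d) \, p(z_{d,w} = j)}{\sum_{w' \in V} \sum_{d \in D} c(w', d) \, p(z_{d,w'} = j)}$$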
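A small sketch of the M-step updates may help: both are normalized sums of expected counts, that is, word counts weighted by the topic posteriors. The name `m_step` and the list layout (`counts[d][w]`, `post[d][w][j]`) are hypothetical choices for this illustration.

```python
def m_step(counts, post):
    """Re-estimate (pi, p_w) from expected counts counts[d][w] * post[d][w][j].

    Assumes every document is non-empty and every topic receives
    some posterior mass, so no normalizer is zero.
    """
    n_docs, n_words = len(counts), len(counts[0])
    n_topics = len(post[0][0])

    # Document topic coverage: normalize over topics within each document.
    pi = []
    for d in range(n_docs):
        mass = [sum(counts[d][w] * post[d][w][j] for w in range(n_words))
                for j in range(n_topics)]
        total = sum(mass)
        pi.append([m / total for m in mass])

    # Topic word distribution: normalize over words within each topic.
    p_w = []
    for j in range(n_topics):
        mass = [sum(counts[d][w] * post[d][w][j] for d in range(n_docs))
                for w in range(n_words)]
        total = sum(mass)
        p_w.append([m / total for m in mass])
    return pi, p_w

# Toy case with hard assignments: document 0 is all topic 0, document 1 all topic 1.
counts = [[2, 1, 0], [0, 1, 3]]
post = [
    [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
]
pi, p_w = m_step(counts, post)
```

With hard assignments like these, each topic's word distribution reduces to the relative word frequencies of the document assigned to it.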
Summary
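- Initialize $\pi_{d,j}^{(0)}$ and $p^{(0)}(w \mid \theta_j)$ for $d \in D$, $w \in V$, $j = 1, \dots, K$, so that both normalization constraints hold.
For steps $m = 1, 2, \dots$, do the following:
E-step: For $d \in D$, $w \in V$ and $j = 1, \dots, K$:
$$p^{(m)}(z_{d,w} = j) = \frac{\pi_{d,j}^{(m-1)} \, p^{(m-1)}(w \mid \theta_j)}{\sum_{j'=1}^{K} \pi_{d,j'}^{(m-1)} \, p^{(m-1)}(w \mid \theta_{j'})}$$
M-step:
$$\pi_{d,j}^{(m)} = \frac{\sum_{w \in V} c(w, d) \, p^{(m)}(z_{d,w} = j)}{\sum_{j'=1}^{K} \sum_{w \in V} c(w, d) \, p^{(m)}(z_{d,w} = j')}$$
$$p^{(m)}(w \mid \theta_j) = \frac{\sum_{d \in D} c(w, d) \, p^{(m)}(z_{d,w} = j)}{\sum_{w' \in V} \sum_{d \in D} c(w', d) \, p^{(m)}(z_{d,w'} = j)}$$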
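Putting the summary together, the whole procedure can be sketched as one short loop. This is a minimal plain-Python illustration under my own assumptions: the names `plsa` and `normalize`, the random initialization, the fixed iteration count, and the uniform fallback for an all-zero vector are not from the paper.

```python
import math
import random

def normalize(xs):
    """Scale xs to sum to one; fall back to uniform if all mass is zero
    (an assumption made here only to avoid division by zero)."""
    s = sum(xs)
    if s == 0:
        return [1.0 / len(xs)] * len(xs)
    return [x / s for x in xs]

def plsa(counts, n_topics, n_iters=50, seed=0):
    """Run EM for PLSA (no background model) on counts[d][w]."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    D, W, K = range(n_docs), range(n_words), range(n_topics)

    # Random strictly positive initialization satisfying both constraints.
    pi = [normalize([rng.random() + 1e-3 for _ in K]) for _ in D]
    p_w = [normalize([rng.random() + 1e-3 for _ in W]) for _ in K]

    history = []  # log-likelihood after each iteration
    for _ in range(n_iters):
        # E-step: posterior over topics for every (document, word) pair.
        post = [[normalize([pi[d][j] * p_w[j][w] for j in K]) for w in W]
                for d in D]
        # M-step: normalized expected counts.
        pi = [normalize([sum(counts[d][w] * post[d][w][j] for w in W)
                         for j in K]) for d in D]
        p_w = [normalize([sum(counts[d][w] * post[d][w][j] for d in D)
                          for w in W]) for j in K]
        # Track the log-likelihood; EM should never decrease it.
        ll = sum(counts[d][w] * math.log(sum(pi[d][j] * p_w[j][w] for j in K))
                 for d in D for w in W if counts[d][w] > 0)
        history.append(ll)
    return pi, p_w, history

# Toy corpus: 4 documents, 4 words, with two loosely separated word groups.
counts = [[4, 3, 0, 0], [3, 4, 1, 0], [0, 0, 4, 3], [0, 1, 3, 4]]
pi, p_w, history = plsa(counts, n_topics=2)
```

Checking that `history` is non-decreasing is a cheap sanity test for any EM implementation. In practice one would stop once the improvement falls below a tolerance rather than run a fixed number of iterations.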