Saturday, December 9, 2017

PLSA using EM

Probabilistic Latent Semantic Analysis (PLSA)

We have:

  • Documents set $D = \{d_1, d_2, \dots, d_N\}$
  • Vocabulary set $V = \{w_1, w_2, \dots, w_M\}$
  • $k$ topics $\theta_1, \theta_2, \dots, \theta_k$

We want to find:

  • Topic Word Distribution: word distribution $p(w \mid \theta_j)$ of topic $\theta_j$, for all $j = 1, \dots, k$ and $w \in V$, with the constraint $\sum_{w \in V} p(w \mid \theta_j) = 1$.
  • Document Topic Coverage: coverage $\pi_{d,j} = p(\theta_j \mid d)$ of each topic $\theta_j$ for document $d$, for all $d \in D$ and $j = 1, \dots, k$, with the constraint $\sum_{j=1}^{k} \pi_{d,j} = 1$.

For the rest of the note, I am skipping the background distribution for ease of mathematical notation. We define the parameter set $\Lambda = \{\pi_{d,j},\, p(w \mid \theta_j)\}$, for $d \in D$, $j = 1, \dots, k$, and $w \in V$. The marginal log-likelihood of this problem can be written as follows:

$$\log p(D \mid \Lambda) = \sum_{d \in D} \sum_{w \in V} c(w, d) \log \sum_{j=1}^{k} \pi_{d,j}\, p(w \mid \theta_j)$$

where $c(w, d)$ is the count of word $w$ in document $d$. Here, $p(w \mid \theta_j, d) = p(w \mid \theta_j)$, since the word generation process depends on the topic rather than on the document.
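As a concrete reference, here is a minimal NumPy sketch (my own illustration, not part of the original note) of how this log-likelihood can be evaluated from a term-document count matrix; the names `counts`, `pi`, and `topic_word` are assumed conventions reused in the sketches below.

```python
import numpy as np

def log_likelihood(counts, pi, topic_word):
    """Marginal log-likelihood of PLSA (background distribution skipped).

    counts     : (N, M) array, counts[d, w] = c(w, d)
    pi         : (N, k) array, pi[d, j] = pi_{d,j}, rows sum to 1
    topic_word : (k, M) array, topic_word[j, w] = p(w | theta_j), rows sum to 1
    """
    # mixture[d, w] = sum_j pi_{d,j} * p(w | theta_j)
    mixture = pi @ topic_word
    # sum_d sum_w c(w, d) * log mixture[d, w]; small epsilon guards against log(0)
    return np.sum(counts * np.log(mixture + 1e-12))
```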

E-step

Using Bayes' rule, the probability that an occurrence of word $w$ in document $d$ was generated by topic $\theta_j$ is

$$p(z_{d,w} = j) = \frac{\pi_{d,j}^{(m)}\, p^{(m)}(w \mid \theta_j)}{\sum_{j'=1}^{k} \pi_{d,j'}^{(m)}\, p^{(m)}(w \mid \theta_{j'})}$$
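A minimal NumPy sketch of this E-step, under the same assumed array layout as above (`pi` of shape (N, k), `topic_word` of shape (k, M)):

```python
import numpy as np

def e_step(pi, topic_word):
    """Compute p(z_{d,w} = j) for all documents d, words w, and topics j.

    pi         : (N, k) array of document-topic coverages pi_{d,j}
    topic_word : (k, M) array of topic-word distributions p(w | theta_j)
    Returns a (N, k, M) array z with z[d, j, w] = p(z_{d,w} = j).
    """
    # Numerator: pi_{d,j} * p(w | theta_j)
    joint = pi[:, :, None] * topic_word[None, :, :]   # (N, k, M)
    # Denominator: sum over topics j' for each (d, w) pair
    denom = joint.sum(axis=1, keepdims=True)          # (N, 1, M)
    return joint / (denom + 1e-12)
```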
M-step


The expected complete-data log-likelihood to maximize is

$$Q(\Lambda; \Lambda^{(m)}) = \sum_{d \in D} \sum_{w \in V} c(w, d) \sum_{j=1}^{k} p(z_{d,w} = j)\, \log\!\left[\pi_{d,j}\, p(w \mid \theta_j)\right]$$

Using Lagrange multipliers for the constraints $\sum_{j=1}^{k} \pi_{d,j} = 1$ and $\sum_{w \in V} p(w \mid \theta_j) = 1$, we get:

$$\pi_{d,j}^{(m+1)} = \frac{\sum_{w \in V} c(w, d)\, p(z_{d,w} = j)}{\sum_{j'=1}^{k} \sum_{w \in V} c(w, d)\, p(z_{d,w} = j')}$$

$$p^{(m+1)}(w \mid \theta_j) = \frac{\sum_{d \in D} c(w, d)\, p(z_{d,w} = j)}{\sum_{w' \in V} \sum_{d \in D} c(w', d)\, p(z_{d,w'} = j)}$$
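A minimal NumPy sketch of these M-step updates, again under the assumed array layout from the sketches above:

```python
import numpy as np

def m_step(counts, z):
    """Re-estimate pi_{d,j} and p(w | theta_j) from the E-step posteriors.

    counts : (N, M) array with counts[d, w] = c(w, d)
    z      : (N, k, M) array with z[d, j, w] = p(z_{d,w} = j)
    """
    # Weighted counts: c(w, d) * p(z_{d,w} = j)
    weighted = counts[:, None, :] * z                  # (N, k, M)

    # pi_{d,j}^{(m+1)}: normalize over topics within each document
    pi_num = weighted.sum(axis=2)                      # (N, k)
    pi = pi_num / pi_num.sum(axis=1, keepdims=True)

    # p^{(m+1)}(w | theta_j): normalize over the vocabulary within each topic
    tw_num = weighted.sum(axis=0)                      # (k, M)
    topic_word = tw_num / tw_num.sum(axis=1, keepdims=True)

    return pi, topic_word
```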

Summary

  • Initialize $\pi_{d,j}^{(0)}$ and $p^{(0)}(w \mid \theta_j)$ at random, normalized so that $\sum_{j=1}^{k} \pi_{d,j}^{(0)} = 1$ and $\sum_{w \in V} p^{(0)}(w \mid \theta_j) = 1$, for all $d \in D$, $j = 1, \dots, k$, and $w \in V$.

For steps $m = 1, 2, \dots$, do the following:

  • E-step: compute $p(z_{d,w} = j)$ for all $d \in D$, $w \in V$, and $j = 1, \dots, k$.

  • M-step: update $\pi_{d,j}^{(m+1)}$ and $p^{(m+1)}(w \mid \theta_j)$ using the formulas above (a sketch of the full loop follows below).
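Putting the summary together, here is a minimal driver loop, assuming the hypothetical `log_likelihood`, `e_step`, and `m_step` helpers sketched above are in scope; the iteration cap, tolerance, and random seeding are my own choices.

```python
import numpy as np

def plsa_em(counts, k, n_steps=100, tol=1e-6, seed=0):
    """Run PLSA via EM on a (N, M) term-document count matrix with k topics."""
    rng = np.random.default_rng(seed)
    N, M = counts.shape

    # Initialize pi_{d,j} and p(w | theta_j) at random, normalized to sum to 1
    pi = rng.random((N, k))
    pi /= pi.sum(axis=1, keepdims=True)
    topic_word = rng.random((k, M))
    topic_word /= topic_word.sum(axis=1, keepdims=True)

    prev_ll = -np.inf
    for _ in range(n_steps):
        z = e_step(pi, topic_word)          # E-step: posteriors p(z_{d,w} = j)
        pi, topic_word = m_step(counts, z)  # M-step: updated parameters
        ll = log_likelihood(counts, pi, topic_word)
        if ll - prev_ll < tol:              # stop once the likelihood stops improving
            break
        prev_ll = ll
    return pi, topic_word
```

Note that the dense $(N, k, M)$ responsibility array is convenient for exposition but memory-hungry; practical implementations typically loop over documents or exploit the sparsity of the count matrix.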
