Saturday, December 9, 2017

PLSA using EM

Probabilistic Latent Semantic Analysis (PLSA)

We have:

  • Documents set $D = \{d_1, d_2, \dots, d_N\}$
  • Vocabulary set $V = \{w_1, w_2, \dots, w_M\}$
  • $k$ topics $\theta_1, \theta_2, \dots, \theta_k$

We want to find:

  • Topic Word Distribution: word distribution $p(w \mid \theta_j)$ of topic $\theta_j$, for all $j = 1, \dots, k$ and $w \in V$, with the constraint $\sum_{w \in V} p(w \mid \theta_j) = 1$.
  • Document Topic Coverage: coverage $\pi_{d,j} = p(\theta_j \mid d)$ of each topic $\theta_j$ for document $d$, for all $d \in D$ and $j = 1, \dots, k$, with the constraint $\sum_{j=1}^{k} \pi_{d,j} = 1$.

For the rest of the note, I am skipping the background distribution for ease of mathematical notation. We define the parameter set $\Lambda = \{\pi_{d,j},\, p(w \mid \theta_j)\}$, for $d \in D$, $j = 1, \dots, k$, and $w \in V$. The marginal log-likelihood of this problem can be written as follows:

$$\log p(D \mid \Lambda) = \sum_{d \in D} \sum_{w \in V} c(w, d) \log \sum_{j=1}^{k} \pi_{d,j}\, p(w \mid \theta_j)$$

where $c(w, d)$ is the count of word $w$ in document $d$. Here, $p(w \mid \theta_j, d) = p(w \mid \theta_j)$, since the word generation process depends on the topic rather than on the document.
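As a concrete reference, here is a minimal NumPy sketch (my own illustration, not part of the original note) of how this log-likelihood can be evaluated from a term-document count matrix; the names `counts`, `pi`, and `topic_word` are assumed conventions reused in the sketches below.

```python
import numpy as np

def log_likelihood(counts, pi, topic_word):
    """Marginal log-likelihood of PLSA (background distribution skipped).

    counts     : (N, M) array, counts[d, w] = c(w, d)
    pi         : (N, k) array, pi[d, j] = pi_{d,j}, rows sum to 1
    topic_word : (k, M) array, topic_word[j, w] = p(w | theta_j), rows sum to 1
    """
    # mixture[d, w] = sum_j pi_{d,j} * p(w | theta_j)
    mixture = pi @ topic_word
    # sum_d sum_w c(w, d) * log mixture[d, w]; small epsilon guards against log(0)
    return np.sum(counts * np.log(mixture + 1e-12))
```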

E-step

Using Bayes' rule, the probability that an occurrence of word $w$ in document $d$ was generated by topic $\theta_j$ is

$$p(z_{d,w} = j) = \frac{\pi_{d,j}^{(m)}\, p^{(m)}(w \mid \theta_j)}{\sum_{j'=1}^{k} \pi_{d,j'}^{(m)}\, p^{(m)}(w \mid \theta_{j'})}$$
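A minimal NumPy sketch of this E-step, under the same assumed array layout as above (`pi` of shape (N, k), `topic_word` of shape (k, M)):

```python
import numpy as np

def e_step(pi, topic_word):
    """Compute p(z_{d,w} = j) for all documents d, words w, and topics j.

    pi         : (N, k) array of document-topic coverages pi_{d,j}
    topic_word : (k, M) array of topic-word distributions p(w | theta_j)
    Returns a (N, k, M) array z with z[d, j, w] = p(z_{d,w} = j).
    """
    # Numerator: pi_{d,j} * p(w | theta_j)
    joint = pi[:, :, None] * topic_word[None, :, :]   # (N, k, M)
    # Denominator: sum over topics j' for each (d, w) pair
    denom = joint.sum(axis=1, keepdims=True)          # (N, 1, M)
    return joint / (denom + 1e-12)
```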
M-step


The expected complete-data log-likelihood to maximize is

$$Q(\Lambda; \Lambda^{(m)}) = \sum_{d \in D} \sum_{w \in V} c(w, d) \sum_{j=1}^{k} p(z_{d,w} = j)\, \log\!\left[\pi_{d,j}\, p(w \mid \theta_j)\right]$$

Using Lagrange multipliers for the constraints $\sum_{j=1}^{k} \pi_{d,j} = 1$ and $\sum_{w \in V} p(w \mid \theta_j) = 1$, we get:

$$\pi_{d,j}^{(m+1)} = \frac{\sum_{w \in V} c(w, d)\, p(z_{d,w} = j)}{\sum_{j'=1}^{k} \sum_{w \in V} c(w, d)\, p(z_{d,w} = j')}$$

$$p^{(m+1)}(w \mid \theta_j) = \frac{\sum_{d \in D} c(w, d)\, p(z_{d,w} = j)}{\sum_{w' \in V} \sum_{d \in D} c(w', d)\, p(z_{d,w'} = j)}$$
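A minimal NumPy sketch of these M-step updates, again under the assumed array layout from the sketches above:

```python
import numpy as np

def m_step(counts, z):
    """Re-estimate pi_{d,j} and p(w | theta_j) from the E-step posteriors.

    counts : (N, M) array with counts[d, w] = c(w, d)
    z      : (N, k, M) array with z[d, j, w] = p(z_{d,w} = j)
    """
    # Weighted counts: c(w, d) * p(z_{d,w} = j)
    weighted = counts[:, None, :] * z                  # (N, k, M)

    # pi_{d,j}^{(m+1)}: normalize over topics within each document
    pi_num = weighted.sum(axis=2)                      # (N, k)
    pi = pi_num / pi_num.sum(axis=1, keepdims=True)

    # p^{(m+1)}(w | theta_j): normalize over the vocabulary within each topic
    tw_num = weighted.sum(axis=0)                      # (k, M)
    topic_word = tw_num / tw_num.sum(axis=1, keepdims=True)

    return pi, topic_word
```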

Summary

  • Initialize $\pi_{d,j}^{(0)}$ and $p^{(0)}(w \mid \theta_j)$ at random, normalized so that $\sum_{j=1}^{k} \pi_{d,j}^{(0)} = 1$ and $\sum_{w \in V} p^{(0)}(w \mid \theta_j) = 1$, for all $d \in D$, $j = 1, \dots, k$, and $w \in V$.

For steps $m = 1, 2, \dots$, do the following:

  • E-step: compute $p(z_{d,w} = j)$ for all $d \in D$, $w \in V$, and $j = 1, \dots, k$.

  • M-step: update $\pi_{d,j}^{(m+1)}$ and $p^{(m+1)}(w \mid \theta_j)$ using the formulas above (a sketch of the full loop follows below).
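Putting the summary together, here is a minimal driver loop, assuming the hypothetical `log_likelihood`, `e_step`, and `m_step` helpers sketched above are in scope; the iteration cap, tolerance, and random seeding are my own choices.

```python
import numpy as np

def plsa_em(counts, k, n_steps=100, tol=1e-6, seed=0):
    """Run PLSA via EM on a (N, M) term-document count matrix with k topics."""
    rng = np.random.default_rng(seed)
    N, M = counts.shape

    # Initialize pi_{d,j} and p(w | theta_j) at random, normalized to sum to 1
    pi = rng.random((N, k))
    pi /= pi.sum(axis=1, keepdims=True)
    topic_word = rng.random((k, M))
    topic_word /= topic_word.sum(axis=1, keepdims=True)

    prev_ll = -np.inf
    for _ in range(n_steps):
        z = e_step(pi, topic_word)          # E-step: posteriors p(z_{d,w} = j)
        pi, topic_word = m_step(counts, z)  # M-step: updated parameters
        ll = log_likelihood(counts, pi, topic_word)
        if ll - prev_ll < tol:              # stop once the likelihood stops improving
            break
        prev_ll = ll
    return pi, topic_word
```

Note that the dense $(N, k, M)$ responsibility array is convenient for exposition but memory-hungry; practical implementations typically loop over documents or exploit the sparsity of the count matrix.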
