Gaussian Mixture Model (GMM)
Similar to K-means clustering, but with a probabilistic flavor. That is, each point can belong to more than one cluster, and its membership is described by a probability distribution.
In the figure above, we will cluster the points using a mixture of 3 Gaussian distributions of the form $\mathcal{N}(x \mid \mu_k, \sigma_k^2)$, $k = 1, 2, 3$. The proportions of the Gaussians are given by $\pi_1, \pi_2, \pi_3$. The likelihood of a single point is defined as follows:

$$p(x_i \mid \theta) = \sum_{k=1}^{3} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)$$
We want to maximize the likelihood $p(X \mid \theta) = \prod_{i=1}^{N} p(x_i \mid \theta)$, where $\theta = \{\pi_k, \mu_k, \sigma_k^2\}$, $k = 1, 2, 3$. We have the following constraint:

$$\sum_{k=1}^{3} \pi_k = 1, \qquad \pi_k \ge 0$$
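To make this concrete, here is a minimal sketch (not from the original post) of evaluating this likelihood for a single 1-D point in Python; the function name `gmm_likelihood` and the example parameter values are illustrative only:

```python
import numpy as np
from scipy.stats import norm

def gmm_likelihood(x, pi, mu, sigma):
    """Likelihood p(x | theta) of a single 1-D point under a Gaussian mixture.

    pi, mu, sigma are length-K arrays (here K = 3) holding the mixture
    weights, means, and standard deviations of each component.
    """
    return np.sum(pi * norm.pdf(x, loc=mu, scale=sigma))

# Illustrative parameters: three components with weights summing to 1
pi = np.array([0.5, 0.3, 0.2])
mu = np.array([-2.0, 0.0, 3.0])
sigma = np.array([1.0, 0.5, 1.5])
print(gmm_likelihood(1.0, pi, mu, sigma))
```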
Due to the above constraint, it is difficult to solve the problem directly with methods like gradient descent. So, instead, we will use the EM algorithm to solve this problem.
We will convert the above likelihood function to a marginal likelihood function by introducing a latent variable $z_i \in \{1, 2, 3\}$ for each data point $x_i$. The marginal likelihood of the GMM in this example is as follows:

$$p(x_i \mid \theta) = \sum_{k=1}^{3} p(z_i = k) \, p(x_i \mid z_i = k, \theta) = \sum_{k=1}^{3} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)$$
Note that, here, we would in general have a total of $3N$ probability values $p(z_i = k)$ associated with the latent variables $z_i$. However, GMM is a special case in that these mixture weights are the same for all data points, so we will only have 3 values $\pi_1, \pi_2, \pi_3$. This is in contrast to methods like PLSA, where the mixture weights are different for each data point. See here for GMM vs. PLSA. In addition, in GMM, $p(x_i \mid z_i = k)$ is represented by a Gaussian.
GMM: Mixture weights are the same across all data points. Also, $p(x_i \mid z_i = k)$ is Gaussian.
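To make the marginal likelihood concrete, here is a small sketch that sums out the latent variable for each point; computing the sum in log space with `logsumexp` is just a common numerical-stability choice, not something prescribed by the post, and the names are illustrative:

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def gmm_log_likelihood(x, pi, mu, sigma):
    """Total log-likelihood: sum_i log sum_k pi_k * N(x_i | mu_k, sigma_k^2)."""
    # log of each term pi_k * N(x_i | mu_k, sigma_k^2), shape (N, K)
    log_terms = np.log(pi) + norm.logpdf(x[:, None], loc=mu, scale=sigma)
    # Marginalize over the latent variable z_i (sum over k) in log space
    return logsumexp(log_terms, axis=1).sum()
```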
E-step
For each data point $x_i$, compute the responsibility $\gamma_{ik} = p(z_i = k \mid x_i, \theta)$, i.e. the posterior probability that $x_i$ was generated by component $k$.

Using Bayes' rule,

$$\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)}{\sum_{j=1}^{3} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \sigma_j^2)}$$
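A minimal NumPy sketch of this E-step, assuming 1-D data; `gamma[i, k]` holds the responsibility of component $k$ for point $x_i$ (the variable names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def e_step(x, pi, mu, sigma):
    """Compute responsibilities gamma[i, k] = p(z_i = k | x_i, theta).

    x is a 1-D array of N data points; pi, mu, sigma are length-K arrays.
    """
    # Unnormalized posterior: pi_k * N(x_i | mu_k, sigma_k^2), shape (N, K)
    weighted = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)
    # Normalize each row so the responsibilities of a point sum to 1 (Bayes' rule)
    return weighted / weighted.sum(axis=1, keepdims=True)
```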
M-step
In the M-step, we maximize the expected complete-data log-likelihood $\sum_{i}\sum_{k} \gamma_{ik} \log\!\big(\pi_k \, p(x_i \mid z_i = k, \theta)\big)$ with the responsibilities held fixed. Using the assumptions specific to GMM, we can rewrite this expression as follows for 1-D input data points:

$$Q(\theta) = \sum_{i=1}^{N} \sum_{k=1}^{3} \gamma_{ik} \left[ \log \pi_k - \tfrac{1}{2}\log\!\big(2\pi\sigma_k^2\big) - \frac{(x_i - \mu_k)^2}{2\sigma_k^2} \right]$$
Now to determine $\mu_k$, we set the derivative of $Q$ with respect to $\mu_k$ to zero:

$$\frac{\partial Q}{\partial \mu_k} = \sum_{i=1}^{N} \gamma_{ik} \, \frac{x_i - \mu_k}{\sigma_k^2} = 0 \quad\Rightarrow\quad \mu_k = \frac{\sum_{i=1}^{N} \gamma_{ik}\, x_i}{\sum_{i=1}^{N} \gamma_{ik}}$$
Similarly, setting $\partial Q / \partial \sigma_k^2 = 0$ gives

$$\sigma_k^2 = \frac{\sum_{i=1}^{N} \gamma_{ik}\,(x_i - \mu_k)^2}{\sum_{i=1}^{N} \gamma_{ik}}$$
Solving for $\pi_k$ is similar; however, it also includes the constraints $\sum_{k} \pi_k = 1$ and $\pi_k \ge 0$, which can be handled with a Lagrange multiplier. The result is

$$\pi_k = \frac{1}{N} \sum_{i=1}^{N} \gamma_{ik}$$
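The three M-step updates can be written as one short routine; again a sketch with illustrative names, assuming the responsibilities `gamma` come from the E-step above:

```python
import numpy as np

def m_step(x, gamma):
    """Update (pi, mu, sigma) from 1-D data x and responsibilities gamma (shape N x K)."""
    nk = gamma.sum(axis=0)                       # effective number of points per component
    mu = (gamma * x[:, None]).sum(axis=0) / nk   # weighted means
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk  # weighted variances
    pi = nk / len(x)                             # mixture weights (sum to 1 by construction)
    return pi, mu, np.sqrt(var)
```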
Summary
- Initialize $\theta^{(0)} = \{\pi_k^{(0)}, \mu_k^{(0)}, \sigma_k^{2(0)}\}$ for $k = 1, 2, 3$.
- For steps $m = 1, 2, \dots$, do the following:
  - E-step: For each $i$ and $k$, compute $\gamma_{ik} = \dfrac{\pi_k^{(m-1)} \, \mathcal{N}(x_i \mid \mu_k^{(m-1)}, \sigma_k^{2(m-1)})}{\sum_{j} \pi_j^{(m-1)} \, \mathcal{N}(x_i \mid \mu_j^{(m-1)}, \sigma_j^{2(m-1)})}$.
  - M-step: Update $\mu_k^{(m)} = \frac{\sum_i \gamma_{ik} x_i}{\sum_i \gamma_{ik}}$, $\sigma_k^{2(m)} = \frac{\sum_i \gamma_{ik} (x_i - \mu_k^{(m)})^2}{\sum_i \gamma_{ik}}$, and $\pi_k^{(m)} = \frac{1}{N} \sum_i \gamma_{ik}$.
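Putting the steps together, here is a hedged end-to-end sketch of the EM loop for a 1-D mixture; the initialization and the fixed number of iterations are simplistic choices for illustration, not prescribed by the post:

```python
import numpy as np
from scipy.stats import norm

def fit_gmm(x, k=3, n_steps=100, seed=0):
    """Fit a 1-D Gaussian mixture with EM. x: array of N points."""
    rng = np.random.default_rng(seed)
    # Simple initialization: uniform weights, means drawn from the data, equal spreads
    pi = np.full(k, 1.0 / k)
    mu = rng.choice(x, size=k, replace=False)
    sigma = np.full(k, x.std())
    for _ in range(n_steps):
        # E-step: responsibilities gamma[i, k]
        weighted = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)
        gamma = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities
        nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(x)
    return pi, mu, sigma

# Example on synthetic 1-D data drawn from three Gaussians
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3, 1.0, 200),
                       rng.normal(0, 0.5, 150),
                       rng.normal(4, 1.5, 250)])
print(fit_gmm(data))
```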