To update the experts, we hold the gates fixed and merely take
derivatives with respect to the expert parameters (the mean and
covariance of each conditioned Gaussian, or call them $\Theta_m$
collectively for expert $m$). This derivative is simplified and set
equal to zero as in Equation 7.17. This is our
update equation for the experts. Note how each expert has been
decoupled from the other experts, from the gates and from other
distracting terms. The maximization reduces to the derivative of the
logarithm of a single conditioned Gaussian. This can be done
analytically with some matrix differential theory and, in fact, the
computation resembles a conditioned Gaussian maximum-likelihood
estimate in Equation 7.18 [37]. The
responsibilities for the assignments are the standard normalized joint
responsibilities (as in EM). The update equation is effectively the
maximum conditional likelihood solution for a conditioned multivariate
Gaussian distribution.
$$
\frac{\partial}{\partial \Theta_m} \sum_{i=1}^{N} \hat{h}_{im} \log p(y_i \mid x_i, \Theta_m) \;=\; 0
\eqno{(7.17)}
$$
$$
\Lambda_m = \Bigl( \sum_{i=1}^{N} \hat{h}_{im}\, \tilde{y}_i \tilde{x}_i^T \Bigr)
            \Bigl( \sum_{i=1}^{N} \hat{h}_{im}\, \tilde{x}_i \tilde{x}_i^T \Bigr)^{-1},
\qquad
\Sigma_m = \frac{\sum_{i=1}^{N} \hat{h}_{im}\, (\tilde{y}_i - \Lambda_m \tilde{x}_i)
                 (\tilde{y}_i - \Lambda_m \tilde{x}_i)^T}
                {\sum_{i=1}^{N} \hat{h}_{im}}
\eqno{(7.18)}
$$
where $\tilde{x}_i = x_i - \bar{x}_m$ and $\tilde{y}_i = y_i - \bar{y}_m$ denote
the data centered on the responsibility-weighted means, so that each expert is
$p(y \mid x, \Theta_m) = {\cal N}(y;\, \bar{y}_m + \Lambda_m (x - \bar{x}_m),\, \Sigma_m)$.
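As a concrete illustration, here is a minimal NumPy sketch of this
responsibility-weighted estimate. The function name update_expert, the
array shapes and the returned parameterization ($\Lambda_m$, $\Sigma_m$)
are our own choices for exposition, not notation from the text.
\begin{verbatim}
import numpy as np

def update_expert(X, Y, h):
    # Responsibility-weighted maximum conditional likelihood update for
    # one conditioned-Gaussian expert (a sketch of Equation 7.18).
    #   X : (N, dx) inputs, Y : (N, dy) outputs,
    #   h : (N,) responsibilities for this expert (illustrative names).
    w = h / h.sum()                    # normalized weights
    x_bar, y_bar = w @ X, w @ Y        # responsibility-weighted means
    Xc, Yc = X - x_bar, Y - y_bar      # centered data
    Sxx = Xc.T @ (w[:, None] * Xc)     # weighted input covariance
    Syx = Yc.T @ (w[:, None] * Xc)     # weighted cross-covariance
    Lam = Syx @ np.linalg.inv(Sxx)     # regression matrix Lambda_m
    R = Yc - Xc @ Lam.T                # conditional residuals
    Sigma = R.T @ (w[:, None] * R)     # conditional covariance Sigma_m
    # Expert model: y | x ~ N(y_bar + Lam (x - x_bar), Sigma); Sxx is
    # kept so the joint responsibilities can be computed later.
    return x_bar, y_bar, Lam, Sigma, Sxx
\end{verbatim}
Each expert is fit independently given its responsibilities, which is
exactly the decoupling noted above.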
We observe monotonic increases in conditional likelihood when the above
maximization is iterated with the bounding step (i.e. estimation of the
responsibilities $\hat{h}_{im}$). This result is not too surprising
since we are effectively applying a version of EM to update the
experts. At this point, we turn our attention to the still unoptimized
gates and mixing proportions.
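The bounding step that pairs with this maximization can be sketched the
same way. As stated above, the responsibilities are the normalized
joint responsibilities as in EM; one possible factorization of the
joint density is $\alpha_m\, p(x \mid m)\, p(y \mid x, m)$. The names
bounding_step and alpha are again hypothetical, and the update of the
gates and mixing proportions is deferred to the next section.
\begin{verbatim}
import numpy as np
from scipy.stats import multivariate_normal as mvn

def bounding_step(X, Y, experts, alpha):
    # Normalized joint responsibilities h_im (as in EM), assuming each
    # expert is the tuple returned by update_expert above
    # (illustrative names; alpha holds fixed mixing proportions).
    cols = []
    for (x_bar, y_bar, Lam, Sigma, Sxx), a in zip(experts, alpha):
        px = mvn.pdf(X, mean=x_bar, cov=Sxx)       # marginal p(x | m)
        mu = y_bar + (X - x_bar) @ Lam.T           # conditional means
        py = np.array([mvn.pdf(y, mean=m, cov=Sigma)
                       for y, m in zip(Y, mu)])    # p(y | x, m)
        cols.append(a * px * py)                   # alpha_m p(x, y | m)
    H = np.column_stack(cols)
    return H / H.sum(axis=1, keepdims=True)        # normalize over experts

# Iterating the two steps yields the monotonic increases noted above:
#   for _ in range(n_iters):
#       H = bounding_step(X, Y, experts, alpha)
#       experts = [update_expert(X, Y, H[:, m])
#                  for m in range(len(experts))]
\end{verbatim}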