7.4 Example: Covariance Selection

This example illustrates the use of the routines for sparse Cholesky factorization. We consider the problem
\begin{displaymath}
\begin{array}{ll}
\mbox{minimize} & -\log\det K + \mathop{...
...box{subject to} & K_{ij}=0,\quad (i,j) \not \in S.
\end{array}\end{displaymath} (7.5)

The optimization variable is a symmetric matrix K of order n and the domain of the problem is the set of positive definite matrices. The matrix $Y$ and the index set $S$ are given. We assume that all the diagonal positions are included in $S$. This problem arises in maximum likelihood estimation of the covariance matrix of a zero-mean normal distribution, with constraints that specify that pairs of variables are conditionally independent.

We can express K as

\begin{displaymath}
K(x) = E_1\mbox{\bf diag}\,(x)E_2^T+E_2\mbox{\bf diag}\,(x)E_1^T
\end{displaymath}

where x are the nonzero elements in the lower triangular part of K, with the diagonal elements scaled by 1/2, and

\begin{displaymath}
E_1 = \left[ \begin{array}{cccc}
e_{i_1} & e_{i_2} & \cdot...
...cc}
e_{j_1} & e_{j_2} & \cdots & e_{j_q} \end{array}\right],
\end{displaymath}

where (i_k, j_k) are the positions of the nonzero entries in the lower-triangular part of K. With this notation, we can solve problem (7.5) by solving the unconstrained problem

\begin{displaymath}
\begin{array}{ll}
\mbox{minimize} & f(x) = -\log\det K(x) + \mathop{\bf tr}(K(x)Y).
\end{array}\end{displaymath}

The code below implements Newton's method with a backtracking line search. The gradient and Hessian of the objective function are given by

\begin{eqnarray*}
\nabla f(x)
&=& 2 \mbox{\bf diag}\,( E_1^T (Y - K(x)^{-1}) E_...
...\left(K(x)^{-1}\right)_{IJ} \circ
\left(K(x)^{-1}\right)_{JI},
\end{eqnarray*}


where o denotes Hadamard product.

from cvxopt.base import matrix, spmatrix, log, mul
from cvxopt import blas, lapack, amd, cholmod

def covsel(Y):
    """
    Returns the solution of

         minimize    -logdet K + Tr(KY)
         subject to  K_{ij}=0,  (i,j) not in indices listed in I,J.

    Y is a symmetric sparse matrix with nonzero diagonal elements.
    I = Y.I,  J = Y.J.
    """

    I, J = Y.I, Y.J
    n, m = Y.size[0], len(I) 
    N = I + J*n         # non-zero positions for one-argument indexing 
    D = [k for k in xrange(m) if I[k]==J[k]]  # position of diagonal elements

    # starting point: symmetric identity with nonzero pattern I,J
    K = spmatrix(0, I, J) 
    K[::n+1] = 1

    # Kn is used in the line search
    Kn = spmatrix(0, I, J)

    # symbolic factorization of K 
    F = cholmod.symbolic(K)

    # Kinv will be the inverse of K
    Kinv = matrix(0.0, (n,n))
    
    for iters in xrange(100):

        # numeric factorization of K
        cholmod.numeric(K, F)
        d = cholmod.diag(F)

        # compute Kinv by solving K*X = I 
        Kinv[:] = 0
        Kinv[::n+1] = 1
        cholmod.solve(F, Kinv)

        # solve Newton system
        grad = 2*(Y.V - Kinv[N])
        hess = 2*(mul(Kinv[I,J],Kinv[J,I]) + mul(Kinv[I,I],Kinv[J,J]))
        v = -grad
        lapack.posv(hess,v) 
        
        # stopping criterion
        sqntdecr = -blas.dot(grad,v) 
        print "Newton decrement squared:%- 7.5e" %sqntdecr
        if (sqntdecr < 1e-12):
            print "number of iterations: ", iters+1 
            break

        # line search
        dx = +v
        dx[D] *= 2      # scale the diagonal elems        
        f = -2.0 * sum(log(d))    # f = -log det K
        s = 1
        for lsiter in xrange(50):
            Kn.V = K.V + s*dx
            try: 
                cholmod.numeric(Kn, F)
            except ArithmeticError:
                s *= 0.5
            else:
                d = cholmod.diag(F)
                fn = -2.0 * sum(log(d)) + 2*s*blas.dot(v,Y.V)
                if (fn < f - 0.01*s*sqntdecr): 
                     break
                s *= 0.5
            
        K.V = Kn.V

    return K