Quickstart

Pymanopt is a modular toolbox and hence easy to use. All of the automatic differentiation is done behind the scenes so that the amount of setup the user needs to do is minimal. Usually only the following steps are required:

Instantiate a manifold \(\manM\) from the pymanopt.manifolds package to optimize over.
Define a cost function \(f:\manM \to \R\) to minimize using one of the backend decorators defined in pymanopt.function.
Create a pymanopt.Problem instance tying the optimization problem together.
Instantiate a Pymanopt optimizer from pymanopt.optimizers and run it on the problem instance.

Installation

Pymanopt is compatible with Python 3.8+, and depends on NumPy and SciPy. Additionally, to use Pymanopt’s built-in automatic differentiation, which we strongly recommend, you need to setup your cost functions using either Autograd, JAX <https://jax.readthedocs.io/en/latest/>_, TensorFlow or PyTorch. If you are unfamiliar with these packages and you are unsure which to go for, we suggest to start with Autograd. Autograd wraps thinly around NumPy, and is very simple to use, particularly if you’re already familiar with NumPy. To get the latest version of Pymanopt, install it via pip by specifying the backend(s) you want to use among autograd, jax, torch and tensorflow, separated by a comma. For example:

$ pip install "pymanopt[autograd]" # or [jax], [tensorflow], [torch]

A Simple Example

As a simple illustrative example, we consider the problem of estimating the dominant eigenvector of a real symmetric matrix \(\vmA \in \R^{n \times n}\). As is well known, a dominant eigenvector of a matrix \(\vmA\) is any vector \(\opt{\vmx}\) that maximizes the Rayleigh quotient

\[\begin{align*} f(\vmx) &= \frac{\inner{\vmx}{\vmA\vmx}}{\inner{\vmx}{\vmx}} \end{align*}\]

with \(\inner{\cdot}{\cdot}\) denoting the canonical inner product on \(\R^n\). The value of \(f\) at \(\opt{\vmx}\) coincides with the largest eigenvalue of \(\vmA\). Clearly \(f\) is scale-invariant since \(f(\vmx) = f(\alpha\vmx)\) for any \(\alpha \neq 0\). Hence one may reframe the dominant eigenvector problem as the minimization problem

\[\begin{align*} \opt{\vmx} = \argmin_{\vmx \in \sphere^{n-1}}\inner{-\vmx}{\vmA\vmx} \end{align*}\]

with \(\sphere^{n-1}\) denoting the set of all unit-norm vectors in \(\R^n\): the sphere manifold of dimension \(n-1\).

The following is a minimal working example of how to solve the above problem using Pymanopt for a random symmetric matrix. As indicated in the introduction above, we follow four simple steps: we instantiate the manifold, create the cost function (using Autograd in this case), define a problem instance which we pass the manifold and the cost function, and run the minimization problem using one of the available optimizers.

import autograd.numpy as anp
import pymanopt

anp.random.seed(42)

dim = 3
manifold = pymanopt.manifolds.Sphere(dim)

matrix = anp.random.normal(size=(dim, dim))
matrix = 0.5 * (matrix + matrix.T)

@pymanopt.function.autograd(manifold)
def cost(point):
    return -point @ matrix @ point

problem = pymanopt.Problem(manifold, cost)

optimizer = pymanopt.optimizers.SteepestDescent()
result = optimizer.run(problem)

eigenvalues, eigenvectors = anp.linalg.eig(matrix)
dominant_eigenvector = eigenvectors[:, eigenvalues.argmax()]

print("Dominant eigenvector:", dominant_eigenvector)
print("Pymanopt solution:", result.point)

Running this example will produce (something like) the following:

Optimizing...
Iteration    Cost                       Gradient norm
---------    -----------------------    --------------
       +1.1041943339110254e+00    5.65626470e-01
       +5.2849633289004561e-01    8.90742722e-01
       -8.0741058657312559e-01    2.23937710e+00
       -1.2667369971251594e+00    1.59671326e+00
       -1.4100298597091836e+00    1.11228845e+00
       -1.5219408277812505e+00    2.45507203e-01
       -1.5269956262562046e+00    6.81712914e-02
       -1.5273114803528709e+00    3.40941735e-02
       -1.5273905588875487e+00    1.70222768e-02
       -1.5274100956128560e+00    8.61140952e-03
       -1.5274154319869837e+00    3.90706914e-03
       -1.5274156215853507e+00    3.62943721e-03
       -1.5274162595152783e+00    2.47643452e-03
       -1.5274168030609154e+00    3.66398414e-04
       -1.5274168133149475e+00    1.45210081e-04
       -1.5274168150025758e+00    4.96142583e-05
       -1.5274168150483476e+00    4.42317042e-05
       -1.5274168151841643e+00    2.13915041e-05
       -1.5274168152087644e+00    1.36422863e-05
       -1.5274168152220804e+00    6.25780214e-06
       -1.5274168152229037e+00    5.48381052e-06
       -1.5274168152252021e+00    2.16996083e-06
       -1.5274168152255774e+00    7.52279600e-07
Terminated - min grad norm reached after 23 iterations, 0.01 seconds.

Dominant eigenvector: [-0.78442334 -0.38225031 -0.48843088]
Pymanopt solution: [0.78442327 0.38225034 0.48843097]

Note that the direction of the “true” dominant eigenvector and the solution found by Pymanopt differ. This is not exactly surprising though. Eigenvectors are not unique since every eigenpair \((\lambda, \vmv)\) still satisfies the eigenvalue equation \(\vmA \vmv = \lambda \vmv\) if \(\vmv\) is replaced by \(\alpha \vmv\) for some \(\alpha \in \R \setminus \set{0}\). That is, the dominant eigenvector is only unique up to multiplication by a nonzero constant; the zero vector is trivially considered not an eigenvector.

The example above constitutes the conceivably simplest demonstration of Pymanopt. For more interesting examples we refer to the examples in Pymanopt’s github repository. Moreover, this notebook demonstrates a more involved application of Riemannian optimization using Pymanopt in the context of inference in Gaussian mixture models.