PCA on projective space

Projective spaces are spaces of lines through the origin in some ambient space. For example, {\mathbf R}P^2 is the space of straight lines through the origin in {\mathbf R}^3. What is this space? Well, it is the space you get when you identify points in {\mathbf R}^3\setminus\{0\} under the action of the multiplicative group x \rightarrow \lambda x, \lambda\neq 0. In this particular case, it is the upper hemisphere of {\mathbf S}^2, with antipodal points on the boundary circle identified. All good stuff which Brock Fuller and Jack Conn taught me a long, long time ago.
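To make the identification concrete, here is a minimal sketch in Python with NumPy. The function names and the sign convention (last non-negligible coordinate positive, i.e. a point on the upper hemisphere) are my own assumptions for illustration. The distance used is the angle between the two lines, which lies in [0, \pi/2].

```python
import numpy as np

def line_representative(x, tol=1e-12):
    """Return a unit vector representing the line through the origin and x.

    Assumed convention: the sign is chosen so that the last coordinate that
    is not (numerically) zero is positive, so x and -x give the same result.
    """
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    if norm < tol:
        raise ValueError("the zero vector does not define a line")
    u = x / norm
    for c in u[::-1]:
        if abs(c) > tol:
            return u if c > 0 else -u
    return u

def projective_distance(x, y):
    """Angle between the lines spanned by x and y, in [0, pi/2]."""
    u, v = line_representative(x), line_representative(y)
    c = min(1.0, abs(float(u @ v)))  # guard against rounding above 1
    return float(np.arccos(c))
```

With this convention, x and \lambda x map to the same representative for any \lambda\neq 0, and the distance between a vector and its negative is zero.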

But now we’re dealing with trying to do something like PCA on projective space, so we want to ask: What is PCA in a linear space and why do we care? We’re trying to find the directions along which some cloud of points is most spread out. In particular, the rotation symmetry of this space is what underlies wanting to diagonalize the covariance matrix \langle (X_i - \langle X_i\rangle)(X_j - \langle X_j\rangle)\rangle. On a projective space, we still have lots of symmetry, and we should be using this symmetry to find something like a great circle along which the cloud of points in projective space is most spread out. The distance measure inherited from the underlying ambient space has the same rotation symmetry (except for a minor caveat about the identification of antipodal points, which affects the group by modding out a discrete Z_2).
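For reference, the linear case can be sketched in a few lines: form the covariance matrix above and diagonalize it. This is ordinary PCA, not the projective version, and the function name `linear_pca` is just a label I am using here.

```python
import numpy as np

def linear_pca(X):
    """Ordinary PCA: diagonalize the covariance matrix of the rows of X.

    Returns eigenvalues in descending order and the matching eigenvectors
    as columns.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # subtract <X_i>
    cov = Xc.T @ Xc / X.shape[0]          # <(X_i - <X_i>)(X_j - <X_j>)>
    evals, evecs = np.linalg.eigh(cov)    # eigh returns ascending order
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]
```

Rotating the data rotates the eigenvectors and leaves the eigenvalues unchanged, which is the rotation symmetry referred to above.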

It’s not obvious to me how to do this as some sort of simple matrix manipulation, but as a maximization problem, we want to find a rotation in the ambient space such that the points in the projective space have the largest possible range of \phi\in [0,\pi). (Remember the identification of x and -x.) The other directions then have to be determined by rotations that don’t change the inverse image of this half-circle in the ambient space, and so on. (It might be better to have some different orientation standard, but this is more a matter of notation and calculation.)

Each line is represented by a direction vector. A 2-plane through the origin in the original space defines a circle on the projective space. A 2-plane could be defined by two data points in the original space that maximize the projective distance between pairs of data points. Obviously, picking two specific data points is too susceptible to outliers, so we want something like two orthonormal vectors n_1, n_2 with two properties. First, the mean distance of the data points from the plane defined by these vectors (the angle between a data point and its projection on the plane) is minimized. Second, the standard deviation of the angles between the projections of the points in this plane is maximized. As a start, just implement the second cost, using the projector onto the plane:

n_{1j}n_{1k} + n_{2j}n_{2k}
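Here is a minimal sketch of both costs for a given orthonormal pair n_1, n_2, assuming the data points are nonzero row vectors. The expression above is the projector onto the plane. One choice is entirely my assumption: because the in-plane angle \phi is only defined modulo \pi, I measure its spread with a doubled-angle circular variance, 1 - |\langle e^{2i\phi}\rangle|, as a stand-in for the standard deviation. All function names are hypothetical.

```python
import numpy as np

def plane_projector(n1, n2):
    """P_jk = n_1j n_1k + n_2j n_2k for orthonormal n1, n2."""
    n1, n2 = np.asarray(n1, dtype=float), np.asarray(n2, dtype=float)
    return np.outer(n1, n1) + np.outer(n2, n2)

def plane_costs(X, n1, n2):
    """Return (mean off-plane angle, in-plane angular spread).

    The spread is a doubled-angle circular variance in [0, 1]; this is an
    assumed substitute for the standard deviation of angles modulo pi.
    """
    X = np.asarray(X, dtype=float)
    n1, n2 = np.asarray(n1, dtype=float), np.asarray(n2, dtype=float)
    U = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit representatives
    P = plane_projector(n1, n2)
    proj = U @ P                                       # rows are P u (P is symmetric)
    # For a unit vector u, the cosine of the angle to its projection is |P u|.
    cos_off = np.clip(np.linalg.norm(proj, axis=1), 0.0, 1.0)
    off_plane = np.arccos(cos_off)
    phi = np.arctan2(U @ n2, U @ n1)                   # in-plane angle
    spread = 1.0 - np.abs(np.mean(np.exp(2j * phi)))   # unchanged under u -> -u
    return float(off_plane.mean()), float(spread)
```

Both returned values are unchanged when any data point is replaced by its negative, so they are well defined on the projective space. Optimizing over n_1, n_2 is not attempted here.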

Of course, there are other sets of charts on the projective space, but the metric distances are not so trivially calculated.

It is interesting that this projective PCA would reduce the dynamic range. Is this good? There are instances in image processing where I imagine that the important thing is not the absolute intensity but the relative intensity. I wonder if one would get different results from the Pearson correlation. The Pearson correlation subtracts the mean and divides by the standard deviation before computing the cosine of the angle between the two vectors. The projective distance is scale-independent but, unlike the Pearson correlation, still depends on the means of the two vectors.

PCA can be generalized in many ways. There are nonlinear manifold PCA methods, neural network PCA methods and so on.
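The comparison with the Pearson correlation can be made concrete with a small sketch. The function names are mine. The only difference between the two quantities is whether the mean is subtracted first, plus the absolute value that accounts for identifying x and -x.

```python
import numpy as np

def abs_cosine(x, y):
    """|cos| of the angle between x and y; the projective distance is its arccos."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return abs(float(x @ y)) / (np.linalg.norm(x) * np.linalg.norm(y))

def pearson(x, y):
    """Pearson correlation: cosine of the angle between mean-centered vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

x = np.array([1.0, 2.0, 3.0, 4.0])
scaled = 3.0 * x      # same line through the origin
shifted = x + 10.0    # same shape, different mean

print(abs_cosine(x, scaled), pearson(x, scaled))
print(abs_cosine(x, shifted), pearson(x, shifted))
```

Rescaling leaves both measures at 1, while adding a constant offset leaves the Pearson correlation at 1 but moves the vector to a different line, so the absolute cosine drops below 1.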