by Andrea Giammanco

In my previous post I drove you from the safe land of The Truth to the so called “smeared space” (i.e., what is actually observable), where things are murky and brown, and then back to the “unfolded space”, which resembles your point of departure, but with some noise due to the amount of information that was lost in the process.

I started with a two-bin measurement that only requires inverting a 2×2 matrix. The most advanced mathematics involved, in that case, is this cute formula:

$\displaystyle \left( \begin{array}{cc} a & b \\ c & d \end{array}\right)^{-1} = \frac{1}{ad-bc}\left( \begin{array}{cc} d & -b \\ -c & a \end{array}\right)$

When the determinant is zero, the matrix is not invertible and it is called “singular”. An example of a matrix that you cannot invert is:

$\displaystyle A\times M = \left( \begin{array}{cc} 0.5 & 0.5 \\ 0.5 & 0.5 \end{array}\right)$

In practice, I doubt that you will often find yourself needing to invert a matrix like that in an unfolding problem. If you think a bit like a physicist before embarking blindly on the arithmetic, you will notice that this is a very sick matrix. In fact, this matrix means that 50% of the truly “forward” events are actually observed in the “backward” direction, and vice versa. If your detector or your reconstruction algorithm induces such an extreme dilution of the information, you should definitely abstain from unfolding.
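If you want a second opinion from your computer, a few lines of Python confirm the diagnosis (NumPy is my arbitrary choice here; any linear-algebra library will do the same):

```python
import numpy as np

# The "very sick" matrix: 50% of forward events observed as backward
# and vice versa. Its determinant is 0.5*0.5 - 0.5*0.5 = 0.
M = np.array([[0.5, 0.5],
              [0.5, 0.5]])

d = np.linalg.det(M)
print(d)  # 0.0: singular

try:
    np.linalg.inv(M)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False
print(invertible)  # False: NumPy refuses to invert it
```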

There are other sick cases that make the matrix singular, but it is a fair bet that you will not meet them often.

And what happens if the matrix is almost singular, i.e., the determinant is close to zero but not exactly zero? The inversion can be done, but the closer to zero it is, the less reliable the result gets.

The inverse of the nicely behaving $\left( \begin{array}{cc} 0.8 & 0.2 \\ 0.2 & 0.8 \end{array}\right)$ is $\left( \begin{array}{cc} 1.33 & -0.33 \\ -0.33 & 1.33 \end{array}\right)$.

The inverse of $\left( \begin{array}{cc} 0.6 & 0.4 \\ 0.4 & 0.6 \end{array}\right)$ is $\left( \begin{array}{cc} 3 & -2 \\ -2 & 3 \end{array}\right)$.

The inverse of $\left( \begin{array}{cc} 0.51 & 0.49 \\ 0.49 & 0.51 \end{array}\right)$ is $\left( \begin{array}{cc} 25.5 & -24.5 \\ -24.5 & 25.5 \end{array}\right)$.

Do you see the trend? The elements of the inverse matrix become larger and larger as the matrix approaches singularity. This is because the determinant $ad-bc$, which sits in the denominator of the inversion formula, gets smaller and smaller. So when you apply this inverse matrix to your vector of observed data (in the “smeared space”) you are applying a larger and larger correction to reach the “unfolded space” that is supposed to represent The Truth. And the larger the extrapolation, the more it is affected by tiny perturbations (e.g., the statistical noise).
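You can reproduce the trend numerically with a small NumPy sketch (the matrices are the same three as above):

```python
import numpy as np

# The three 2x2 migration matrices from the text, with the
# off-diagonal "migration" probability creeping towards 50%.
for p in (0.8, 0.6, 0.51):
    M = np.array([[p, 1 - p],
                  [1 - p, p]])
    det = np.linalg.det(M)
    inv = np.linalg.inv(M)
    print(f"det = {det:.4f}, largest |element| of the inverse = {np.abs(inv).max():.2f}")
```

You should see the determinant shrink from 0.60 to 0.20 to 0.02, while the largest element of the inverse grows from 1.33 to 3.00 to 25.50.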

This is the crucial point of the problem.

But this was just an unrealistic example: with 2×2 matrices one gets into such trouble only if the resolution is so poor that, most likely, no measurement would have been attempted in the first place.

Now, let’s go to larger numbers of bins. Let’s say that you are measuring a spectrum that extends over a fairly large range, and you have plenty of data so you can afford to discretize the spectrum into a large number of bins, and all bins are decently populated. The more bins you have, the more features of the spectrum you can study and compare to theory or other experiments, so go for it.

Start of side remark.
Beware: a common refrain says that, ideally, you should not unfold data to compare to theory (in the “unfolded space”), but fold the theory to be comparable to data (in the “smeared space”). This is excellent advice and I recommend following it whenever feasible; the only problem is that there are plenty of cases where it is unfeasible, or where you have a good reason to unfold anyway (e.g., comparing with a different experiment, whose “folding” is necessarily very different). This will be the subject of a future post in this series.
End of side remark.

Remember the 2×2 migration matrix of my previous post (and for simplicity let’s just assume that the A matrix is the identity matrix, i.e., that it has no practical influence on the calculations):

$\displaystyle M = \left( \begin{array}{cc} 0.8 & 0.2 \\ 0.2 & 0.8 \end{array}\right)$

Let’s make it part of a larger matrix:

$\displaystyle M = \left( \begin{array}{ccccc} 0.80 & 0.20 & 0 & 0 & 0\\ 0.20 & 0.80 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ \end{array}\right)$

Inversion, unsurprisingly, is ok.

Now, let’s twist things a little bit:

$\displaystyle M = \left( \begin{array}{ccccc} 0.80 & 0.20 & 0 & 0 & 0\\ 0.20 & 0.80 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0.5 & 0.5\\ 0 & 0 & 0 & 0.5 & 0.5\\ \end{array}\right)$

Before doing the calculation, you already notice at a glance that one of its sub-matrices (the bottom-right corner) is singular, and this dooms the overall matrix to have a null determinant, hence to be singular itself.

And this is also true if you swap some columns or rows, e.g.:

$\displaystyle M = \left( \begin{array}{ccccc} 0.80 & 0.20 & 0 & 0 & 0\\ 0.20 & 0.80 & 0 & 0 & 0\\ 0 & 0 & 0.5 & 0 & 0.5\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0.5 & 0 & 0.5\\ \end{array}\right)$

Singularity can sneak into your matrix in many cunning ways.
Also this one is singular:

$\displaystyle M = \left( \begin{array}{ccccc} 0.80 & 0.20 & 0 & 0 & 0\\ 0.20 & 0.80 & 0 & 0 & 0\\ 0 & 0 & 0.3 & 0.1 & 0.4\\ 0 & 0 & 0.1 & 0.3 & 0.4\\ 0 & 0 & 0.4 & 0.4 & 0.8\\ \end{array}\right)$

Would you have noticed at first glance? I constructed it to be singular simply by ensuring that the sum of the third and fourth columns (and of the third and fourth rows) equals the fifth column (the fifth row). To notice that, you would have had to perform some check. In fact, this one is singular too:

$\displaystyle M = \left( \begin{array}{ccccc} 0.80 & 0.20 & 0 & 0 & 0\\ 0.20 & 0.80 & 0 & 0 & 0\\ 0 & 0 & 0.3 & 0.2 & 0.4\\ 0 & 0 & 0.1 & 0.6 & 0.4\\ 0 & 0 & 0.4 & 0.8 & 0.8\\ \end{array}\right),$

because the last three columns are described by a linear relationship (three objects x, y, z are in a linear relationship if you can write x=ay+bz).
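Checks of this kind are exactly what a computer is for. Here is a small NumPy sketch, using the last matrix above, that exposes the hidden dependence via the rank:

```python
import numpy as np

# The last 5x5 matrix above: its fifth column equals the third column
# plus half the fourth (0.4 = 0.3 + 0.5*0.2, etc.), so the columns
# are linearly dependent and the matrix is singular.
M = np.array([
    [0.80, 0.20, 0.0, 0.0, 0.0],
    [0.20, 0.80, 0.0, 0.0, 0.0],
    [0.0,  0.0,  0.3, 0.2, 0.4],
    [0.0,  0.0,  0.1, 0.6, 0.4],
    [0.0,  0.0,  0.4, 0.8, 0.8],
])

rank = np.linalg.matrix_rank(M)
d = np.linalg.det(M)
print(rank)            # 4: one short of full rank
print(abs(d) < 1e-12)  # True: the determinant is numerically zero
```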

All this is about singularity and exact invertibility; but, as I stressed above for the simple 2×2 matrix, numerically you are in trouble whenever you get close to singularity.

The larger you make your matrix, the more opportunities there are for some sub-matrix to become near-singular by pure chance, also considering columns that are not even contiguous to each other.
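A practical diagnostic for “how close to singular” is the condition number, which (roughly speaking) tells you by how much relative errors in the data can be amplified when you invert the matrix. For the 2×2 examples above:

```python
import numpy as np

# Condition number: the ratio of the largest to the smallest singular
# value of the matrix; it bounds the amplification of relative errors.
well  = np.array([[0.80, 0.20], [0.20, 0.80]])
badly = np.array([[0.51, 0.49], [0.49, 0.51]])

c_well = np.linalg.cond(well)
c_bad = np.linalg.cond(badly)
print(round(c_well, 3))  # 1.667: harmless
print(round(c_bad, 3))   # 50.0: a 1% fluctuation in the data can become 50%
```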

A figure from this article by Glenn Cowan illustrates an example of a seemingly innocuous problem that just blows up in your face:

The left plot shows The Truth, the middle plot the smeared histogram (solid line: before stochastic noise; dashed line: with stochastic noise), and the rightmost plot the unfolded histogram. Notice the vertical scale, and the wild oscillations. Remember that the histogram on the right is supposed to be an estimate of the histogram on the left. What is happening is that the stochastic noise has been amplified by the large elements of the inverted matrix.
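If you want to reproduce this kind of disaster on your own laptop, here is a toy sketch in the same spirit (my own made-up 5-bin spectrum and migration matrix, not Cowan’s actual example): smear a smooth truth, add Poisson noise, invert.

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth 5-bin truth spectrum, and a migration matrix in which 30%
# of the content of each bin leaks into each neighbouring bin.
truth = np.array([100., 300., 500., 300., 100.])

n = len(truth)
M = np.zeros((n, n))
for i in range(n):
    M[i, i] = 0.4 if 0 < i < n - 1 else 0.7
    if i > 0:
        M[i, i - 1] = 0.3
    if i < n - 1:
        M[i, i + 1] = 0.3

smeared = M @ truth              # the exact smeared spectrum
observed = rng.poisson(smeared)  # what you actually record, with noise
unfolded = np.linalg.solve(M, observed)

print("truth:   ", truth)
print("unfolded:", np.round(unfolded, 1))
# Without noise the inversion recovers the truth exactly; with noise,
# the unfolded bins can deviate from the truth by far more than the
# Poisson fluctuations themselves, oscillating around it.
```

Re-run it with different seeds to appreciate how unstable the unfolded spectrum is from one statistical fluctuation to the next.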

Now, if I have managed to make you anxious about near-singularities hidden in your migration matrices, don’t worry: in the third episode of this series I will give you some hints.