by Andrea Giammanco

This is the end of my unfolding series, whose previous episodes can be found here, here, and here.

This post discusses when we should apply unfolding to our data, and when not.

There is an interesting thing about the unfolding community: when you ask an expert for practical advice, the expert typically starts by asking why you think you need unfolding, and will in general discourage you from doing it.

Some time ago, a mini-workshop on unfolding techniques was organized by a group inside CMS. I was quite interested in attending, because my recent research interests in the top quark realm dragged me (kicking and screaming) into the unfolding business. Thanks to the wealth of data delivered by the LHC, top quark physics has recently entered the statistical regime where it becomes appealing to perform differential measurements (of cross sections or of other quantities, like asymmetries, as a function of kinematic properties), while other communities, e.g. soft-QCD experts, have already been measuring things differentially for generations.

The first speaker presented the key recommendations by the CMS Statistics Committee (a board of senior scientists with competence on various statistical problems, whose mandate is to check the statistical soundness of all our analysis procedures and give advice when any analyst requests it).
Recommendation number one: “We recommend to avoid unfolding when it is not deemed compulsory”.

The second talk was entirely devoted to a list of conceptual and practical issues with unfolding. The speaker, who has authored several papers on unfolding techniques, strongly discouraged unfolding and gave suggestions on how to avoid it altogether (e.g., he remarked that if you have a parameterized theoretical model you can simply fit its parameters to the data in the smeared space.)
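As a toy illustration of that suggestion, one can forward-fold a one-parameter model through a response matrix and fit the parameter directly in the smeared space, with no unfolding at all. Every number, bin choice, and matrix below is invented for the sake of the sketch; a real analysis would take the response matrix from detector simulation:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy response matrix: columns are true bins, rows are smeared bins
# (each column sums to 1, i.e. no acceptance losses in this sketch).
R = np.array([[0.7, 0.2, 0.0, 0.0],
              [0.3, 0.6, 0.2, 0.0],
              [0.0, 0.2, 0.6, 0.3],
              [0.0, 0.0, 0.2, 0.7]])

def true_spectrum(theta):
    """One-parameter model in the true space: a slope on a flat baseline."""
    x = np.array([-0.75, -0.25, 0.25, 0.75])  # bin centres
    return 1000.0 * (1.0 + theta * x) / 4.0

def predicted_smeared(theta):
    # Forward-fold the model: this is what the detector would see.
    return R @ true_spectrum(theta)

# Pretend these are the observed smeared-space counts (injected theta = 0.4):
rng = np.random.default_rng(42)
data = rng.poisson(predicted_smeared(0.4))

def chi2(theta):
    pred = predicted_smeared(theta)
    return np.sum((data - pred) ** 2 / pred)

# Least-squares fit of the parameter, entirely in the smeared space:
fit = minimize_scalar(chi2, bounds=(-1, 1), method="bounded")
print(f"fitted theta = {fit.x:.3f}")  # should land near the injected 0.4
```

The point of the recipe is that the model, not the data, gets transported across the detector response, so no inversion (and no regularization) is ever needed.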

The third talk, before giving several practical recipes (how to choose the optimal regularization criterion, how to treat systematic uncertainties, etc.), provided further examples where unfolding is a very bad idea.

Then came the fourth speaker, who has long experience with unfolding in various experiments. And – guess what? – his first piece of advice on unfolding was: “DO NOT DO IT” (in capital letters.) But then he presented and demonstrated a nice idea for “partial unfolding” (i.e., identifying only the degrees of freedom that our data can actually constrain, unfolding only those, and fixing the others to the model.)

The internal recommendation wikis follow the same approach: they start by discouraging the reader from unfolding, then warn of its many pitfalls, and finally give practical recipes for doing it.
I have never witnessed this attitude anywhere else, so I am not sure of the right metaphor to use. Maybe, before recruiting for the Crusades, medieval preachers started by reminding their audience that thou shalt not kill, before elaborating on how to slaughter the infidels?

So when should you abstain from unfolding?

When you want to search for new physics (or, more generally, be sensitive to unexpected features in the data.)
The reason was already explained in my previous post. In short: any conceivable unfolding method necessarily biases the result towards the initial model. Moreover, the sensitivity to deviations between data and expectation is lower in the unfolded space than in the smeared space, for two reasons. First, binning effects: to keep the off-diagonal terms of the migration matrix small (they are the source of all unfolding problems), the bin width cannot be much smaller than the resolution. Second, any attempt to properly cover the bias with a proper uncertainty further reduces the sensitivity to the unexpected.

When you want to extract a parameter of the theory.
A recent example from the CMS top quark group makes the point nicely: this analysis extracted the top quark pole mass from a least-squares fit to a variable suggested by a theory paper, twice: once by fitting the smeared-space distribution, and once the unfolded-space distribution. The first is significantly more precise, for the same reasons as above.
(On the other hand, unfolding was not a pointless exercise in this case: although the procedure diluted the information on the parameter to which that variable is sensitive, the differential cross section as a function of that variable is interesting per se.)

Above: The $\rho_S$ variable in the smeared space (left) and in the unfolded space (right), from CMS-PAS-TOP-13-006.

Another example is this analysis, to which I personally contributed (although I did not put my hands in the unfolding machinery myself.)
Yes, I sinned: this analysis ends with the extraction of a parameter from an unfolded distribution.
The parameter is the forward-backward asymmetry (in an appropriate rest frame) of muons from the decay of top quarks produced singly by a charged-current process mediated by the weak interaction (see the Feynman diagram on the left, showing the leading diagram for single top quark production by weak interaction).

This quantity can in principle range anywhere between -1/2 and +1/2, and under some assumptions it corresponds to exactly half of the degree of polarization of the top quark. The Standard Model predicts almost 100% polarization for top quarks produced this way, because the charged-current weak interaction couples only to left-handed fermions and is blind to right-handed ones. (The opposite for anti-fermions.)
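The factor of one half can be seen from the leading-order angular distribution of the lepton in the top quark rest frame, which is linear in $\cos\theta$; the main assumption here is a spin-analysing power of exactly 1 for the charged lepton (a good approximation in the Standard Model at leading order):

```latex
\frac{1}{\Gamma}\frac{d\Gamma}{d\cos\theta} = \frac{1}{2}\left(1 + P\cos\theta\right),
\qquad
A_{FB} = \frac{N(\cos\theta>0) - N(\cos\theta<0)}{N(\cos\theta>0) + N(\cos\theta<0)}
       = \int_0^1 \frac{1+Pc}{2}\,dc - \int_{-1}^0 \frac{1+Pc}{2}\,dc
       = \frac{P}{2}.
```

Since the polarization $P$ is bounded by $|P| \le 1$, the asymmetry is bounded by $|A_{FB}| \le 1/2$, as stated above.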

This property had been actively exploited in previous studies of single top quark production at the Tevatron and the LHC, for example as an input to multivariate techniques, but this was always pointed out as a conceptual weakness of those measurements, as it biased the measured cross section towards the Standard Model assumptions. That’s why my own research program in the early years of LHC running included the measurement of the inclusive cross section of this process using a kinematic property that does not correlate significantly with polarization, and the first measurement of the differential cross section as a function of a variable that maximally correlates with polarization (i.e., the paper linked above.)

You can see the smeared- and unfolded-space distributions below:

Above: angular variable related to single top polarization in the smeared space (left) and in the unfolded space (right), from JHEP 1604 (2016) 073.

The normalized differential cross section (right panel in this figure) was not as sensitive to the asymmetry parameter as a simple template fit to the smeared data would have been, in particular because of the coarse binning needed to make the unfolding behave nicely. But a template fit would have required assuming some model (e.g., a linear relationship between production rate and this variable, as in the Standard Model but with a free slope), while here, in the first measurement ever of this distribution, we provide much more: we show for the first time (instead of assuming) that the relationship is indeed linear.

(To be fair, no theory model predicts anything other than linear. The Standard Model tells you that it is linear and also tells you the slope, while many hypothetical New Physics processes that could give the same final state would feature a complete lack of polarization at the production vertex, and therefore a null slope: just a flat dependence. A significant deviation from the SM slope would hint at a possible admixture of the weak-interaction production with some of those hypothetical mechanisms. But what if everybody is wrong and there is, for example, a concavity in that distribution?)

Incidentally: as a cross check we also extracted the same asymmetry by a simple $2\times 2$ matrix inversion.
What was done in practice was literally what I described in the simple example of episode 1. No regularization is needed with two bins, and it was performed analytically, which I found very refreshing as I live in a world dominated by numerical methods.
Interestingly, it turned out to yield a less precise determination of the forward-backward asymmetry.

The loss of statistical power comes from the smaller “lever arm” of a measurement with two bins compared to several bins. But it must also be noted that the bias towards our expectation gets larger: the migration probability is integrated over the (expected) underlying distribution within each bin, and is therefore highly sensitive to the model (hence the larger systematic uncertainties, estimated by varying the model parameters).
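For concreteness, here is what the two-bin exercise looks like when written out; the migration matrix and the yields below are invented for illustration (the real ones came from the simulation used in the analysis):

```python
import numpy as np

# Toy 2x2 migration matrix: M[i][j] = probability for an event generated
# in true bin j to be reconstructed in smeared bin i (invented numbers).
M = np.array([[0.8, 0.3],
              [0.2, 0.7]])

# Observed smeared-space yields in the two hemispheres (invented numbers):
n_smeared = np.array([440.0, 560.0])

# Unfold analytically: with two bins a plain matrix inversion is enough,
# and no regularization is needed.
n_true = np.linalg.inv(M) @ n_smeared

# Forward-backward asymmetry from the unfolded yields:
afb = (n_true[1] - n_true[0]) / (n_true[0] + n_true[1])
print(f"unfolded yields: {n_true}, A_FB = {afb:.3f}")
```

With these invented inputs the inversion gives unfolded yields of 280 and 720, i.e. an asymmetry of 0.44; the whole point of the two-bin approach is that the inverse exists in closed form, so the procedure is fully analytic.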

Now let’s go back to recommendation number one: “We recommend to avoid unfolding when it is not deemed compulsory”. So far I elaborated on why in many cases one should avoid unfolding. But when is unfolding “compulsory”?

As discussed above, you should not unfold when you are interested in new physics, e.g., when your goal is to set constraints on the couplings of an extension of the Standard Model (like some Effective Field Theory that could manifest itself through deformations of the predicted shapes of some observables, rather than through spectacular bumps.)
On the other hand, if you are a theorist, you may simply have no choice: “raw” data from the LHC experiments are not open, although a fraction of them may be released after some years of embargo, while unfolded data are usually disclosed here as soon as the corresponding paper is published.

Similar considerations apply to the extraction of parameters of the Standard Model. For example, the most avid users of our differential measurements are probably the small teams that extract the Parton Distribution Functions (PDF) from the public data.
Sure, a single experimental collaboration could in principle fit PDFs to its own raw data, but it would not have access to the raw data of other experiments (any more than theorists do). The best discriminating power is achieved by combining data from different accelerators ($pp$, $p\bar p$, $ep$, fixed-target experiments.) Some of those data were collected decades ago but are still relevant for PDF fits, because they were taken at different energies and probed different kinematic ranges.

Side note:
Usually, reanalyzing raw data from past experiments eventually becomes unfeasible even for the former members of those experiments. One of the few success stories (which probably started the slow movement of the HEP community towards the “open data” concept) is the resurrection of the data and analysis software of the JADE experiment, which took data at the PETRA collider at DESY between 1979 and 1986, using $e^+ e^-$ collisions in a range that we now call “low energy”, and was crucial for the establishment of QCD as the theory of strong interactions. In the late 90’s, just before the last backup was erased, someone realized that there would be a lot to learn about QCD by reanalyzing those data, profiting from the theory advances and analysis methods developed in the meantime. The heroic story of how those data were recovered is narrated, for expert readers, here and here.
End of side note.

Another use case, and an obvious one (although for some reason it is not often discussed), is the mere comparison of different experiments.
ATLAS and CMS are very different detectors, our event reconstruction algorithms are different, and our selections are optimized independently; so, even if we measure the same underlying distributions, we “fold” them very differently. Only unfolding allows us to compare the spectra in a meaningful way, like for example:

Ideally one would combine these two spectra (which is planned, but it takes time to do it right), then compare them to the theory predictions and see how good they are. And one needs unfolded data for that. But even without a combination we already like to show this plot around, as agreement between the experiments has a value per se. For example, a discrepancy in shape would hint that some systematic effect is unaccounted for somewhere.

Historically, when the first measurement of this distribution was made public, there was no Next-to-Next-to-Leading-Order (NNLO) QCD prediction yet, and it was observed that the discrepancy of the CMS data with Next-to-Leading-Order (NLO) calculations went in a different direction from what most “educated guesses” expected.

(By the way, the computation of differential spectra at NNLO in QCD – which was achieved for the first time in the top-pair case only quite recently – is so heavy that we have to agree with the authors beforehand about the exact bins to be used, because it takes weeks or months for their machines to deliver.)

Only when ATLAS released their unfolded spectra at the same energy (an independent data set, a very different detector, and an unfolding technique from the other major school of thought) did we become more confident that the reason was not an unaccounted-for systematic, a bug in our code… or a figment of our unfolding’s imagination!