My first post in this blog was about the use of Multivariate Analysis (MVA) to enhance the discrimination between signal and background processes in Higgs pair production in the final state with four bottom quarks. I would like now to discuss another very important application of MVAs, namely the unbiased parametrization of unknown multi-dimensional physical quantities.

In the case at hand now, these quantities are the parton distribution functions of the proton, or PDFs for short. A crucial aspect of the physics program at the LHC is based on providing accurate theoretical predictions for all relevant processes such as Higgs production or Standard Model benchmark cross-sections.

An important ingredient of these predictions is the set of proton PDFs, which encode the dynamics determining how the energy of the protons that collide at the LHC is split among their fundamental constituents, the quarks and gluons. As opposed to lepton colliders, where the center-of-mass energy of the collision is fixed by the lepton beam energy, at hadron colliders only the hadronic center-of-mass energy is known beforehand, from the energy of the proton beams.

But protons, unlike electrons, are not fundamental particles: they are composed of quarks and gluons. Therefore the LHC, more than a proton collider, is a quark and gluon collider. This implies that to determine the outcome of any LHC collision, and to know the partonic center-of-mass energy, we need to know how the energy of the colliding protons is shared among quarks and gluons.
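To make the last point concrete, here is a minimal sketch of the relation between the hadronic and the partonic center-of-mass energies. It assumes the standard collinear picture, in which the two colliding partons carry fractions x1 and x2 of their parent proton momenta and parton masses are neglected; the function name and the numerical values are purely illustrative.

```python
import math


def partonic_cm_energy(x1: float, x2: float, sqrt_s: float) -> float:
    """Return sqrt(s_hat) = sqrt(x1 * x2) * sqrt(s).

    x1, x2 : momentum fractions of the two colliding partons
    sqrt_s : hadronic (proton-proton) center-of-mass energy
    """
    if not (0.0 < x1 <= 1.0 and 0.0 < x2 <= 1.0):
        raise ValueError("momentum fractions must lie in (0, 1]")
    return math.sqrt(x1 * x2) * sqrt_s


# Illustrative numbers only: 13 TeV proton beams, expressed in GeV.
SQRT_S = 13000.0
e_hat = partonic_cm_energy(0.1, 0.01, SQRT_S)
print(f"partonic center-of-mass energy: {e_hat:.1f} GeV")
```

Since x1 and x2 differ from collision to collision, the partonic energy is only known statistically, and the PDFs are what tell us how likely each pair of momentum fractions is.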

The bad news is that PDFs are intrinsically non-perturbative objects, and thus their computation from first principles in Quantum Chromodynamics (QCD, the theory of the strong interaction) is so far out of reach (aside: despite the claims that the Standard Model is a well-understood theory, this is very far from true. For instance, we have no clue about why the proton, one of the most common particles in the universe, has the mass it has. And no, it is (mostly) unrelated to the Higgs mechanism, since the proton mass arises from non-perturbative QCD dynamics).

Therefore, PDFs need to be extracted from experimental data, using the so-called global analysis framework. This requires combining many different measurements with state-of-the-art perturbative QCD calculations and a robust statistical methodology, in order to construct PDF sets that meet the requirements of LHC phenomenology.

Given our limited understanding of non-perturbative QCD dynamics, the shape of the parton distributions $x f_i(x, Q_0^2)$ at the input parametrization scale $Q_0$ (which is typically of the same order as the proton mass) should be as general as possible, flexible enough to describe all experimental data included in the fit. Here, we denote by x the fraction of the total proton momentum carried by each quark and gluon in the proton, and $Q^2$ is the typical energy scale of the partonic collision.

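In most cases, the following ansatz is adopted,

$$x f_i(x, Q_0^2) = A_{f_i}\, x^{a_{f_i}}\, (1-x)^{b_{f_i}}\, \mathcal{I}_{f_i}\left(x; \{c_{f_i}\}\right),$$

where $f_i$ denotes a given quark flavour (or flavour combination) or the gluon, and $\mathcal{I}_{f_i}$ is a smooth interpolating function.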

The normalisation fractions $A_{f_i}$, the exponents $a_{f_i}$ and $b_{f_i}$, and the set of parameters $\{c_{f_i}\}$ are determined from the data. Some of the $A_{f_i}$ can be fixed in terms of the other parameters by means of the sum rules, which are direct predictions of QCD. In addition, the PDF parametrization should implement the kinematic constraint that in the elastic limit $x f_i(x \to 1, Q_0^2) \to 0$.
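As a rough illustration of how such a parametrization works in practice, the sketch below evaluates a toy input distribution of the form normalisation times x^a times (1-x)^b times a low-order polynomial, and fixes the normalisation by requiring that the distribution carries a chosen fraction of the proton momentum. The polynomial form, the parameter values and the 40% target are assumptions made for this example, not values taken from any actual fit.

```python
def pdf_ansatz(x, norm, a, b, coeffs):
    """Toy x*f(x, Q0^2) = norm * x**a * (1 - x)**b * (1 + c1*x + c2*x**2 + ...)."""
    poly = 1.0 + sum(c * x ** (k + 1) for k, c in enumerate(coeffs))
    return norm * x ** a * (1.0 - x) ** b * poly


def momentum_integral(a, b, coeffs, n_steps=20000):
    """Midpoint-rule integral of x*f(x) over 0 < x < 1 for unit normalisation."""
    h = 1.0 / n_steps
    return h * sum(
        pdf_ansatz((i + 0.5) * h, 1.0, a, b, coeffs) for i in range(n_steps)
    )


# Hypothetical parameter values, chosen only for illustration.
TOY_A, TOY_B, TOY_COEFFS = 0.5, 3.0, [1.0, -0.5]
TARGET_MOMENTUM_FRACTION = 0.4

# The normalisation is not free: it follows from the momentum constraint.
toy_norm = TARGET_MOMENTUM_FRACTION / momentum_integral(TOY_A, TOY_B, TOY_COEFFS)

for x in (0.01, 0.1, 0.5, 1.0):
    print(x, pdf_ansatz(x, toy_norm, TOY_A, TOY_B, TOY_COEFFS))
```

The (1-x)^b factor with positive b makes the distribution vanish at x = 1, which is the kinematic constraint mentioned above. In a real fit the momentum sum rule involves the sum over all quark flavours and the gluon, rather than a single distribution.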

Once the PDFs are determined at the initial scale $Q_0$, their behaviour at any other value of the factorization scale Q is determined via the QCD evolution equations.
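For reference, these evolution equations are usually written in the DGLAP form shown below. The symbols $P_{ij}$ (the splitting functions, computable in perturbation theory) and $\alpha_s$ (the strong coupling) are not used elsewhere in this post and are introduced here only to display the structure of the equations.

```latex
\frac{\partial f_i(x, Q^2)}{\partial \ln Q^2}
  = \sum_j \int_x^1 \frac{dz}{z}\,
    P_{ij}\!\left(z, \alpha_s(Q^2)\right)\,
    f_j\!\left(\frac{x}{z}, Q^2\right)
```

The key point is that the dependence on Q is calculable, while the dependence on x at the starting scale has to come from data.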

The crux of the problem is thus how to parametrize the (unknown) dependence of the PDFs on the momentum fraction x. The simplest solutions, such as low-order polynomials, were adopted in the earlier PDF fits, but such ad-hoc parametrizations are prone to introducing theoretical biases, which would affect the interpretation of the LHC results.

With the motivation to overcome some of the limitations of existing approaches, we have developed a novel approach to the determination of PDFs based on Machine Learning techniques, the so-called NNPDF approach.

In this framework, PDFs learn the underlying physical law from experimental data without the need to impose any prior knowledge, just as we learn how to score penalty kicks without having to solve Newton’s equations of motion beforehand.

As a representative example of the NNPDF results, in Fig. 1 we show the NNPDF3.0 global analysis of parton distributions, from the Particle Data Group 2016 review. The different quark and gluon PDFs, including their uncertainty bands, are represented at two different values of the resolution scale $Q^2$: higher values of $Q^2$ imply that one is probing the proton at smaller distances.

Fig. 1: The NNPDF3.0 global analysis of parton distributions, from the Particle Data Group (PDG) 2016 review. The different quark and gluon PDFs, including their uncertainty bands, are represented at two different values of the resolution scale $Q^2$: higher values of $Q^2$ imply that one is probing the proton at smaller distances.

It is beyond the scope of this post to review in detail the complete NNPDF approach, so I will restrict myself to one of its central ingredients, the use of Artificial Neural Networks as universal unbiased interpolants.

As I mentioned, the basic problem in PDF fits is how to parametrize the functional dependence of the PDFs on the parton momentum fraction x. In the NNPDF approach, we use Artificial Neural Networks (ANNs) to parametrize this dependence.

This is motivated by a number of theoretical results that ensure that, given enough neurons in the inner layers, ANNs can reproduce any functional behaviour that is indicated by experimental data, no matter how wobbly or complicated this is.

In Fig. 2 we show the schematic representation of an artificial neural network with a single hidden layer. In particular, the NNPDF fits use feed-forward multi-layer neural networks, also known as multi-layer perceptrons.

Fig. 2: Schematic representation of an artificial neural network (ANN) with a single hidden layer. This type of feed-forward multi-layer neural network is also known as a multi-layer perceptron.

A crucial point of the NNPDF procedure is the so-called training phase, where the ANN has to learn the underlying physical laws from the experimental measurements. This is achieved by defining a measure of the quality of the agreement between data and theory (for a given set of input PDFs), typically a χ² estimator, and then using a suitable minimization method to find the parameters that maximize this agreement.
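A minimal sketch of such a figure of merit is given below. To keep it short it assumes uncorrelated experimental uncertainties, so that the χ² reduces to a sum of squared, uncertainty-weighted residuals; a realistic fit would instead use the full experimental covariance matrix. The numbers are made up for illustration.

```python
def chi2(data, theory, sigma):
    """Chi-squared for uncorrelated uncertainties: sum of ((d - t) / s)**2."""
    if not (len(data) == len(theory) == len(sigma)):
        raise ValueError("data, theory and sigma must have the same length")
    return sum(((d - t) / s) ** 2 for d, t, s in zip(data, theory, sigma))


# Made-up measurements, predictions and uncertainties.
toy_data = [1.0, 2.0, 3.0]
toy_theory = [1.1, 1.9, 3.3]
toy_sigma = [0.1, 0.1, 0.2]

toy_chi2 = chi2(toy_data, toy_theory, toy_sigma)
print(f"chi2 = {toy_chi2:.2f} for {len(toy_data)} data points")
```

Training then amounts to adjusting the network parameters that enter the theory predictions until this number is as small as the data allow.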

In Fig. 3 we show an example of the training phase of an ANN, divided into two steps: in the first a deterministic minimization algorithm (back-propagation) is used, and in the second a stochastic minimization algorithm is used instead (Genetic Algorithms, or GA for short). The right plot corresponds to a detail of the GA phase.

Fig. 3: An example of the training phase of an ANN, divided into two steps: in the first a deterministic minimization algorithm (back-propagation) is used, and in the second a stochastic minimization algorithm is used instead (Genetic Algorithms, or GA for short). The right plot corresponds to a detail of the GA phase.

Genetic Algorithms (which could be the subject of several posts!) are non-deterministic minimization strategies suitable for the solution of complex optimization problems, for instance when a very large number of quasi-equivalent minima are present. GAs emulate biological evolution: candidate solutions are randomly mutated, and the fittest ones are selected to seed the next generation.
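The idea can be sketched in a few lines. The example below is a deliberately simplified, mutation-only variant (no crossover) applied to a toy straight-line fit; it is meant to show the mutate-and-select loop, not to reproduce the algorithm used in the actual NNPDF fits. All names and settings are hypothetical.

```python
import random


def toy_cost(params, xs, ys):
    """Sum of squared residuals for the toy model y = a * x + b."""
    a, b = params
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def genetic_minimize(cost, start, n_generations=200, n_mutants=20, step=0.1, seed=0):
    """Mutate the current best parameters and keep a mutant only if it is fitter."""
    rng = random.Random(seed)
    best, best_cost = list(start), cost(start)
    history = [best_cost]
    for _ in range(n_generations):
        # Each mutant is a copy of the parent with Gaussian noise on every parameter.
        mutants = [
            [p + rng.gauss(0.0, step) for p in best] for _ in range(n_mutants)
        ]
        candidate = min(mutants, key=cost)
        candidate_cost = cost(candidate)
        # Selection: the parent survives unless a mutant does better.
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
        history.append(best_cost)
    return best, history


# Toy "data" generated from y = 2 x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]

best_params, history = genetic_minimize(lambda p: toy_cost(p, xs, ys), [0.0, 0.0])
print("best parameters:", best_params, "final cost:", history[-1])
```

Because the parent is always kept when no mutant improves on it, the cost can never increase from one generation to the next, and no derivative of the cost function is ever needed.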

The use of ANNs to parametrize PDFs has several important advantages. To begin with, the results turn out to be extremely robust with respect to the specific choice of ANN architecture. As a concrete example, NNPDF fits use 2-5-3-1 networks with a total of 37 free parameters each (and there are seven independent PDFs at the input scale $Q_0^2$).
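The count of 37 parameters follows directly from the 2-5-3-1 architecture, as the sketch below shows by adding up the weights and biases layer by layer. The forward pass is included only for illustration: the tanh activation in the hidden layers, the linear output, the random initialisation and the choice of x and ln(1/x) as the two inputs are assumptions made here, not details stated in this post.

```python
import math
import random


def count_parameters(layers):
    """Weights (n_in * n_out) plus biases (n_out) for each pair of adjacent layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))


def init_network(layers, seed=0):
    """Random weights and biases; each layer is a (weight matrix, bias vector) pair."""
    rng = random.Random(seed)
    return [
        (
            [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
            [rng.gauss(0.0, 1.0) for _ in range(n_out)],
        )
        for n_in, n_out in zip(layers, layers[1:])
    ]


def forward(network, inputs):
    """Feed-forward pass: tanh in the hidden layers, linear output layer."""
    activations = list(inputs)
    for depth, (weights, biases) in enumerate(network):
        z = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output_layer = depth == len(network) - 1
        activations = z if is_output_layer else [math.tanh(v) for v in z]
    return activations


ARCHITECTURE = [2, 5, 3, 1]
n_params = count_parameters(ARCHITECTURE)
print("parameters per network:", n_params)
print("parameters for seven PDFs:", 7 * n_params)

network = init_network(ARCHITECTURE)
x = 0.1
print("untrained network output:", forward(network, [x, math.log(1.0 / x)]))
```

The layers contribute 15, 18 and 4 parameters respectively, which gives 37 per network and 259 for the seven independent PDFs.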

Now, if we increase the number of parameters in each ANN by a factor of 10, to around 400 parameters, the results of the fit are unchanged: this means that PDFs are determined exclusively by data, as they should be, and not by our a priori theory bias.

Other advantages include a characteristic blow-up of the PDF uncertainties in regions with very limited constraints: this indicates where PDFs stop being reliable for precision physics.

This result might seem unremarkable, until one realizes that, if PDFs are parametrized by simple polynomials, the uncertainties in the extrapolation regions will be similar to those of the data region, with potentially worrisome implications for measurements and searches based on these PDFs (since the PDF uncertainty will be artificially smaller than it should be).

This post has attempted to present another application of MVAs in the context of high-energy physics, namely the unbiased parametrization of unknown physical quantities.

In an upcoming post I will discuss why this unbiased parametrization is so crucial for the LHC physics program, and explore the implications in other domains usually thought to be orthogonal to collider physics. In particular, I will discuss the connection between PDFs at very small values of the parton momentum fraction x and the recent discovery by the IceCube experiment of very high energy neutrinos of astrophysical origin.

(Written by J. Rojo)