by Juan Rojo

Juan Rojo is a member of the Oxford University node of the AMVA4NewPhysics network. In this post he presents a recent study carried out by a group of theoretical and experimental physicists at the University of Oxford.

Given the advanced nature of the topic, all readers are warmly invited to ask for any kind of clarification and explanation in the comments section. Juan will be happy to answer all your questions!

The measurement of double Higgs production will be one of the central physics goals of the LHC program in its recently started high-energy phase, as well as for its future high-luminosity upgrade (HL-LHC).

Higgs pair production is directly sensitive to the Higgs trilinear coupling and provides crucial information on the electroweak symmetry breaking mechanism. It also probes the underlying strength of the Higgs interactions at high energies, and can be used to test the composite nature of the Higgs boson.

Analogously to single Higgs production, in the Standard Model (SM) the dominant mechanism for the production of a pair of Higgs bosons at the LHC is gluon fusion, see Fig. 1 for representative leading-order Feynman diagrams. For a center-of-mass energy of √s = 14 TeV, the next-to-next-to-leading order (NNLO) total cross section is approximately 40 fb, so around 12000 events with Higgs pairs are expected per LHC experiment at the end of Run II, and a factor of 10 more at the end of the HL-LHC.

Fig. 1: Representative Feynman diagrams for Higgs pair production in gluon fusion at leading order. Only the fermion triangle loop diagram (right) is directly sensitive to the Higgs trilinear coupling. In the SM, the fermion loops are dominated by the contribution from the top quark.

Feasibility studies in the case of a SM Higgs boson in the gluon-fusion channel at the LHC have been performed for different final states, including bbγγ, bbτ+τ−, bbW+W−, and bbbb. The main advantage of the bbbb final state is the enhancement of the signal yield from the large branching fraction of Higgs bosons into bb pairs, BR(H→bb)≅0.57. On the other hand, a measurement in this channel needs to deal with an overwhelming QCD multi-jet background. Previous studies of this final state estimate that, even at the HL-LHC, it will be very difficult to observe Higgs pair production.

Very recently, within a collaboration of theorists and experimentalists from Oxford, we have revisited in a new study the feasibility of measuring SM Higgs pair production in gluon fusion in the bbbb final state at the LHC. The authors include members of the Oxford node of the AMVA4NP network, Daniela Bortoletto, Cigdem Issever and myself, as well as three postdocs: Katharina Behr (who just moved to DESY), James Frost, and Nathan Hartland.

In our analysis, the selection is divided into three different categories, depending on the event topology: the resolved category (with four well-separated b-jets), the boosted category (with two fat jets, each containing the decay products of a Higgs boson), and an intermediate category. These three categories are optimized separately and then combined.

There are several improvements compared to previous works, including a detailed simulation of the background contamination from light jets mis-identified as bottom-quark jets, and the assessment of how the high pileup (number of collisions per bunch crossing) conditions expected at the HL-LHC degrade the results.

From the methodological point of view, the main difference is that our analysis is based upon a combination of traditional cut-based methods and multivariate analysis (MVA), in particular Artificial Neural Networks (ANN). Multivariate techniques are by now a mature tool in high-energy physics data analysis, opening new avenues to improve the performance of many measurements and searches at high energy colliders. In particular, the classification of events into signal and background processes by means of MVAs is commonly used in LHC applications.

The specific type of MVA that we used in our work is a multi-layer feed-forward artificial neural network (ANN), known as a multi-layer perceptron, and sometimes also referred to as a deep neural network. Fig. 2 shows an illustrative example of one of the ANNs used, with a total of Nvar = 21 input variables. This type of ANN is the same as those used to parametrize the parton distribution functions (PDFs) of the proton in the NNPDF global analyses, of which Nathan and I are also members.

Fig. 2: Schematic of the Artificial Neural Network (ANN) used for the analysis of the boosted category, with Nvar=21 input variables and thus the same number of neurons in the first layer. The color code in the neuron connections (the weights) is a heat map obtained at the end of the Genetic Algorithms training, with red indicating larger values and black indicating smaller values.
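Schematically, such a feed-forward network maps the Nvar = 21 kinematic inputs through a stack of layers to a single output. A minimal sketch in Python/NumPy is shown below; the 21-5-3-1 layer sizes, sigmoid activation, and random weights are illustrative assumptions for the sketch, not the exact architecture used in the study:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Feed-forward pass: each layer applies an affine map followed by a sigmoid."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Illustrative 21-5-3-1 architecture with random weights (sketch only).
rng = np.random.default_rng(0)
sizes = [21, 5, 3, 1]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]

x = rng.normal(size=21)          # one event's kinematic input variables
y = forward(x, weights, biases)  # ANN output, a single number in (0, 1)
```

The final sigmoid confines the output to (0, 1), which is what makes a simple cut on the output (see below) a natural signal/background classifier.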

The MVA inputs are a set of kinematic variables describing the signal and background events that satisfy the requirements of the cut-based analysis. The output of the trained ANNs also allows for the identification, in a fully automated way, of the most relevant variables in the discrimination between signal and background.
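One common way to quantify the relevance of each input variable, hedged here as an illustrative criterion rather than the exact procedure of the paper, is permutation importance: shuffle one variable across events and measure how much the network's error degrades. A toy sketch:

```python
import numpy as np

def permutation_importance(predict, X, t, loss, rng):
    """Rank input variables by how much the loss degrades when each
    variable is shuffled across events (illustrative criterion only)."""
    base = loss(predict(X), t)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])         # destroy the correlation of variable j with the label
        scores.append(loss(predict(Xp), t) - base)
    return np.argsort(scores)[::-1]   # most important variable first

# Toy example: only the first of three variables carries discriminating power.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
t = (X[:, 0] > 0).astype(float)
predict = lambda X: 1.0 / (1.0 + np.exp(-3.0 * X[:, 0]))
mse = lambda y, t: np.mean((y - t) ** 2)
order = permutation_importance(predict, X, t, mse, rng)
```

In the toy example the ranking correctly places the only informative variable first, since shuffling the unused variables leaves the loss unchanged.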

The training of the neural networks consists of the minimization of a suitable figure of merit, in this case the so-called cross-entropy error function, to maximize the discrimination between signal and background events. This training is performed using Genetic Algorithms (GAs), non-deterministic minimization strategies suited to complex optimization problems, for instance when a very large number of quasi-equivalent minima are present.
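The idea can be sketched with a toy GA that mutates a population of candidate weight vectors and keeps the fittest. For simplicity the sketch minimizes the cross-entropy of a single-layer sigmoid classifier on synthetic data; the population size, mutation width, and single-layer model are all assumptions of the sketch, not the settings of the actual analysis:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_entropy(w, X, t):
    """Cross-entropy error of a single-layer sigmoid classifier
    (toy stand-in for the full multi-layer network)."""
    y = sigmoid(X @ w)
    eps = 1e-12
    return -np.mean(t * np.log(y + eps) + (1 - t) * np.log(1 - y + eps))

def ga_minimize(X, t, n_gen=200, pop_size=40, sigma=0.3, seed=0):
    """Toy Genetic Algorithm: mutate a population of candidate weight
    vectors around the current best and keep the fittest (elitism)."""
    rng = np.random.default_rng(seed)
    best = rng.normal(size=X.shape[1])
    for _ in range(n_gen):
        mutants = best + sigma * rng.normal(size=(pop_size, X.shape[1]))
        candidates = np.vstack([best[None, :], mutants])
        errors = [cross_entropy(w, X, t) for w in candidates]
        best = candidates[int(np.argmin(errors))]
    return best

# Toy "signal" vs "background" samples in three kinematic variables.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(+1.0, 1.0, size=(200, 3)),
               rng.normal(-1.0, 1.0, size=(200, 3))])
t = np.concatenate([np.ones(200), np.zeros(200)])
w = ga_minimize(X, t)
```

Because the current best is always carried over (elitism), the error is non-increasing generation by generation, even though the mutations themselves are random.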

GAs are inspired by the natural selection processes that drive biological evolution. To avoid over-fitting, we used a cross-validation stopping criterion. This cross-validation proceeds by dividing the input Monte Carlo (MC) dataset into two disjoint sets, using one of them to train the ANN and the other for validation: the optimal stopping point is then given by the minimum of the error function on the validation sub-sample. This minimum indicates the point where the ANN begins to fit statistical fluctuations in the input MC samples, rather than learning the underlying (smooth) physical distributions.
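The stopping criterion can be sketched as follows: monitor both error curves during the minimization and stop at the minimum of the validation error. The synthetic error histories below are illustrative stand-ins for the actual GA training and validation curves:

```python
import numpy as np

def validation_stopping(valid_err):
    """Return the generation at which training should stop: the minimum of
    the validation error, after which over-fitting sets in."""
    return int(np.argmin(valid_err))

# Illustrative error histories: the training error keeps falling, while the
# validation error turns back up once the network starts fitting fluctuations.
gens = np.arange(100)
train_err = np.exp(-gens / 30.0)
valid_err = np.exp(-gens / 30.0) + 0.0002 * (gens - 60) ** 2

stop = validation_stopping(valid_err)
```

Note that the training error alone would never signal a stopping point, since with enough flexibility it keeps decreasing; only the held-out validation sub-sample reveals when the extra "improvement" is just noise.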

In Fig. 3 we show the distribution of the ANN output at the end of the GA minimization, in the case of the boosted selection. The separation between signal and background is achieved by introducing a cut, ycut, on the ANN output, so that MC events with yi ≥ ycut are classified as signal events, and those with yi < ycut as background events. Therefore, the more differentiated the distributions of the ANN output are for signal and background events, the more efficient the MVA discrimination will be.

Fig. 3: The distributions, at the end of the GA training, for the signal and background MC events in the boosted category, as a function of the ANN output.

As we see, the algorithm achieves a very good separation between the two types of events. The main results of our study for the case of the HL-LHC are collected in Table 1, where we show the signal significance, S/√B, and signal-over-background ratio, S/B, before the MVA is applied (ycut = 0) and after the optimal MVA cut is applied in each category.
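The effect of the cut on these figures of merit can be sketched by scanning ycut over toy ANN outputs: for each candidate cut, count the surviving (weighted) signal and background events and compute S/√B and S/B. The Beta-distributed outputs and event weights below are illustrative assumptions, not the distributions of Fig. 3:

```python
import numpy as np

def significance_scan(y_sig, y_bkg, w_sig, w_bkg, cuts):
    """For each candidate ycut, compute the signal significance S/sqrt(B)
    and the ratio S/B of the events classified as signal (output >= ycut)."""
    results = []
    for ycut in cuts:
        S = w_sig * np.sum(y_sig >= ycut)
        B = w_bkg * np.sum(y_bkg >= ycut)
        if B > 0:
            results.append((ycut, S / np.sqrt(B), S / B))
    return results

# Toy ANN outputs: signal peaked near 1, background near 0 (illustrative only).
rng = np.random.default_rng(0)
y_sig = rng.beta(5, 2, size=10000)   # signal-like outputs
y_bkg = rng.beta(2, 5, size=10000)   # background-like outputs
w_sig, w_bkg = 0.01, 1.0             # per-event weights: background dominates

scan = significance_scan(y_sig, y_bkg, w_sig, w_bkg, np.linspace(0.0, 0.95, 20))
best_cut, best_sig, best_sb = max(scan, key=lambda r: r[1])
```

Even in this toy setup, a well-chosen ycut improves both S/√B and S/B substantially over the no-cut baseline, which is the qualitative behavior summarized in Table 1.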

Table 1: The signal significance, S/√B, and signal-over-background ratio, S/B, at the HL-LHC, before the MVA is applied (ycut=0), and after the optimal MVA cut is applied in each category.

The most remarkable result is the substantial improvement in signal significance when going from the purely traditional cut-based analysis to the final results including the MVA: for example, in the resolved category the significance increases from 0.4 to 2.0. Equally important, the signal-over-background ratio increases by two orders of magnitude.

To summarize, multivariate techniques have the potential to improve the signal significance of processes with complicated final states, such as hh→4b, compared to traditional cut-based analyses. Our study not only illustrates that the 4b final state should be sufficient to observe Higgs pair production at the HL-LHC but, even more remarkably, demonstrates that, provided the signal selection efficiency and background rejection can be further improved, there might even be some hope for Run II.

However, ours is only a phenomenological feasibility study: the real challenge, the actual measurement of the hh→4b process by ATLAS and CMS, will take several years. But at least we have shown that we have many reasons to be optimistic!

(Written by J. Rojo)