Hej! Work’s starting to pick up here at LIP-Lisbon as we begin to think about Monte Carlo sample production.

Monte Carlo (MC) generators are an important tool for us particle physicists since they allow us to simulate the particle collisions which occur at colliders like the LHC, whilst having access to the entire record of processes (MC truth). This allows us to determine background contributions to data, design new detectors, or, as we at AMVA4NP will make great use of, test and optimise selection algorithms.

In essence, an MC generator uses a combination of stochastic methods, particle theory, experimental measurements, and empirical modelling to simulate particle collisions. This is done by calculating the cross-section (a measure of the probability) of a particular scattering process, factorising the calculation into terms relating to: the parton momenta; factorisation and renormalisation scales; final-state phase-space; the parton-density functions (PDFs) and the centre-of-mass energy of the incoming hadrons; and the matrix element (effectively the sum over all Feynman diagrams for the chosen scattering process).
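Schematically, the factorised cross-section for two hadrons h_1 and h_2 producing an n-parton final state is usually written as below. The symbol names (x_a, x_b for the parton momentum fractions, \mu_F and \mu_R for the factorisation and renormalisation scales, \Phi_n for the final-state phase-space) are labels assumed for illustration:

```latex
\sigma_{h_1 h_2 \to n}
  = \sum_{a,b} \int_0^1 \mathrm{d}x_a \, \mathrm{d}x_b \,
    f_a^{h_1}(x_a, \mu_F) \, f_b^{h_2}(x_b, \mu_F)
    \int \mathrm{d}\Phi_n \, \frac{1}{2\hat{s}} \,
    \left| \mathcal{M}_{ab \to n}(\Phi_n; \mu_F, \mu_R) \right|^2 ,
\qquad \hat{s} = x_a x_b \, s
```

Here the f are the PDFs, s is the squared centre-of-mass energy of the incoming hadrons, and \mathcal{M} is the matrix element.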

For full generation, four steps are necessary: PDF sampling, matrix element (ME) calculation, parton showering, and hadronisation.

Difficulties arise in simulating QCD interactions because QCD is self-coupling: its force carriers, the gluons, themselves carry colour charge and so interact with one another. This means that particle collisions can quickly lead to many gluons and quarks being formed: a parton shower, which is incredibly difficult to calculate exactly in QCD. By focussing on high-momentum parton processes, where the strong coupling is small, QCD can be treated perturbatively (pQCD).

PDF sampling

Since pQCD is only valid for partons, and the LHC collides hadrons, the parton-densities of hadrons are obtained through experimental study and summarised as PDFs. These may then be ‘sampled’ to provide partonic input to the pQCD calculation.
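As a rough illustration of what ‘sampling’ means here, the sketch below draws momentum fractions for two incoming partons from a toy density. The shape f(x) ∝ x^(alpha-1) (1-x)^(beta-1), its parameter values, and the function names are all assumptions made for this example; real generators read fitted PDF sets (for instance through LHAPDF) and usually fold the PDF values into the event weight rather than sampling them directly.

```python
import random


def sample_momentum_fraction(rng, alpha=0.5, beta=4.0):
    """Draw a momentum fraction x in [0, 1] from a toy density.

    The density is f(x) ~ x^(alpha-1) * (1-x)^(beta-1), i.e. a Beta
    distribution. This is a stand-in shape, NOT a fitted PDF.
    """
    return rng.betavariate(alpha, beta)


def sample_parton_pair(rng, sqrt_s=13000.0):
    """Sample one parton from each hadron and return (x_a, x_b, sqrt_s_hat).

    sqrt_s is the hadronic centre-of-mass energy in GeV (assumed value);
    sqrt_s_hat is the partonic centre-of-mass energy, sqrt(x_a * x_b * s).
    """
    x_a = sample_momentum_fraction(rng)
    x_b = sample_momentum_fraction(rng)
    sqrt_s_hat = (x_a * x_b) ** 0.5 * sqrt_s
    return x_a, x_b, sqrt_s_hat


if __name__ == "__main__":
    rng = random.Random(2015)
    for _ in range(3):
        print(sample_parton_pair(rng))
```

The partonic energy is always below the hadronic one, which is why only a fraction of the collider energy is available to the hard process.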

The matrix element

This is an analytical calculation of the parton-scattering process in pQCD. This may take place at leading order (LO) using only the simplest Feynman diagrams (tree-level), or at higher orders such as next-to-leading order (NLO) by including the contribution of loop corrections. The order to which one may calculate a process is limited by processing power; loop-corrections are computationally intensive and difficult to automate.

The parton shower

Instead of suffering the high computation times associated with moving to higher-order calculations, partonic evolution may be modelled empirically through a process labelled the parton shower (PS). Here the partons from the ‘hard process’ calculated in the ME follow an evolution pattern prescribed by probabilistic functions dependent on some kinematic property (z) of the partons, say the energy fraction of the emitted parton with respect to the initial parton.

Evolution proceeds by generating splittings/emissions at a scale x^2, which is some evolution variable, say p_\perp^2. Evolution ceases for a parton once x^2 falls below some cut-off, x_0^2, at which point branches are no longer resolvable as producing two distinct partons.
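The evolution loop can be sketched in a few lines. In the toy below, `t` plays the role of x^2 and `t_cut` the role of x_0^2. The constant emission rate (dP = rate * dt / t), the uniform choice of z, and all names are assumptions for illustration; a real shower uses the running coupling and proper splitting kernels.

```python
import random


def next_emission_scale(t_start, rate, rng):
    """Pick the scale of the next emission below t_start.

    With a toy emission rate dP = rate * dt / t, the no-emission
    probability between t_start and t is (t / t_start) ** rate.
    Setting that equal to a uniform random number r and solving for t
    gives the expression below.
    """
    r = 1.0 - rng.random()  # uniform in (0, 1]
    return t_start * r ** (1.0 / rate)


def evolve(t_start, t_cut, rate=0.3, rng=None):
    """Evolve one parton downwards in t and return its emissions.

    Each emission is a (t, z) tuple. Evolution stops as soon as the
    proposed scale falls below the cut-off t_cut.
    """
    rng = rng or random.Random()
    emissions = []
    t = t_start
    while True:
        t = next_emission_scale(t, rate, rng)
        if t < t_cut:
            break  # below the cut-off: branchings no longer resolvable
        z = rng.random()  # placeholder for sampling a splitting kernel
        emissions.append((t, z))
    return emissions


if __name__ == "__main__":
    for t, z in evolve(t_start=1.0e4, t_cut=1.0, rng=random.Random(7)):
        print(f"t = {t:10.3f}  z = {z:.3f}")
```

Each emission starts from the scale of the previous one, so the scales come out ordered from hardest to softest.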

The exact choice of z and x varies between showers, and if an infinite number of perturbation terms were calculated then all choices would give the same result. Since we only calculate a finite number of terms, different choices of variables do produce different results.

Choice of variable can be supported by showing that it provides a reasonable approximation when higher-order perturbation terms are calculated, or that it doesn’t produce un-physical results like negative cross-sections.

Showers also differ in their ordering of emissions, with some producing the hardest emission first (p_\perp-ordered), and others producing the widest-angle emission first (angular-ordered). Angular-ordered showers have the advantage of improved colour-coherence, but are more difficult to connect to MEs (see below).

Modern showers use splitting kernels based on emission from colour-anticolour dipoles, a 2→3 process, which allows momentum to be explicitly conserved as opposed to showers based on 1→2 splittings, where a momentum reshuffling process is required to restore momentum conservation.

It should be noted that the default splitting-functions use a massless approximation for partons; the fun begins when we try to account for the masses of heavy-flavour quarks. PS accuracy relies on expansion from the collinear singularity (as the angle between emitted partons approaches zero, the probability of emission increases towards infinity); once mass is introduced, this singularity is screened by kinematic thresholds, reducing the accuracy of the PS approximation for narrow-angle emissions.
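To make the singularity concrete, the massless collinear approximation for a quark emitting a gluon at small opening angle \theta takes the standard form below, where z is taken (as an assumed convention) to be the energy fraction kept by the quark and C_F = 4/3:

```latex
\mathrm{d}\mathcal{P}_{q \to qg}
  \simeq \frac{\alpha_s}{2\pi} \,
    \frac{\mathrm{d}\theta^2}{\theta^2} \,
    C_F \, \frac{1 + z^2}{1 - z} \, \mathrm{d}z
```

The \mathrm{d}\theta^2 / \theta^2 factor is the collinear singularity; for a massive quark, emission at angles below roughly m/E is suppressed, so this form no longer holds there.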

Generators such as Sherpa and Herwig++ use ‘quasi-collinear’ splitting functions, where the splitting functions are generalised to include quark-mass effects. Pythia, another generator, instead uses ME information to correct the splitting calculated in the massless approximation.

PS modelling of heavy-flavour production was the focus of my masters at Glasgow, so perhaps I’ll write a post about that next time.

ME and PS combination

Both the ME and PS are viable methods of simulating QCD processes and contemporary generators use both processes: the ME is best for hard, wide-angle, low-multiplicity splittings; the PS best for soft, narrow-angle splittings.

This combination takes place by matching the ME to the PS, that is, using the kinematics of the ME partons to seed a PS. It should be noted that whilst this is technically ‘matching’, the field has progressed to the point that this process is assumed, and matching is now taken as referring to a more advanced process described below.

When inclusive (varying number of outgoing partons) MEs are calculated, merging procedures are required to avoid double-counting an emission. As an example, Sherpa’s merging algorithm splits the emission phase-space into regions of ME and PS production and then truncates the PS such that it only produces within its assigned phase-space region.
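The idea of splitting the phase-space can be illustrated with a deliberately simplified sketch. It uses a single hypothetical merging scale `q_cut` on the emission p_\perp and simply discards emissions generated in the wrong region; this is not Sherpa’s actual algorithm, which (like other CKKW-type schemes) uses a jet measure and additional reweighting. All names here are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass
class Emission:
    pt: float    # transverse momentum of the emission, in GeV
    source: str  # "ME" or "PS": which method produced it


def merge(me_emissions, ps_emissions, q_cut):
    """Combine ME and PS emissions without double counting.

    Emissions harder than q_cut are taken only from the ME; emissions
    at or below q_cut are taken only from the PS. The result is sorted
    from hardest to softest.
    """
    kept = [e for e in me_emissions if e.pt > q_cut]
    kept += [e for e in ps_emissions if e.pt <= q_cut]
    return sorted(kept, key=lambda e: e.pt, reverse=True)


if __name__ == "__main__":
    me = [Emission(50.0, "ME"), Emission(10.0, "ME")]
    ps = [Emission(30.0, "PS"), Emission(5.0, "PS")]
    for e in merge(me, ps, q_cut=20.0):
        print(e)
```

With `q_cut = 20` GeV, the 50 GeV ME emission and the 5 GeV PS emission survive, while the 10 GeV ME emission and the 30 GeV PS emission are dropped because each falls in the other method’s region.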

Difficulties also arise when MEs are calculated beyond leading order. Here advanced matching procedures such as MC@NLO or Powheg are required. The state of the art for MC generation involves generating NLO MEs with varying final-state multiplicity and matching them to a PS using approaches such as MEPS@NLO.

Hadronisation

The x_0^2 cut-off used in the PS determines the point at which QCD evolution moves into the non-perturbative regime. Free colour-charge has never been observed, so some process is required to bind the coloured partons into colourless hadrons.

This process is referred to as hadronisation and is based on observations of QCD. Two common models exist to perform this: the string model, based on lattice QCD observations in which the gluon fields collapse into thin tubes; and the cluster model, based on the idea of pre-confinement, where partons are grouped such that they form clusters with no net colour-charge. Hadrons exiting the hadronisation stage are then allowed to decay down to the stable final-states one may observe in a detector.
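The grouping step of the cluster model can be sketched as follows. The example assumes that any gluons have already been split into quark-antiquark pairs, so every parton carries either a colour or an anticolour label, and it pairs each quark with the antiquark carrying the matching label. The data layout and function names are hypothetical, and the later decay of clusters into hadrons is not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class Parton:
    e: float             # energy in GeV
    px: float
    py: float
    pz: float
    colour: int = 0      # colour-line label, 0 if the parton has none
    anticolour: int = 0  # anticolour-line label, 0 if the parton has none


def form_clusters(partons):
    """Pair each quark with the antiquark on the same colour line.

    Every (quark, antiquark) pair returned has no net colour-charge.
    """
    by_anticolour = {p.anticolour: p for p in partons if p.anticolour}
    return [(q, by_anticolour[q.colour]) for q in partons if q.colour]


def invariant_mass(cluster):
    """Invariant mass of a cluster from the summed four-momenta."""
    e = sum(p.e for p in cluster)
    px = sum(p.px for p in cluster)
    py = sum(p.py for p in cluster)
    pz = sum(p.pz for p in cluster)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero


if __name__ == "__main__":
    partons = [
        Parton(10.0, 0.0, 0.0, 10.0, colour=1),
        Parton(5.0, -3.0, 0.0, -4.0, anticolour=2),
        Parton(10.0, 0.0, 0.0, -10.0, anticolour=1),
        Parton(5.0, 3.0, 0.0, 4.0, colour=2),
    ]
    for cluster in form_clusters(partons):
        print(invariant_mass(cluster))
```

The cluster masses are the quantity a cluster model would then use to decide how each cluster decays into hadrons.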

Needless to say, this has only been a very basic overview of how Monte Carlo generators function. For further reading, I’d recommend this review paper. Have a great week!