Parton-Level Cross Section Interpolation

Interpolate the relationship between a (parametrised) underlying distribution of particles and the probability of such an event occurring in a collider.

Hosted by

CERN

Authors: Michele Grossi, Enrico Bothmann, Zenny Wettersten

Overview

Although there are many stages to High-Energy Physics (HEP) predictions, the fundamental physics is described in terms of probabilities (cross sections) calculated from perturbative Quantum Field Theories (QFTs). However, since HEP processes are stochastic, it is insufficient to evaluate the total cross section; to compare experiment with theory statistically, we need to sample the underlying distribution, known as the differential cross section $d\sigma$ (which is determined by the squared scattering amplitude, or matrix element, of the process).

In this challenge, your goal is to build a quantum (or hybrid classical-quantum) algorithm to evaluate this differential cross section at a given phase space point. For the implementation, you are given as input:

(Parametrised) phase space points containing the kinematic information of the particles in a given event.

Your task is to match this to the output:

Leading Order (LO) differential cross section at this phase space point.

As this is an interpolation/regression problem, a standard choice of performance metric would be the mean squared error (MSE). In order to more accurately capture the shape of the distribution, though, you might instead choose to use the mean absolute percentage error (MAPE), although it should be noted that the differential cross section is small across large swaths of the phase space.
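The trade-off between the two metrics can be made concrete with a minimal sketch. The function names `mse` and `mape` are illustrative choices, not part of any challenge tooling, and the example values are made up. It shows that MAPE weights a miss on a tiny target as heavily as a miss on a large one, and is undefined where the target is exactly zero.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Raises ZeroDivisionError if any target is exactly zero, which is
    the caveat for regions where the cross section vanishes.
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(y_true)


# Made-up values: one large target near a peak, one tiny target far from it.
targets = [10.0, 1e-6]
predictions = [9.0, 2e-6]
print("MSE :", mse(targets, predictions))   # dominated by the large target
print("MAPE:", mape(targets, predictions))  # dominated by the tiny target
```

In this toy case the MSE is driven almost entirely by the point near the peak, while the MAPE is driven by the relative miss on the tiny value, so the two metrics can rank the same model very differently.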

Background

Monte Carlo Event Generation in HEP

Although there are analytic formulae for the differential cross sections ($d\sigma$) in HEP, there are several issues that arise when trying to connect these to experimental measurements: there is (generally) no closed-form solution for the integral of $d\sigma$; the formulae scale badly with both the number of particles and the perturbative order of the calculation; and the stochastic nature of QFT means we need to compare experiment statistically with samples following the underlying phase space distribution of $d\sigma$.

Since we cannot evaluate the integrated cross section directly, we need to use numerical methods, and since we do not know it a priori we also need to determine the phase space distribution of $d\sigma$. To solve these problems, we use Monte Carlo methods in so-called event generators: software that generates phase space points for given interactions at the rate theory predicts we should observe them. In simple terms, event generators work in three steps:

Calculate the total cross section using Monte Carlo integration.
Using the calculated cross section, estimate the underlying distribution across phase space.
Sample this distribution at the rate theory predicts interactions would occur.

Owing to this complexity, event generation remains an active area of research.
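The three steps can be illustrated with a deliberately simplified sketch on a one-dimensional toy problem. Everything here is an assumption made for illustration: `toy_dsigma` is a made-up resonance-like function on the unit interval, not a physical cross section, and plain uniform sampling with accept-reject unweighting stands in for the far more sophisticated methods used by real event generators.

```python
import random


def toy_dsigma(x, peak=0.5, width=0.1):
    """Hypothetical peaked 'differential cross section' on [0, 1]."""
    return 1.0 / ((x - peak) ** 2 + width ** 2)


def generate_events(dsigma, n_trials, seed=0):
    """Toy event generation: integrate, bound, then unweight."""
    rng = random.Random(seed)
    points = [rng.random() for _ in range(n_trials)]
    weights = [dsigma(x) for x in points]

    # Step 1: Monte Carlo estimate of the total cross section
    # (the integration volume of the unit interval is 1).
    total = sum(weights) / n_trials

    # Step 2: crude estimate of the distribution, here just its maximum.
    w_max = max(weights)

    # Step 3: accept-reject, keeping each point with probability w / w_max,
    # so that accepted points follow the shape of dsigma.
    events = [x for x, w in zip(points, weights) if rng.random() < w / w_max]
    return total, events


total, events = generate_events(toy_dsigma, 100_000)
print(f"estimated total: {total:.2f}, unweighted events: {len(events)}")
```

Only a fraction of the trial points survive the unweighting step; the more sharply peaked the distribution, the smaller that fraction with uniform sampling, which is one reason better phase space parametrisations matter.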

Phase Space Distributions

The level of individual particles, the parton level, is described in terms of the 4-momenta of all particles that are part of the interaction. Two degrees of freedom disappear from conservation of energy and momentum, meaning for a process with $n$ partons we need to integrate $d\sigma$ over a $(4n-2)$-dimensional space. However, due to e.g. Lorentz invariance, for analysis we typically project this higher-dimensional space onto key kinematic variables (such as the transverse momentum $p_T$, the pseudorapidity $\eta$, or the azimuthal angle $\phi$).

Most important for this challenge is that we can parametrise the phase space in lower dimensions. Not only can we relate phase space points to a lower-dimensional parameter set; we can also parametrise aspects of the distribution into this parameter set such that the parameters do not showcase the 'spiky' nature of $d\sigma$ across phase space, where we see singular peaks around process resonances and near-zero values far from them.
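For reference, the kinematic variables named above follow from the spatial components of a particle's momentum through the standard definitions, taking the beam along the z-axis. The helper below is a minimal sketch with a hypothetical function name. It is not needed for the challenge itself, since the provided inputs are already parametrised.

```python
import math


def kinematic_observables(px, py, pz):
    """Return (p_T, eta, phi) for a momentum with beam axis along z.

    Assumes p_T > 0; a particle exactly along the beam axis has
    undefined (infinite) pseudorapidity.
    """
    p_t = math.hypot(px, py)       # transverse momentum
    eta = math.asinh(pz / p_t)     # pseudorapidity, from sinh(eta) = pz / p_T
    phi = math.atan2(py, px)       # azimuthal angle in (-pi, pi]
    return p_t, eta, phi


print(kinematic_observables(3.0, 4.0, 0.0))
```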

Challenge

Participants are tasked with building a quantum (or hybrid classical-quantum) algorithm for differential cross section regression, mapping a (parametrised) phase space point to the scalar value of $d\sigma$ at that point. Four sets of samples corresponding to the processes

$$q \bar{q} \rightarrow Z^0 + n \; \text{jets},\quad n = 0,\dots,3,$$

and the evaluated LO differential cross section $d\sigma$ at those phase space points are provided. Note that $d\sigma$ varies between the four processes, and that the analytic formulae quickly grow in complexity with $n$.

The samples have been generated with the Pepper event generator, and the phase space points are given by the internal parametrisation used by the Chili phase space sampler. Although these variables are related to kinematic observables, they are not identical --- you may think of them as a flatter mapping of the kinematics.

Dataset

The samples are provided in the HDF5 format, comprising two datasets:

randomNumbers, the parametrised phase space points (differing in dimensionality with $n$ ).
events, the resulting event at that point as evaluated using Pepper. The only relevant entry here is the final element of each event, which is the leading order $d\sigma$ evaluated at that phase space point.

Another four sample sets, excluding the events datasets, are provided for performance evaluation of your algorithms.
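A possible way to read such a file is sketched below using the third-party `h5py` package. The dataset names `randomNumbers` and `events` come from the description above; the file path is hypothetical, and the code assumes that `events` is stored as a two-dimensional array with one row per event, which should be checked against the actual files.

```python
def load_sample(path):
    """Read the two datasets from one sample file.

    Returns (points, events); events is None for unlabelled files.
    """
    import h5py  # third-party dependency, imported lazily

    with h5py.File(path, "r") as handle:
        points = handle["randomNumbers"][...]
        events = handle["events"][...] if "events" in handle else None
    return points, events


def extract_targets(events):
    """Take the final element of each event as the LO dsigma target.

    Assumes one row per event (an assumption about the file layout).
    """
    return [float(row[-1]) for row in events]


# Usage with a hypothetical path:
# points, events = load_sample("path/to/sample.hdf5")
# targets = extract_targets(events)
```

Keeping the target extraction separate from the file access makes it easy to adapt if the layout of `events` turns out to differ from the assumption made here.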

Problem Statement

Your task is to set up estimators for $d\sigma$ based on the sample sets, one estimator for each set. How to go about this is left up to your own interpretation; the dimensionality of the problem is constant for each value of $n$, and different methods may perform differently for each dimensionality.
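As one illustration of what a quantum estimator could look like, the sketch below simulates a single-qubit data re-uploading circuit in plain Python and computes its gradient with the parameter-shift rule. This is a toy under stated assumptions, not a recommended or reference solution: the circuit layout, the function names, and the choice of one qubit are all illustrative, and a real entry would more likely use a framework such as Qiskit or PennyLane.

```python
import cmath
import math


def apply_ry(state, angle):
    """Apply RY(angle) = exp(-i * angle / 2 * Y) to a single-qubit state."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a, b = state
    return (c * a - s * b, s * a + c * b)


def apply_rz(state, angle):
    """Apply RZ(angle) = exp(-i * angle / 2 * Z) to a single-qubit state."""
    a, b = state
    return (cmath.exp(-0.5j * angle) * a, cmath.exp(0.5j * angle) * b)


def expectation_z(features, params):
    """Encode each feature with RY, follow with trainable RZ and RY,
    and return the expectation value of Z (a number in [-1, 1])."""
    state = (1.0 + 0.0j, 0.0 + 0.0j)  # start in |0>
    for i, x in enumerate(features):
        state = apply_ry(state, x)                  # data encoding
        state = apply_rz(state, params[2 * i])      # trainable
        state = apply_ry(state, params[2 * i + 1])  # trainable
    a, b = state
    return abs(a) ** 2 - abs(b) ** 2


def parameter_shift_gradient(features, params):
    """Gradient of expectation_z with respect to each trainable angle,
    using shifts of +/- pi/2 (valid for these rotation gates)."""
    grads = []
    for k in range(len(params)):
        plus, minus = list(params), list(params)
        plus[k] += math.pi / 2
        minus[k] -= math.pi / 2
        grads.append(0.5 * (expectation_z(features, plus)
                            - expectation_z(features, minus)))
    return grads


x = [0.3, 1.1]                 # a made-up two-dimensional input point
theta = [0.2, -0.7, 1.3, 0.4]  # two trainable angles per feature
print(expectation_z(x, theta), parameter_shift_gradient(x, theta))
```

Because the output is bounded in $[-1, 1]$, the targets would need to be rescaled (or the output post-processed classically) before such a model could be fitted to $d\sigma$ values.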

When developing and testing your algorithm, you should keep the following in mind:

Accuracy: Does your algorithm predict $d\sigma$ well?
Robustness: Does your algorithm perform well across different regions of the phase space?

Additionally, the analytic (classical) solution grows factorially in complexity with respect to the number of partons. Can you tell whether the same seems to hold true for your solution?

Evaluation

Alongside the labelled samples you are to build your algorithm on, each process also has an unlabelled sample containing parametrised phase space points but not the evaluated $d\sigma$. Run your algorithm on these phase space points and store the results according to the submission format below. The performance of your algorithm(s) will be evaluated with the mean squared error.

Submission format

You must submit a file (e.g., submission.csv) with one row per event in the test set, for each subfolder dataset containing:

EventID – The integer position of the event in the original HDF5 file.
$d\sigma$ – Predicted differential cross section for that event.
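Writing such a file could look like the sketch below, which uses only the standard library. The function name and the default header `dsigma` for the prediction column are assumptions; the exact column header expected by the evaluation should be confirmed on the challenge page. EventID is taken to be the zero-based position of each prediction, which presumes the predictions are kept in the same order as the points in the test file.

```python
import csv


def write_submission(predictions, path, value_header="dsigma"):
    """Write one row per event: its integer position and the prediction.

    value_header is a placeholder name for the prediction column.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["EventID", value_header])
        for event_id, value in enumerate(predictions):
            # repr keeps full floating-point precision in the text output
            writer.writerow([event_id, repr(float(value))])


write_submission([1.5e-3, 0.0, 42.0], "submission.csv")
```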

Notes for Participants

The challenge has been structured to balance difficulty and feasibility, making it accessible to participants with diverse expertise.
The problem is rooted in real-world applications of Monte Carlo simulations and event generators used in high energy physics research.
Finalists will be asked to provide their solution (quantum algorithm and details) for a comprehensive evaluation.

We look forward to seeing innovative quantum solutions to this complex and exciting problem!

FAQ

Can I use classical ML components alongside quantum circuits? Yes, a hybrid approach is encouraged. The essential requirement is at least one quantum element in your workflow.
Is domain knowledge of HEP required? Not strictly; we’ve provided background so you can focus on the quantum modelling. The main idea is to treat it as a high-dimensional regression task. Additional questions can be asked in the discussion section.
How do we install HEP or quantum frameworks locally? For quantum frameworks, see the Qiskit documentation, the PennyLane installation guide, etc. No specialised HEP software is strictly required unless you want to explore advanced analyses.