Updated 14/02/2024
Thematic Modules: Physics

3DGAN

Generative Adversarial Networks

Description

A generative adversarial network approach that generates High Energy Physics (HEP) calorimeter output.

Calorimeters are special HEP detectors that record particles by measuring the energy they deposit.

Particle detector simulations are fundamental for interpreting the results of HEP experiments: they allow scientists to design detectors and perform physics analyses. Simulations are traditionally based on Monte Carlo (MC) techniques, which rely on repeated random sampling and are inherently slow and very resource-intensive from a computing perspective. 3DGAN is a fast alternative to MC, with remarkable results in terms of speed-up. 3DGAN was the first effort in which the detector output was generated using three-dimensional convolutions, an approach that retains correlations in all three spatial dimensions.

Release Notes

The fast particle detector simulation with GAN thematic module consists of two inseparable components, as already discussed in D7.2 and D7.4: the machine learning framework and the particle simulation framework. An implementation of the 3DGAN approach has been developed, and a more detailed description follows. The 3DGAN component will be integrated into the particle simulation application. The code is available on GitHub and has been tested and run on a single Linux node using GPU infrastructure.

3DGAN is trained to produce images similar to those produced by Monte Carlo simulations. As calorimeter detectors consist of layers of cells, the cells are modelled as monochromatic pixelated images in which the cell energy depositions are the pixel intensities. 3DGAN consists of two networks, a generator and a discriminator. The two networks compete with each other, each trying to optimise a loss function, until they reach the convergence point, where the discriminator is no longer able to distinguish the images produced by the generator from the real images. Each network is built from 3-dimensional convolution layers that represent the three spatial dimensions of the calorimeter images.
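The competition between the two networks can be illustrated with a minimal sketch. It assumes the standard binary cross-entropy GAN objective, which this page does not confirm as the exact 3DGAN loss; the function names are hypothetical and plain Python lists stand in for batches of discriminator outputs.

```python
import math


def bce(prediction: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    p = min(max(prediction, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real: list, d_fake: list) -> float:
    """The discriminator wants real images scored 1 and generated images 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: list) -> float:
    """The generator wants its images to be scored as real (target 1)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# At the convergence point the discriminator cannot tell the two apart and
# outputs 0.5 for every image, real or generated.
converged = discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = discriminator_loss([0.9, 0.9], [0.1, 0.1])
```

Under this assumed objective, a discriminator that outputs 0.5 everywhere has a loss of 2 ln 2, which is higher than the loss of a discriminator that still separates the two samples.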

The generator network implements stochasticity through a latent vector drawn from a Gaussian distribution. The generator input includes the primary particle’s initial energy and the angle at which it entered the detector, concatenated to the latent vector. The generator network then maps the input to a layer of linear neurons followed by 3D convolutional layers. The discriminator takes an image as input and consists only of 3D convolutional layers. Batch normalisation is performed between the layers; the LeakyReLU[1] activation function is used for the discriminator layers, while the ReLU[1] activation function is used for the generator layers. The model’s loss function is the weighted sum of individual losses concerning the discriminator outputs and domain-related constraints, which are essential to achieve high-level agreement over the very large dynamic range of the image pixel intensity distribution in a HEP task. The training of this model’s legacy versions was inspired by the concept of transfer learning: the 3DGAN was first trained on images in a limited energy range and, after the GAN converged, the same model was trained further on data from the whole available energy range. The model reported here is trained on images from a limited energy range of 100-200 GeV.
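As a rough illustration of the two ideas above, the sketch below builds a generator input by concatenating a Gaussian latent vector with the energy and angle, and combines individual losses into a weighted sum. The latent size, the loss names and the weights are hypothetical placeholders, not values taken from the 3DGAN code.

```python
import random

LATENT_SIZE = 64  # hypothetical; the real latent dimension is not given here


def make_generator_input(energy_gev: float, angle_rad: float,
                         rng: random.Random,
                         latent_size: int = LATENT_SIZE) -> list:
    """Concatenate a Gaussian latent vector with the conditioning values."""
    latent = [rng.gauss(0.0, 1.0) for _ in range(latent_size)]
    return latent + [energy_gev, angle_rad]


def weighted_total_loss(losses: dict, weights: dict) -> float:
    """Weighted sum of the individual loss terms, matched by name."""
    return sum(weights[name] * value for name, value in losses.items())


rng = random.Random(0)  # fixed seed so the example is reproducible
generator_input = make_generator_input(150.0, 1.2, rng)

# Hypothetical loss terms: one from the discriminator output and one
# domain-related constraint (for example, agreement on the total energy).
total = weighted_total_loss(
    {"adversarial": 0.7, "energy_constraint": 0.2},
    {"adversarial": 1.0, "energy_constraint": 0.5},
)
```

Keeping the weights in a separate mapping makes it easy to rebalance the domain-related constraints against the adversarial term without touching the loss code.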

Currently, the 3DGAN training workflow consists of several processes: data pre-processing, model definition, and training. The validation and hyperparameter optimization processes are under research.

The public dataset used for studying and developing the 3DGAN model (Khattak, G.R., Vallecorsa, S., Carminati, F. et al. Fast simulation of a high granularity calorimeter by generative adversarial networks. Eur. Phys. J. C 82, 386 (2022). DOI: https://doi.org/10.1140/epjc/s10052-022-10258-4 ) consists of calorimeter 3D images/arrays of energy depositions with shape 51x51x25, which represent the particle showers. These images were created from simulations performed with the Geant4 software. The output of the Geant4 simulation is a set of ROOT[2] files, which need to be converted into the ML-friendly HDF5 format in order to train the model.

The pre-processing step prepares (cleans, scales, etc.) the simulated data created by Geant4 (ROOT format) and converts them into a suitable format (HDF5). It also encodes the input information, such as the calorimeter’s geometry identifier, the energy of the primary particle initiating the shower, the angle at which the particle enters the detector, and its type and/or initial position. The pre-processed data are then passed to the GAN model (currently developed using TensorFlow v1 and v2[3], as well as PyTorch Lightning[4]) for training. Hyperparameter optimization (HPO) tools (e.g. AutoML[5], Optuna[6]) will be used to search for the best set of model hyperparameters. The validation process will verify the performance through a set of physics-motivated steps, both at the single-image quality level and at the sample level. Finally, the model will be converted into ONNX[7] format and used for inference within the Geant4 application.
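A minimal sketch of what such a pre-processing step might look like is given below, using NumPy arrays that are assumed to be already read from the ROOT files. The function name, the cleaning rules and the scaling constant are hypothetical illustrations rather than the module's actual procedure; only the 100-200 GeV range comes from the text. Writing the result to HDF5 is left out.

```python
import numpy as np


def preprocess(images, energies_gev, angles_rad,
               e_min=100.0, e_max=200.0, energy_scale=100.0):
    """Clean, select and scale simulated showers (illustrative only).

    images:       array of shape (n_events, 51, 51, 25)
    energies_gev: primary particle energy per event
    angles_rad:   incident angle per event
    """
    images = np.asarray(images, dtype=np.float32)
    energies = np.asarray(energies_gev, dtype=np.float32)
    angles = np.asarray(angles_rad, dtype=np.float32)

    # Cleaning (assumed rule): drop events with NaN or infinite cells.
    finite = np.isfinite(images).all(axis=(1, 2, 3))
    # Selection: keep only events inside the training energy range.
    in_range = (energies >= e_min) & (energies <= e_max)
    keep = finite & in_range

    return {
        # Assumed rule: negative energy deposits are clipped to zero.
        "images": np.clip(images[keep], 0.0, None),
        # Scaling (assumed constant): bring energies to order one.
        "energy": energies[keep] / energy_scale,
        "angle": angles[keep],
    }


# Three toy events: one valid, one outside the range, one corrupted.
toy_images = np.zeros((3, 51, 51, 25), dtype=np.float32)
toy_images[0, 1, 1, 1] = -1.0
toy_images[2, 0, 0, 0] = np.nan
prepared = preprocess(toy_images, [150.0, 250.0, 120.0], [1.0, 1.1, 1.2])
```

The returned dictionary keeps the images and their conditioning values aligned by event, which is the property the training step relies on.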

During pre-processing, the simulation inputs are defined and encoded, i.e. the detector geometry and the energy and angle of the incoming particle. The performance of the model will be evaluated during validation through histograms describing particle shower observables. Shower observables include, among others, the total energy distribution (sum of all cell energy deposits), the cell energy deposit distribution, the longitudinal profile, which represents the energy deposited by a shower as a function of the depth of the calorimeter, and the lateral profile, which represents the energy density distribution as a function of the radius of the calorimeter. Moreover, the physics-based validation process will include accuracy verification of the first moments of those key distributions and a precise evaluation of the tails of the distributions, which usually requires larger numbers of samples. The original Geant4 data and the 3DGAN data distributions will be compared during this evaluation. At inference time, a secondary validation will be performed by the Geant4 application to ensure that the fast simulation is accurate after mapping the inferred energies to positions in the calorimeter.
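The observables listed above can be computed from a single 51x51x25 image roughly as follows. This is a sketch under stated assumptions: the last axis is taken as the calorimeter depth, the shower axis is assumed to pass through the centre of the 51x51 face, and the lateral profile is binned in integer cell radii as summed energy per bin rather than as a density. The function name is hypothetical.

```python
import numpy as np


def shower_observables(image):
    """Return total energy, longitudinal profile and lateral profile."""
    image = np.asarray(image, dtype=np.float64)
    nx, ny, _ = image.shape

    # Total energy: sum of all cell energy deposits.
    total_energy = image.sum()

    # Longitudinal profile: energy per depth layer (last axis assumed depth).
    longitudinal = image.sum(axis=(0, 1))

    # Lateral profile: energy per radial bin around the assumed shower axis.
    x = np.arange(nx) - (nx - 1) / 2.0
    y = np.arange(ny) - (ny - 1) / 2.0
    radius = np.sqrt(x[:, None] ** 2 + y[None, :] ** 2)
    radial_bin = np.floor(radius).astype(int)
    transverse = image.sum(axis=2)  # collapse the depth axis
    lateral = np.bincount(radial_bin.ravel(), weights=transverse.ravel())

    return total_energy, longitudinal, lateral


# Toy shower: two deposits in depth layer 3, one on the axis and one
# three cells away from it.
toy = np.zeros((51, 51, 25))
toy[25, 25, 3] = 2.0
toy[25, 28, 3] = 1.0
total_energy, longitudinal, lateral = shower_observables(toy)
```

Filling histograms of these quantities for a Geant4 sample and a 3DGAN sample, and comparing them bin by bin, is one way to carry out the comparison described above.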

Concerning the particle simulation framework component of our thematic module, testbeds have been developed that incorporate ML models other than the 3DGAN. Our future efforts will therefore focus on integrating the 3DGAN model into the simulation framework that uses the Geant4 environment. An example of using ML techniques for fast detector simulation, and of incorporating inference libraries into Geant4, is the Par04 example developed by the Geant4 community, which can be found on CERN GitLab[8]. The ML model used in this example is a Variational Autoencoder (VAE), trained externally in Python on full Geant4 detector simulation response data.

[1] Rectifier (neural networks): https://en.wikipedia.org/wiki/Rectifier_(neural_networks)

[2] ROOT: https://root.cern/

[3] Tensorflow: https://www.tensorflow.org/

[4] PyTorch Lightning: https://lightning.ai/

[5] AutoML: https://www.automl.org/automl/

[6] Optuna: https://optuna.org/

[7] ONNX: https://onnx.ai/

[8] Par04 example: https://gitlab.cern.ch/geant4/geant4/-/tree/master/examples/extended/parameterisations/Par04

Future Plans

The development of the thematic module will continue, focusing on aspects such as parallel model training, hyperparameter optimization and model validation. Existing solutions for parallel training and hyperparameter optimization will be studied, leading to the selection and implementation of the best solution for the specific use case. In collaboration with the HEP community, different validation techniques will be studied with the goal of identifying the technique best aligned with the needs of the specific fast detector simulation use case.

Moreover, we will continue integrating the 3DGAN module with the other DTE modules, such as the AI workflow framework, the federated computing framework, and the workflow execution framework.

Target Audience
  • DT operators/developers
  • Physics experts
  • Data Scientists
Documentation
License

MIT

Created by