The presentation discusses the use of Kubernetes in high energy physics data analysis, specifically for batch processing and interactive analysis facilities.
- Kubernetes is used for batch processing in high energy physics data analysis, allowing for scaling up to hundreds of thousands of cores with minimal failure rates.
- Kubernetes also enables the use of heterogeneous architectures, such as ARM and GPU resources, for data analysis.
- Interactive analysis facilities built on Jupyter and Dask also run on Kubernetes, which lets them scale resources dynamically.
- The presentation includes anecdotes of successful Kubernetes use: simulating events on ARM resources and scaling up Dask clusters for faster data analysis.
One example of successful use of Kubernetes in high energy physics data analysis is the simulation of events on ARM resources. Many sites were interested in purchasing ARM resources, but no one wanted to be the first to do so. To address this, the team set up an EKS cluster with Graviton2 nodes and used multi-arch Docker images, which provide a different variant of the image depending on the architecture of the client pulling it. With this setup the team generated the first 10,000 events ever simulated on ARM and compared them with events simulated on x86 to check that the results agreed.
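
The architecture-based selection behind a multi-arch image can be illustrated with a minimal Python sketch. It is not the team's code. It only mimics how a client picks one entry from an image index (manifest list) by matching its own architecture. The index layout follows the general shape of an OCI image index, while `EXAMPLE_INDEX`, its digests, and the helper names `normalize_arch` and `select_manifest` are hypothetical and exist purely for illustration.

```python
import platform

# Machine names as reported by platform.machine(), mapped to the
# architecture names used in container image indexes.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Translate a machine name such as 'aarch64' into 'arm64'."""
    key = machine.lower()
    if key not in _ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {machine!r}")
    return _ARCH_ALIASES[key]


def select_manifest(index: dict, machine: str, os_name: str = "linux") -> str:
    """Return the digest of the index entry matching the client's platform."""
    arch = normalize_arch(machine)
    for entry in index["manifests"]:
        plat = entry["platform"]
        if plat["architecture"] == arch and plat["os"] == os_name:
            return entry["digest"]
    raise LookupError(f"no image variant for {os_name}/{arch}")


# Hypothetical image index: one tag, two architecture-specific variants.
# The digests are placeholders, not real image digests.
EXAMPLE_INDEX = {
    "schemaVersion": 2,
    "manifests": [
        {
            "digest": "sha256:hypothetical-amd64",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            "digest": "sha256:hypothetical-arm64",
            "platform": {"architecture": "arm64", "os": "linux"},
        },
    ],
}

if __name__ == "__main__":
    # Resolve the variant for the machine running this script.
    print(select_manifest(EXAMPLE_INDEX, platform.machine()))
```

In this sketch a Graviton2 node, which reports `aarch64`, resolves to the arm64 entry and an x86 node resolves to the amd64 entry, so the same image tag can be used on both kinds of node.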