Authors: Erik Jacobs
2022-10-28

tldr - powered by Generative AI

The presentation discusses the use of Kubernetes for running HPC workloads, specifically using OpenFOAM as an example. The speaker emphasizes the importance of tuning and optimizing the instance types and pods used for the job. They also mention potential future developments, such as using Nvidia GPUs and exploring new schedulers.
  • Kubernetes can be used for running HPC workloads, but tuning and optimization are crucial
  • OpenFOAM was used as an example of an MPI job that can be run on Kubernetes
  • Future developments include using Nvidia GPUs and exploring new schedulers
Authors: Ricardo Rocha
2022-05-19

tldr - powered by Generative AI

The presentation discusses the challenges of implementing cloud native and high performance computing (HPC) and how recent work is bridging the gap between the two.
  • Cloud native and Kubernetes have become popular in modern IT deployments, but challenges remain in areas where HPC can have a larger impact.
  • HPC involves aggregating computing power to deliver higher performance for solving large problems in science, engineering, and business.
  • HPC deployments require low latency, high throughput, and NUMA awareness, which are not common in most deployments.
  • Advanced scheduling is also important for HPC deployments with millions of jobs and users with different software needs.
  • The speaker shares an anecdote about CERN's experience with transitioning to Kubernetes for their HPC needs.
  • High throughput computing is a similar paradigm to HPC, but focuses on the efficient execution of a large number of loosely coupled tasks.
  • The speaker highlights the similarities between high throughput computing and cloud native systems.
Authors: Abdullah Gharaibeh, Aldo Culquicondor, Alex Wang
2022-05-19

The Kubernetes Working Group Batch was formed at the beginning of 2022. The Working Group aims to be a forum to discuss and propose enhancements to support for batch workloads (e.g., HPC, AI/ML, data analytics, CI) in core Kubernetes. We want to unify the way users deploy batch workloads to improve portability and to simplify supportability for Kubernetes providers. In this session, you will learn about the WG goals and roadmap, as well as the early efforts performed by our contributors.
Authors: Claudia Misale, Daniel Milroy
2022-05-18

tldr - powered by Generative AI

The presentation discusses the potential benefits of converged computing, which combines cloud and high-performance computing (HPC) technologies, and the challenges in achieving fully featured HPC scheduling in Kubernetes.
  • Converged computing combines cloud and HPC technologies to enhance application performance, scalability, flexibility, and automation.
  • Fully featured HPC scheduling in Kubernetes has not yet been achieved, and there are challenges in co-scheduling, throughput, job communication and coordination, portability, and resource heterogeneity.
  • The Flux framework is an open-source project that solves the five key technical problems of converged computing.
  • Cloud computing is becoming a dominant market force, and HPC needs to integrate research and development in software and hardware to avoid becoming isolated.
  • LLNL is seeing demand for cloud technologies within HPC workflows, and there is potential to unite the two communities in a converged computing environment.
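
To make the co-scheduling challenge named above concrete, here is a minimal Python sketch of all-or-nothing ("gang") placement: an MPI-style job either gets every task placed at once or gets nothing. This is an illustration only, not how Flux or the Kubernetes scheduler is implemented; the function name, the node names, and the largest-node-first heuristic are all assumptions made for the example.

```python
def gang_schedule(free_slots, tasks_needed):
    """Place all tasks of one job at once, or refuse the job entirely.

    free_slots: dict mapping a (hypothetical) node name to its free task slots.
    Returns a dict {node: tasks_placed}, or None if the whole job cannot fit.
    """
    # Refuse partial placement: a tightly coupled job cannot start with
    # only some of its ranks running.
    if sum(free_slots.values()) < tasks_needed:
        return None

    placement = {}
    remaining = tasks_needed
    # Assumed heuristic: fill the emptiest nodes first to use fewer nodes.
    for node, free in sorted(free_slots.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement[node] = take
            remaining -= take
    return placement
```

A scheduler that places pods one at a time can leave a job half-started and holding resources while it waits for the rest; checking the total before placing anything is the simplest way to avoid that state.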
Authors: Trey Dockendorf
2022-05-17

tldr - powered by Generative AI

The presentation discusses the use of Kubernetes for interactive HPC jobs and the implementation of Kyverno for secure multi-user access.
  • Ohio Supercomputer Center uses Open OnDemand and Kubernetes for virtual classrooms running RStudio Server and Jupyter
  • Challenges include shared file system access and ensuring user processes run with correct uid and gid
  • Design patterns include user pods in namespaces with user prefix and access control roles
  • Kyverno policies ensure uid and gid match user's LDAP record, restrict host path access, disallow privilege escalation, and enforce max resource requests and runtime
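
The policy checks listed above can be sketched in plain Python to show the kind of validation involved. This is not Kyverno policy syntax and not the Ohio Supercomputer Center's configuration: the class names, the allowed path prefixes, and the resource limits are hypothetical values chosen for illustration, and the runtime limit is left out for brevity.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """Hypothetical stand-in for the uid/gid stored in a user's LDAP record."""
    uid: int
    gid: int


@dataclass
class PodRequest:
    """Hypothetical, simplified view of a pod submitted on behalf of a user."""
    run_as_user: int
    run_as_group: int
    allow_privilege_escalation: bool
    host_paths: list = field(default_factory=list)
    cpu_request: float = 1.0        # cores
    memory_request_gib: float = 1.0


# Assumed values, for illustration only.
ALLOWED_HOST_PATH_PREFIXES = ("/users/", "/fs/project/")
MAX_CPU = 4.0
MAX_MEMORY_GIB = 16.0


def validate(pod, user):
    """Return a list of violation messages; an empty list means the pod is admitted."""
    violations = []
    if pod.run_as_user != user.uid or pod.run_as_group != user.gid:
        violations.append("uid/gid does not match directory record")
    if pod.allow_privilege_escalation:
        violations.append("privilege escalation is not allowed")
    for path in pod.host_paths:
        # Normalize first so "/users/../etc" cannot slip past the prefix check.
        if not posixpath.normpath(path).startswith(ALLOWED_HOST_PATH_PREFIXES):
            violations.append(f"host path not permitted: {path}")
    if pod.cpu_request > MAX_CPU or pod.memory_request_gib > MAX_MEMORY_GIB:
        violations.append("resource request exceeds maximum")
    return violations
```

In a real cluster these rules would live in an admission controller, so a non-compliant pod is rejected before it is ever scheduled.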