Authors: Holden Karau
2022-05-18

tldr - powered by Generative AI

The presentation discusses the challenges of working with very large matrices and how Apache Spark, Apache Mahout, Kubeflow, and Kubernetes can be combined to address them.
  • Kubernetes allows for elastic scaling, but scaling alone does not solve the problem of fitting large matrices in memory
  • Apache Spark and Mahout can distribute matrices across an unbounded number of pods/nodes
  • Kubeflow can be used to make the process easily reproducible
  • The presentation includes an anecdote about using these tools to denoise DICOM lung images of COVID patients
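
The idea of spreading one matrix across many pods or nodes can be illustrated with block partitioning. The sketch below is plain Python and is not the Spark or Mahout API; all function names (`split_into_blocks`, `block_multiply`, `assemble`) are hypothetical and chosen for illustration. It assumes each block is small enough to fit in one worker's memory, and it runs the per-block products in a single process where a cluster would run them on separate workers.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
BlockKey = Tuple[int, int]  # (block_row, block_col)


def split_into_blocks(m: Matrix, block_size: int) -> Dict[BlockKey, Matrix]:
    """Cut a matrix into blocks of at most block_size x block_size."""
    blocks: Dict[BlockKey, Matrix] = {}
    for r in range(0, len(m), block_size):
        for c in range(0, len(m[0]), block_size):
            blocks[(r // block_size, c // block_size)] = [
                row[c:c + block_size] for row in m[r:r + block_size]
            ]
    return blocks


def multiply_block(a: Matrix, b: Matrix) -> Matrix:
    """Ordinary dense product of two small in-memory matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_block(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two blocks with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_multiply(
    a_blocks: Dict[BlockKey, Matrix], b_blocks: Dict[BlockKey, Matrix]
) -> Dict[BlockKey, Matrix]:
    """Compute C[i, j] = sum over k of A[i, k] * B[k, j], block by block.

    Each A[i, k] * B[k, j] product touches only two blocks, so in a
    distributed setting it could be computed on whichever worker holds them.
    """
    result: Dict[BlockKey, Matrix] = {}
    for (i, k), a_blk in a_blocks.items():
        for (k2, j), b_blk in b_blocks.items():
            if k != k2:
                continue  # inner block indices must match
            product = multiply_block(a_blk, b_blk)
            if (i, j) in result:
                result[(i, j)] = add_block(result[(i, j)], product)
            else:
                result[(i, j)] = product
    return result


def assemble(blocks: Dict[BlockKey, Matrix]) -> Matrix:
    """Stitch blocks back into one matrix (only sensible for small results)."""
    n_block_rows = max(i for i, _ in blocks) + 1
    n_block_cols = max(j for _, j in blocks) + 1
    out: Matrix = []
    for i in range(n_block_rows):
        for r in range(len(blocks[(i, 0)])):
            row: List[float] = []
            for j in range(n_block_cols):
                row.extend(blocks[(i, j)][r])
            out.append(row)
    return out
```

In this sketch no step needs the whole matrix at once except `assemble`, which a real pipeline would typically skip by keeping the result partitioned.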