Authors: Chin Huang, Ted Chang
2022-06-23

tldr - powered by Generative AI

Overview of KServe with ModelMesh and a demo of model inference using online features
  • KServe is a standards-based model serving platform built on top of Kubernetes
  • ModelMesh in KServe is designed to work around Kubernetes resource limits, enabling high-density, scalable model serving
  • The ModelMesh architecture consists of serving runtime deployments, containers running the model-mesh logic, adapters that retrieve models from storage, and model servers that execute inference
  • A scalability test deployed 20k simple-string models into two serving runtime pods in a small Kubernetes cluster
  • The demo integrated the open source ModelMesh model serving layer with Feast online features for multi-region model serving in a Kubernetes cluster
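For context, deploying a model through KServe in ModelMesh mode is done with the same InferenceService resource as regular KServe, opting in via an annotation. This is a minimal sketch; the resource name and storage URI are placeholders, not from the talk:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-sklearn-model        # hypothetical name
  annotations:
    # Routes this InferenceService to ModelMesh instead of a dedicated pod,
    # so many models can share the same serving runtime pods
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://example-bucket/models/example   # placeholder path
```

Because ModelMesh packs many models into a shared pool of serving runtime pods and loads them on demand, density tests like the 20k-model run above become feasible on a small cluster.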
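The online-feature demo pattern described above can be sketched as: fetch feature values for an entity from a Feast online store, then pack them into a KServe v2 inference protocol request. The helper and feature names below are illustrative assumptions, not the talk's actual code; in practice the feature dict would come from Feast's `FeatureStore.get_online_features(...)`:

```python
from typing import Dict, List

def build_v2_request(features: Dict[str, List[float]]) -> dict:
    """Pack a dict of online feature values into a KServe v2
    inference protocol payload (one input tensor per feature)."""
    inputs = []
    for name, values in features.items():
        inputs.append({
            "name": name,
            "shape": [1, len(values)],  # single entity row
            "datatype": "FP64",
            "data": values,
        })
    return {"inputs": inputs}

# Hypothetical feature values, as Feast's online store might return them
payload = build_v2_request({
    "credit_score": [0.72],
    "txn_count_7d": [14.0],
})
```

The resulting `payload` would be POSTed to the model's v2 endpoint (e.g. `/v2/models/<name>/infer`); ModelMesh routes the request to whichever runtime pod currently holds the model.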