Authors: Yang Che, Yuandong Xie
2021-10-14

tldr - powered by Generative AI

Fluid is an open-source project that provides an efficient and convenient data abstraction for data-intensive cloud-native workloads, addressing problems that arise when storage and compute are separated.
  • In an architecture that separates storage from compute, data-intensive tasks suffer reduced computing efficiency and put heavy load on the underlying storage system.
  • Fluid provides data-affinity scheduling, acceleration through a distributed cache engine, and integration of multiple data sources into a data lake.
  • Fluid's data scheduling accelerates a large number of big data and AI workloads in Alibaba Cloud and Tencent Cloud.
  • Fluid's architecture includes two custom resources, a Dataset and a Runtime, and two major components, a controller manager and a scheduler.
  • Fluid's Dataset provides a unified interface for accessing data from on-premises data centers (IDC) and the cloud, and can accelerate data access through a distributed cache.
  • Fluid's scheduler schedules jobs onto cache nodes and notifies the Runtime to prefetch data to a specified node.
  • The demo shows how to use Fluid to accelerate a machine learning training job and how the distributed cache can be scaled out automatically.
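
The scheduling idea in the summary (place a job on a node that already caches its data, otherwise pick a node and prefetch) can be illustrated with a small sketch. This is not Fluid's scheduler or API: the `Node` class, the `pick_node` function, and the node and dataset names are all hypothetical, and the tie-break on free CPUs is an assumption made only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of a cluster node (not a Fluid or Kubernetes type)."""
    name: str
    cached_datasets: set = field(default_factory=set)
    free_cpus: int = 0


def pick_node(nodes, dataset, cpus_needed):
    """Return (node, needs_prefetch) for a job that reads `dataset`."""
    # Only nodes with enough free CPU are candidates.
    candidates = [n for n in nodes if n.free_cpus >= cpus_needed]
    if not candidates:
        return None, False
    # Prefer nodes that already cache the dataset; break ties by free CPU.
    best = max(
        candidates,
        key=lambda n: (dataset in n.cached_datasets, n.free_cpus),
    )
    # If the chosen node has no cached copy, the cache should be warmed first.
    return best, dataset not in best.cached_datasets


nodes = [
    Node("node-a", {"imagenet"}, free_cpus=4),
    Node("node-b", set(), free_cpus=16),
    Node("node-c", {"imagenet"}, free_cpus=8),
]
chosen, needs_prefetch = pick_node(nodes, "imagenet", cpus_needed=4)
print(chosen.name, needs_prefetch)
```

Here `node-b` has the most free CPU but no cached copy, so the sketch picks a node that already holds the data. When no candidate holds the data, `needs_prefetch` is `True`, which corresponds to the step where the scheduler asks the Runtime to prefetch data to the selected node.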