Fluid is an open-source project that provides an efficient and convenient data abstraction for data-intensive tasks in cloud-native environments, addressing the problems that arise when storage and compute are separated.
- In an architecture that separates storage from compute, data-intensive tasks suffer from reduced computing efficiency and place heavy load on the underlying storage system.
- Fluid provides data-affinity scheduling, acceleration through a distributed cache engine, and data-lake-style integration of multiple data sources.
- Fluid's data scheduling accelerates a large number of big data and AI workloads in Alibaba Cloud and Tencent Cloud.
- Fluid's architecture includes two custom resources, a Dataset and a Runtime, and two major components, a controller manager and a scheduler.
- Fluid's Dataset provides a unified interface for accessing data from on-premises data centers (IDC) and the cloud, and can accelerate data access through a distributed cache.
- Fluid's scheduler places jobs on cache nodes where possible and notifies the runtime to prefetch data to a specified node.
- Fluid's demo shows how to use Fluid to accelerate a machine learning training job and demonstrates automatic scaling of the distributed cache.
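
The scheduling idea in the list above (prefer nodes that already hold the data, otherwise ask for a prefetch) can be illustrated with a small Python sketch. This is not Fluid's code or API: the `Node` class, the `pick_node` function, and the node and dataset names are all hypothetical, and the scoring rule (cache hit first, then free CPU) is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node model; not a Fluid or Kubernetes type.
    name: str
    free_cpu: int
    cached: set = field(default_factory=set)  # names of datasets cached here


def pick_node(nodes, dataset, cpu_needed):
    """Return (node, needs_prefetch) for a job that reads `dataset`."""
    # Keep only nodes with enough free CPU for the job.
    feasible = [n for n in nodes if n.free_cpu >= cpu_needed]
    if not feasible:
        return None, False
    # Prefer nodes that already cache the dataset; break ties by free CPU.
    best = max(feasible, key=lambda n: (dataset in n.cached, n.free_cpu))
    # If the chosen node has no cached copy, the caller should request a prefetch.
    return best, dataset not in best.cached


nodes = [
    Node("node-a", free_cpu=8),
    Node("node-b", free_cpu=4, cached={"imagenet"}),
    Node("node-c", free_cpu=1, cached={"imagenet"}),
]

node, prefetch = pick_node(nodes, "imagenet", cpu_needed=2)
print(node.name, prefetch)  # node-b False
```

Here `node-b` is chosen over the larger `node-a` because it already caches the dataset, and `node-c` is excluded for lack of CPU. For a dataset that no node caches, the same function falls back to the node with the most free CPU and returns `needs_prefetch=True`, which corresponds to the step where the scheduler notifies the runtime to prefetch data.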