Fugue is an open-source abstraction layer that allows users to port native Python code to Spark or Dask with minimal code changes, making data science code framework-agnostic and scale-agnostic.
- Data scientists often find themselves reimplementing the same code to transition from Pandas to Spark when data grows too large for Pandas to handle
- Fugue solves this problem by providing an abstraction layer that allows users to port native Python code to Spark or Dask with minimal code changes
- Fugue makes data science code framework-agnostic and scale-agnostic, allowing it to be ported to different execution environments
- Fugue was demonstrated by showing how to scale data compute from a single machine to a Spark cluster set-up on Kubernetes