Building ML Pipelines in JupyterLab Using Elyra - Without the Need to Write Code

Conference: Open AI + Data Forum 2022

2022-06-21

Authors: Patrick Titzler

Summary

The presentation covers Elyra's pipeline feature for JupyterLab, which lets users assemble machine learning workflows from Jupyter notebooks. The feature includes a visual pipeline editor, a CLI, and support for three different runtime environments. The goal is to make it easier to break large notebooks into smaller ones and to automate pipeline execution in a production environment.

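To make the idea of a pipeline built from smaller notebooks concrete, here is a minimal sketch in plain Python. It does not use Elyra or any of its APIs. It only illustrates the underlying concept: a pipeline is a set of nodes with dependencies, and a runtime has to execute them in an order that respects those dependencies. The notebook file names and the `execution_order` helper are hypothetical, chosen for illustration, and the standard-library `graphlib` module requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical notebooks obtained by splitting one large notebook.
# Each key maps to the set of notebooks that must finish before it runs.
pipeline = {
    "load_data.ipynb": set(),
    "clean_data.ipynb": {"load_data.ipynb"},
    "train_model.ipynb": {"clean_data.ipynb"},
    "evaluate_model.ipynb": {"train_model.ipynb"},
    "plot_data.ipynb": {"clean_data.ipynb"},
}


def execution_order(dependencies):
    """Return the nodes in an order that runs every dependency first."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(pipeline)
for step, notebook in enumerate(order, start=1):
    print(f"{step}. {notebook}")
```

In a visual editor, the dependency mapping above corresponds to the connections drawn between nodes on the canvas. Because `plot_data.ipynb` and `train_model.ipynb` both depend only on `clean_data.ipynb`, their relative order is not fixed, and a runtime could in principle run them in parallel.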

The speaker explains that breaking large notebooks into smaller ones matters because the smaller pieces can be reused as assets in a production environment. They also mention how hard it is to find volunteers in the open source community to help with the work, and the need to develop a thick skin when dealing with unreasonable requests from big companies.

Abstract

Whether you are just getting started in data science or are a seasoned data scientist, JupyterLab is likely a tool you use frequently to get work done. In this session we will introduce the Elyra visual editor extension for JupyterLab, which allows you to create machine learning pipelines from Jupyter notebooks and Python scripts without writing any code. We'll demonstrate how to build and run these pipelines on Kubeflow Pipelines or Apache Airflow, and outline how to take advantage of components to perform general-purpose or custom tasks.
