
Panel Discussion: How Open Source Helps DataOps

2022-06-22

Authors: Cheranellore Vasudevan, Mandy Chessell, David Radley, Dan Wolfson


Summary

The presentation discusses the importance of open source tools and integration in data operations (DataOps) and how they can promote democratization of data while ensuring security. The focus is on the Egeria and OpenLineage projects as examples of open source tools that can be used to achieve this goal.
  • Open source tools and integration are crucial in promoting democratization of data while ensuring security in DataOps.
  • Egeria and OpenLineage are examples of open source tools that can be used to achieve this goal.
  • Egeria operates in a peer-to-peer way, allowing each silo to invest in its own tools and choose what it shares and what it keeps secure.
  • The ease of validation and familiarity of open source tools can help build trust in a particular solution across different parts of the organization.
  • Joining the open source community and contributing to the projects can help promote integration and collaboration in DataOps.
The speaker mentions that Egeria's peer-to-peer approach, in which each silo decides what to share and what to keep secure, has been successful in working with different companies in different industries with varying security requirements.

Abstract

DataOps is a set of practices that aims to deliver trusted and business-ready data to accelerate the journey to build AI-powered applications. The DataOps Committee in LF AI & Data is a global group that consists of participants from various geographies, focusing on the following:
  • Identify projects and tools in the DataOps space and expose the community to how these DataOps tools work together and where to use them in the pipeline (with pros and cons).
  • Provide exposure to industrial approaches for dataset metadata management, governance, and automation of flow.
  • Understand usage of DataOps tools and practices through industrial use cases (by domain).
  • Identify gaps in the use case implementation and discuss solutions to bridge the gap.
  • Provide exposure to tools and technologies that can help control the usage of data and securely access it across the enterprise in a cloud-native platform.
  • Provide an opportunity for committee members to perform research in the DataOps space.
  • Educate the community about new developments in the DataOps space.
The panelists are available to answer questions on this mission and on how open source projects, including Egeria and OpenLineage, can support these practices.

Materials: