Real-Time Data Anonymization the Serverless Way

2021-10-14

Authors: Huamin Chen, Yuval Lifshitz


Summary

Real-time data anonymization can be achieved using a Cloud Native Serverless architecture that is lightweight, reliable, and scalable. The architecture is built on KEDA and Rook: Ceph is extended to support AWS SQS-compatible APIs, and an external KEDA scaler lets Serverless functions query, pull, and anonymize objects. Because functions are triggered by pulling from a queue, no external endpoints need to be exposed to them, which avoids adding attack surface. The anonymization function reads messages from the message queue, detects personal and sensitive information such as faces and license plates, and blurs each rectangular region of interest. MicroShift is a lightweight implementation of OpenShift and Kubernetes that is optimized for edge computing use cases on small form factor devices with resource constraints, or for environments that only serve single-purpose workloads.
  • Data protection and privacy legislation around the world affects the industry and requires care when processing, exchanging, or storing sensitive personally identifiable information
  • Pseudonymization and anonymization are both used to comply with data protection laws; anonymization is preferred because it removes private personal information from the original data completely
  • Real-time data anonymization can be achieved using a Cloud Native Serverless architecture that is lightweight, reliable, and scalable
  • The architecture is built on KEDA and Rook, with Ceph extended to support AWS SQS-compatible APIs; the queue trigger mechanism does not require exposing external endpoints to Serverless functions, which makes it more secure
  • The anonymization function reads messages from the message queue, detects personal and sensitive information such as faces and license plates, and blurs each rectangular region of interest
  • MicroShift is a lightweight implementation of OpenShift and Kubernetes that is optimized for edge computing use cases on small form factor devices with resource constraints, or for environments that only serve single-purpose workloads
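The blurring step described above can be illustrated with a minimal sketch. It is not the talk's implementation: it assumes a grayscale image held as a list of rows, assumes the detector has already returned hypothetical `(x, y, width, height)` boxes, and uses a plain box blur where the demo may well use a different filter.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels (0-255), row-major
Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical detector output


def blur_region(image: Image, box: Box, radius: int = 2) -> None:
    """Box-blur the pixels inside `box`, modifying `image` in place."""
    if not image or not image[0]:
        return
    x0, y0, w, h = box
    height, width = len(image), len(image[0])
    # Clip the box to the image bounds.
    x1, y1 = min(x0 + w, width), min(y0 + h, height)
    x0, y0 = max(x0, 0), max(y0, 0)
    # Read from a snapshot so already-blurred pixels do not feed later averages.
    source = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            total = count = 0
            # The averaging window is clipped to the box itself, so pixels
            # outside the region of interest never influence the result.
            for ny in range(max(y - radius, y0), min(y + radius + 1, y1)):
                for nx in range(max(x - radius, x0), min(x + radius + 1, x1)):
                    total += source[ny][nx]
                    count += 1
            image[y][x] = total // count


def anonymize(image: Image, boxes: List[Box], radius: int = 2) -> Image:
    """Return a copy of `image` with every detected box blurred."""
    result = [row[:] for row in image]
    for box in boxes:
        blur_region(result, box, radius)
    return result
```

A larger `radius` removes more detail inside each box. Returning a copy from `anonymize` leaves the input untouched, which keeps the function easy to test; a real service would likely overwrite or discard the original instead.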
Data protection and privacy are increasingly important issues for global data controllers. Care must be taken when processing, exchanging, or storing sensitive personally identifiable information, and privacy preferences must be honored.

Abstract

How do you ensure privacy protection in the far-flung computing workloads that make up many Edge infrastructures? One way is to ensure that personal information is hidden, on the fly, without introducing lag. Seems like a tall order, but it can be done. This talk presents a Cloud Native Serverless architecture to ensure real-time data anonymization, using KEDA and Rook. Specifically, we have extended Ceph to support AWS SQS-compatible APIs and developed an external Scaler in KEDA to allow Serverless functions to query, pull, and anonymize objects. This architecture is lightweight, reliable, and scalable. More importantly, the queue trigger mechanism in this architecture does not require us to expose external endpoints to Serverless functions that could become additional attack surfaces. This talk will demo an open source Serverless workflow based on the above technologies. It uses object detection AI models to anonymize images that are produced by Edge workloads.
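The pull-based flow in the abstract (query the queue, pull the object, anonymize it) might look roughly like the sketch below. The `Queue` and `ObjectStore` interfaces, `drain_once`, and `output_bucket` are hypothetical names introduced for illustration, not the talk's code. The sketch also assumes that notification bodies follow the S3-style `Records` layout; a real deployment would back these interfaces with an SQS-compatible client and an S3-compatible client.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Protocol, Tuple


@dataclass
class Message:
    receipt_handle: str  # token used to acknowledge (delete) the message
    body: str            # JSON event notification


class Queue(Protocol):  # hypothetical wrapper around an SQS-compatible API
    def receive(self, max_messages: int) -> List[Message]: ...
    def delete(self, receipt_handle: str) -> None: ...


class ObjectStore(Protocol):  # hypothetical wrapper around an S3-compatible API
    def get(self, bucket: str, key: str) -> bytes: ...
    def put(self, bucket: str, key: str, data: bytes) -> None: ...


def object_refs(body: str) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3-style event notification."""
    event = json.loads(body)
    return [
        (record["s3"]["bucket"]["name"], record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]


def drain_once(
    queue: Queue,
    store: ObjectStore,
    anonymize: Callable[[bytes], bytes],
    output_bucket: str,
    batch_size: int = 10,
) -> int:
    """Pull one batch of messages, anonymize the referenced objects, and
    return the number of messages handled."""
    handled = 0
    for message in queue.receive(batch_size):
        for bucket, key in object_refs(message.body):
            store.put(output_bucket, key, anonymize(store.get(bucket, key)))
        # Acknowledge only after the object was written, so a crash before
        # this line leaves the message in the queue for a later retry.
        queue.delete(message.receipt_handle)
        handled += 1
    return handled
```

The function only makes outbound calls to the queue and the object store, which is why no inbound endpoint is needed. Writing results to a separate `output_bucket` is a design choice in this sketch: it keeps anonymized objects from triggering new notifications on the source bucket.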

Related work