Data Loss Prevention Deidentification with ABAP SDK for Google Cloud


What is the Data Loss Prevention API?

The Data Loss Prevention (DLP) API provided by Google Cloud Platform helps you protect sensitive data. Three related operations are relevant when working with it:

  • Redaction: Redaction is the process of obscuring or removing sensitive data from text, images, or other types of content. When you redact sensitive data, you are making it difficult or impossible for someone to identify the individuals in the data. The DLP API supports several redaction techniques, including masking, tokenization, and encryption.
  • Deidentification: Deidentification is the process of modifying sensitive data so that it is no longer personally identifiable. This can be done by replacing names, dates, or other identifying information with pseudonyms or generalizations. Deidentification makes it difficult to link the data to specific individuals, but it does not guarantee that the data is anonymous. The DLP API supports several deidentification techniques, including pseudonymization, generalization, and suppression.
  • Anonymization: Anonymization is the process of transforming sensitive data so that it is no longer possible to identify the individuals in the data. This can be done by using techniques such as k-anonymity, differential privacy, or homomorphic encryption. Anonymization is the strongest form of privacy protection, but it can also be the most difficult to achieve. The DLP API does not directly support anonymization, but you can use it to implement your own anonymization logic.
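To make the difference between these techniques more tangible, here is a minimal local sketch in plain Python. It does not call the DLP API; the function names (`mask`, `pseudonymize`, `generalize_birth_date`) and the demo key are assumptions made up for illustration only.

```python
import hashlib
import hmac


def mask(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Redaction by masking: hide all but the last `visible` characters."""
    if visible <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


def pseudonymize(value: str, key: bytes) -> str:
    """Pseudonymization: a keyed hash, so equal inputs map to equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


def generalize_birth_date(iso_date: str) -> str:
    """Generalization: keep only the decade of an ISO date (YYYY-MM-DD)."""
    year = int(iso_date[:4])
    return f"{year // 10 * 10}s"


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"  # hypothetical key
    print(mask("4111111111111111"))             # ************1111
    print(pseudonymize("foobar@example.com", demo_key))
    print(generalize_birth_date("1987-06-15"))  # 1980s
```

Masking destroys most of the value, the keyed hash keeps records linkable without exposing the original, and generalization trades precision for privacy. None of the three guarantees anonymity on its own.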

In the context of the DLP API, the following table summarizes the key differences between redaction, deidentification, and anonymization:

ameyasapcloud_1-1698348177877.png

The choice of which operation to use depends on the specific business requirements. If you need to protect sensitive data from unauthorized access, redaction may be sufficient. However, if you need to comply with privacy regulations, such as GDPR or HIPAA, you may need to deidentify or anonymize the data.
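One way to judge whether deidentified data is close to anonymous is k-anonymity, mentioned earlier. The following is a rough sketch of such a check in plain Python, assuming records are dictionaries. The column names and sample rows are invented for illustration, and it is not a DLP API call.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for all quasi-identifier columns (0 if no records)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already generalized sample rows.
records = [
    {"age_band": "30-39", "zip3": "941", "diagnosis": "A"},
    {"age_band": "30-39", "zip3": "941", "diagnosis": "B"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "A"},
]

print(k_anonymity(records, ["age_band", "zip3"]))  # 1
```

A result of 1 means at least one person is unique on the chosen columns, so further generalization or suppression would be needed before treating the data as anonymous.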

The DLP API provides a variety of features to help you choose the right operation for your needs. You can use the DLP API to:

  • Identify sensitive data: The DLP API can help you identify sensitive data in your datasets. The API includes a library of built-in detectors for common types of sensitive data, such as credit card numbers, Social Security numbers, and personally identifiable information (PII). You can also create your own custom detectors for specific types of sensitive data.
  • Classify sensitive data: Once you have identified sensitive data, you can classify it according to its sensitivity level. This will help you determine the appropriate privacy protection measures to apply.
  • Protect sensitive data: The DLP API provides a variety of techniques for protecting sensitive data, including redaction, deidentification, and anonymization. You can choose the right technique for your needs based on the sensitivity level of the data and the privacy requirements of your application.
  • Monitor and audit sensitive data: The DLP API can help you monitor and audit sensitive data. This will help you ensure that the data is being protected correctly.
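The identification step can be pictured with a small local sketch. It mimics the idea of a built-in detector plus a custom one using regular expressions. The `Finding` class, the `CUSTOMER_ID` pattern, and the simplified email pattern are assumptions for illustration and are far less thorough than the real DLP detectors.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    info_type: str  # which detector matched
    quote: str      # the matched text
    start: int      # offset of the first matched character
    end: int        # offset one past the last matched character


# Simplified patterns; CUSTOMER_ID is a hypothetical custom detector.
DETECTORS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CUSTOMER_ID": re.compile(r"\bCUST-\d{6}\b"),
}


def inspect(text, detectors):
    """Run every detector over the text and return findings ordered by offset."""
    findings = []
    for info_type, pattern in detectors.items():
        for match in pattern.finditer(text):
            findings.append(
                Finding(info_type, match.group(0), match.start(), match.end())
            )
    return sorted(findings, key=lambda finding: finding.start)


for finding in inspect("CUST-123456 wrote from foobar@example.com", DETECTORS):
    print(finding.info_type, finding.quote)
```

Returning offsets alongside the info type is what later makes classification and replacement possible without searching the text again.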

The DLP API is a powerful tool for protecting sensitive data. By using the DLP API, you can comply with privacy regulations, protect your customers’ privacy, and reduce the risk of data breaches. Please refer to the code sample for some quick references.

The quickstart below uses the ABAP SDK for Google Cloud to call the DLP API and deidentify an email ID.

The configuration steps in this quickstart assume that the SAP system is hosted on Google Cloud Platform.

To learn more about the authentication steps for an SAP system hosted outside Google Cloud Platform, please refer to the documentation “Authenticate using tokens for SAP hosted outside Google Cloud”.

Before you begin

Before you run this quickstart, make sure that you or your administrators have completed the following prerequisites:

Enable required services

  • Click Activate Cloud Shell at the top of the Google Cloud console to open Cloud Shell. We will use Cloud Shell to run all our commands.
  • Enable the Google Cloud services that the ABAP SDK needs to access (replace the string PROJECT_ID with your Google Cloud project ID):
gcloud auth login
gcloud config set project PROJECT_ID
gcloud services enable iamcredentials.googleapis.com
gcloud services enable dlp.googleapis.com

Configure client key for DLP Access

The configuration below will be used by the ABAP SDK to connect to the DLP API.

  • Go to SPRO > ABAP SDK for Google Cloud > Basic Settings > Configure Client Key and add the following new entry (replace the string PROJECT_ID with your Google Cloud project ID).

Google Cloud Key Name: DEMO_DLP

Google Cloud Service Account Name: abap-sdk-qs@PROJECT_ID.iam.gserviceaccount.com

Google Cloud Scope: https://www.googleapis.com/auth/cloud-platform

Google Cloud Project Identifier: PROJECT_ID

Authorization Class: /GOOG/CL_AUTH_GOOGLE

Note: Leave the other fields blank.

  • Validate the configuration ‘DEMO_DLP’ using SPRO > ABAP SDK for Google Cloud > Utilities > Validate Authentication Configuration.
ameyasapcloud_3-1698348315539.png

Create a program for an example Deidentification scenario

  • Create a program in SE38 and paste the code below (GitHub repo), which deidentifies the email ID in the text by replacing it with a generic string.
  • Note: The client key used in the program below is DEMO_DLP, which the SDK uses to connect to the API.
REPORT zr_qs_dlp_deidentify.

" data declarations
DATA:
  lv_p_projects_id   TYPE string,
  ls_input           TYPE /goog/cl_dlp_v2=>ty_055,
  ls_transformations TYPE /goog/cl_dlp_v2=>ty_100.

TRY.
    " instantiate api client stub with the client key DEMO_DLP
    DATA(lo_dlp) = NEW /goog/cl_dlp_v2( iv_key_name = 'DEMO_DLP' ).

    " use the project id held by the client stub
    lv_p_projects_id = lo_dlp->gv_project_id.

    " inspect the text for email addresses
    INSERT VALUE #( name = 'EMAIL_ADDRESS' ) INTO TABLE ls_input-inspect_config-info_types.

    " replace each finding with a generic string
    ls_transformations-primitive_transformation-replace_config-new_value-string_value = '[EMAIL_ID]'.
    INSERT ls_transformations INTO TABLE ls_input-deidentify_config-info_type_transformations-transformations.

    " pass the sample text for deidentification
    ls_input-item-value = 'The Email ID of Mr. Foo is foobar@example.com'.

    " call the api method to deidentify
    CALL METHOD lo_dlp->deidentify_content
      EXPORTING
        iv_p_projects_id = lv_p_projects_id
        is_input         = ls_input
      IMPORTING
        es_output        = DATA(ls_output)
        ev_ret_code      = DATA(lv_ret_code)
        ev_err_text      = DATA(lv_err_text)
        es_err_resp      = DATA(ls_err_resp).

    IF lo_dlp->is_success( lv_ret_code ).
      WRITE: / 'Deidentification Successful'.
      WRITE: / 'The replaced text is: ', ls_output-item-value.
    ELSE.
      MESSAGE lv_err_text TYPE 'E'.
    ENDIF.

    " close the http connection
    lo_dlp->close( ).

  CATCH /goog/cx_sdk INTO DATA(lo_exception).
    " write code here to handle exceptions
    MESSAGE lo_exception->get_text( ) TYPE 'E'.
ENDTRY.

Sales Order Header Text Example

Let's see DLP in action! Below is an example where the DLP API is used to deidentify personally identifiable information (PII) that a user has entered in the Sales Order Header Text.
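As a rough sketch of what such a request carries, the Python snippet below builds the JSON body for the REST method `projects.content.deidentify` (POST to `https://dlp.googleapis.com/v2/projects/PROJECT_ID/content:deidentify`). The ABAP input structure in the quickstart appears to map to these same fields in snake_case. The helper name `build_deidentify_request`, the sample header text, and the choice of info types are assumptions for illustration. The snippet only builds the body and does not send it.

```python
import json


def build_deidentify_request(text, info_types, replacement=None):
    """Build a content:deidentify request body.

    With `replacement`, every finding is replaced by that fixed string
    (replaceConfig). Without it, each finding is replaced by the name of
    its info type (replaceWithInfoTypeConfig).
    """
    if replacement is None:
        primitive = {"replaceWithInfoTypeConfig": {}}
    else:
        primitive = {"replaceConfig": {"newValue": {"stringValue": replacement}}}
    return {
        "item": {"value": text},
        "inspectConfig": {"infoTypes": [{"name": name} for name in info_types]},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [{"primitiveTransformation": primitive}]
            }
        },
    }


# Hypothetical sales order header text entered by a user.
header_text = "Deliver to John Doe, call 555-0100 or mail john.doe@example.com"
body = build_deidentify_request(
    header_text, ["PERSON_NAME", "PHONE_NUMBER", "EMAIL_ADDRESS"]
)
print(json.dumps(body, indent=2))
```

Calling the helper with `replacement="[EMAIL_ID]"` and only `EMAIL_ADDRESS` gives the body that corresponds to the email quickstart. Leaving `replacement` out keeps the kind of data visible in the deidentified text, which can be useful for free-text fields such as order headers.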

Clean Up

To clean up, disable the service so that it cannot incur any further usage.

gcloud services disable dlp.googleapis.com --force

Conclusion and Next Steps

I hope this article gave you a quick insight into using the Data Loss Prevention API with ABAP SDK for Google Cloud.

Ready to start using ABAP SDK for Google Cloud?

Bookmark What’s new with the ABAP SDK for Google Cloud for the latest announcements and follow installation and configuration instructions.

Check out these blog posts to get started with ABAP SDK for Google Cloud

  • This blog explains how you can evaluate ABAP SDK for Google Cloud using ABAP Platform Trial 1909 on Google Cloud Platform.
  • Read this blog post to get a sneak peek at how a business process such as Sales Order entry in SAP can be automated using ABAP SDK for Google Cloud.
  • This blog is an excellent start for understanding how BigQuery ML, a powerful machine learning service that lets you build and deploy models using SQL queries, can now be accessed with ABAP SDK for Google Cloud.
  • Read this blog post to understand how to use Secret Manager with ABAP SDK.
  • Also check out the blog posts about the ABAP SDK Code Wizard and Application Logging, some of the many engineering excellence features delivered as part of ABAP SDK.

Happy learning and happy innovating!

 
 
Last update: 10-26-2023 12:31 PM