Enhancing Secure Collaboration with Google Drive Data Loss Prevention (DLP)

Data Engineering Jun 14, 2024

In today's digital landscape, safeguarding sensitive data while enabling seamless collaboration across organizations is essential. As a leading healthtech platform in Indonesia, Halodoc understands this challenge. Implementing Google Drive Data Loss Prevention (DLP) has been key to achieving this balance. By leveraging the capabilities of Google Drive DLP, we have enhanced our data protection measures to ensure that confidential information remains secure throughout its lifecycle. In this blog post, we walk through our strategy for implementing Google Drive DLP, which has helped us proactively reduce the risk of data breaches in our organization.

Overview of Google Drive DLP

For many organizations, Google Drive has become the go-to platform for seamless collaboration. Its intuitive interface and cloud-based accessibility allow teams to share documents, spreadsheets, and presentations in real time, fostering a dynamic work environment. However, the ease of sharing information brings an inherent risk of unintended data leaks. Documents containing sensitive information can be shared by accident, which highlights the need for stronger data security measures. This is where Google Drive's native Data Loss Prevention (DLP) capabilities step in. DLP acts as a safeguard, ensuring that sensitive data within the organization remains protected throughout its lifecycle. By leveraging these built-in features, organizations can define rules that identify and protect confidential information, minimizing accidental data leaks and unauthorized access to Google Drive data.

Why Google Drive DLP Is Important for Secure Collaboration

While Google Drive offers seamless collaboration, it can also store a wealth of sensitive information that needs to be protected. Here’s why implementing Google Drive Data Loss Prevention (DLP) is essential:

  • Automatic Labeling: Drive DLP automatically labels data based on its classification. This feature is important because it empowers users to easily identify and manage sensitive data with greater care and attention. For example, if a team member uploads a document containing sensitive information, DLP can automatically label and classify the document as confidential.
  • Prevent Accidental Leaks: Drive DLP helps safeguard sensitive data by scanning files and taking action based on predefined rules. This allows administrators to establish stricter access controls, such as preventing end users from sharing sensitive documents with external domains.
  • Compliance: Implementing Drive DLP also helps us meet our legal obligations under Indonesia's Personal Data Protection (PDP) Law for safeguarding personal information, particularly within Google Drive.

How does DLP work on Google Drive?

Google Drive DLP works through a combination of predefined rules, automatic scanning, and configurable actions. Here is a simplified breakdown of how Drive DLP works:

  • Defining DLP Rules: Administrators set up DLP rules to identify types of content considered sensitive. These can include identity card numbers, insurance card numbers, medical records, or other data specific to the organization. Rules can detect sensitive data using keywords, regular expressions, or patterns, and they can be applied to both My Drive and shared drives.
  • Automated Labeling: Based on the DLP rules, detectors identify sensitive content in Google Drive documents, and labels are applied automatically once the scan completes.
  • Incident Detection and Actions: If the scan detects content violating DLP rules, it triggers a DLP incident. This incident can be configured to take various actions based on the severity and nature of the violation. These actions include:
    - Alerts: DLP sends alerts to administrators or users who uploaded the files, notifying them of sensitive content and potential policy violations.
    - Blocking Actions: DLP can prevent the sharing or downloading of files that contain sensitive information.
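The flow above (rules, scan, incident, actions) can be sketched in a few lines of Python. This is only an illustration of the concept, not how Google implements Drive DLP: the `Rule` and `Incident` classes, the `scan_file` function, the `INS-` insurance card format, and the action names are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical DLP rule: a pattern, the label to apply, and the actions to take."""
    name: str
    pattern: str
    label: str
    actions: tuple


@dataclass
class Incident:
    """Hypothetical record created when a file violates a rule."""
    rule: str
    file_name: str
    label: str
    actions: tuple


def scan_file(file_name, text, rules):
    """Return one incident for each rule whose pattern occurs in the text."""
    incidents = []
    for rule in rules:
        if re.search(rule.pattern, text):
            incidents.append(Incident(rule.name, file_name, rule.label, rule.actions))
    return incidents


# Made-up insurance card format, used only for this illustration.
rules = [
    Rule(
        name="insurance-card",
        pattern=r"\bINS-\d{8}\b",
        label="Confidential",
        actions=("alert", "block_external_sharing"),
    )
]

incidents = scan_file("claims.txt", "Member card INS-12345678 attached.", rules)
clean = scan_file("menu.txt", "Lunch options for Friday.", rules)

for incident in incidents:
    print(incident.file_name, incident.label, incident.actions)
```

In this sketch a file with no matching content produces no incident, so no label is applied and no action is taken.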

Halodoc Strategy for Implementing Google Drive DLP

At Halodoc, we have divided our Google Drive DLP implementation into two phases to validate the effectiveness of our DLP rules and processes. This approach helps us address any issues during monitoring and ensures effective enforcement during the transition from monitoring to blocking mode activation.

  • Phase I - Monitoring Mode: We initially roll out DLP features in monitoring mode. During this phase, we assess our predefined DLP rules for identifying sensitive information and gather feedback to identify legitimate scenarios where sending sensitive information to external domains is necessary. This feedback assists us in creating whitelists for legitimate users.
  • Phase II - Blocking Mode: In the second phase, we activate blocking mode. Data labeled as "Confidential" is prohibited from being shared with external domains unless users are whitelisted. Any exception requests require written approval from the respective line manager.
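The difference between the two phases can be expressed as a small decision function. The sketch below is an assumption-laden illustration rather than our production logic: the `Mode` enum, the `decide_share` function, the `example.com` internal domain, and the `"allow"`, `"log"`, and `"block"` outcomes are hypothetical names chosen for the example.

```python
from enum import Enum


class Mode(Enum):
    MONITORING = "monitoring"  # Phase I: observe and collect feedback
    BLOCKING = "blocking"      # Phase II: enforce the restriction


INTERNAL_DOMAIN = "example.com"  # hypothetical internal domain


def is_external(recipient):
    """Treat any address outside the internal domain as external."""
    return recipient.rsplit("@", 1)[-1].lower() != INTERNAL_DOMAIN


def decide_share(mode, label, sharer, recipient, whitelist):
    """Return 'allow', 'log', or 'block' for a single sharing attempt."""
    # Only confidential files shared outside the organization are of interest.
    if label != "Confidential" or not is_external(recipient):
        return "allow"
    # Phase I: never block, only record the event for later review.
    if mode is Mode.MONITORING:
        return "log"
    # Phase II: whitelisted users may still share; everyone else is blocked.
    return "allow" if sharer in whitelist else "block"


print(decide_share(Mode.MONITORING, "Confidential", "a@example.com", "x@partner.org", set()))
print(decide_share(Mode.BLOCKING, "Confidential", "a@example.com", "x@partner.org", set()))
```

Events logged during monitoring are what reveal the legitimate external-sharing cases that later populate the whitelist.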

The technical details of the step-by-step implementation are as follows:

Access Google Workspace:

1.   Log in to your Google Workspace administrator account. Note that to create and set DLP rules in Google Drive, you must be a super administrator or a delegated admin with certain privileges. For more details, see the Google Workspace Admin Help.

Create a New Label:

2.   Find the Security section and navigate to Access and Data Control.

3.   Select Data Classification, then click Manage.

4.   Click Edit selections, then click Open Label Manager.

5.   Click New Label to create a label.

6.   For the new label, choose Badged label.

7.   Fill in the Label name, and optionally, the description. For the setting When copying files, select Always copy label. You can see the results in the Preview section. If all is correct, click Publish.

Create a Detection:

8.   Go to the Security menu section and navigate to Access and Data Control.

9.   Click Data Protection.

10.  Click Manage Detectors, then click Add Detector.

11.  Choose to create a detector using Regular Expression (Regex).

12.  Fill in the Detector Name and, optionally, the Description. Then fill in the Regular Expression (Regex) to detect the KTP (Indonesian national identity card) number. You can also test the Regex by clicking Test Expression.
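Before pasting an expression into the detector, it can help to try it locally against a few sample strings. The sketch below assumes that a KTP number is written as 16 consecutive digits; the pattern, the `find_ktp_candidates` helper, and the sample values are illustrative and are not the exact detector we configured. The Admin console's regex engine may behave differently from Python's `re` module, so the expression should still be verified with Test Expression.

```python
import re

# Assumption: a KTP number appears as exactly 16 consecutive digits.
# The word boundaries keep longer digit runs from matching.
KTP_PATTERN = re.compile(r"\b\d{16}\b")


def find_ktp_candidates(text):
    """Return every 16-digit sequence that looks like a KTP number."""
    return KTP_PATTERN.findall(text)


# Made-up sample values, not real identity numbers.
samples = [
    "NIK: 3174010101900001",       # 16 digits: should match
    "Phone 081234567890",          # 12 digits: should not match
    "Order 12345678901234567",     # 17 digits: should not match
]

for sample in samples:
    print(sample, "->", find_ktp_candidates(sample))
```

A purely length-based pattern like this one will also flag other 16-digit numbers, which is one reason to observe the rule in monitoring mode before enforcing it.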

Create a New Rule for Detection:

13.  Return to the Data Protection menu and click Manage Rules.

14.  Click Add Rule and choose New Rule to start creating a DLP rule.

15.  Fill in the Rule Name and select the scope to which this rule applies. In the Scope section, you can apply this rule to users by selecting Include Organizational Unit or Include Group.

16.  If necessary, you can also whitelist users by selecting Exclude Organizational Unit or Exclude Group. Whitelisting applies to Google Drive and shared drive email addresses; file names cannot be whitelisted. Click Continue.

17.  Select Drive files as the location in Google Drive where you want to protect the data. Then click Continue.

18.  In the Conditions section, select All content and choose Matches the Regular Expression, find your detector (KTP), and provide a value for the minimum number of detected patterns. For example, we want to detect at least 2 KTP numbers in 1 file. Then click Continue.
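The scope and condition settings from the steps above can be modelled as follows. This is a hypothetical sketch to make the logic concrete: the `DetectionRule` class, the group names, and the choice to count every match (rather than unique matches) are assumptions, and the 16-digit pattern is illustrative only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DetectionRule:
    """Hypothetical model of a DLP rule's scope and condition."""
    name: str
    pattern: str
    min_matches: int
    include_groups: set = field(default_factory=set)
    exclude_groups: set = field(default_factory=set)

    def in_scope(self, user_groups):
        """Excluded (whitelisted) groups take precedence over included ones."""
        groups = set(user_groups)
        if groups & self.exclude_groups:
            return False
        return bool(groups & self.include_groups)

    def condition_met(self, text):
        """True when the pattern occurs at least min_matches times in the text."""
        return len(re.findall(self.pattern, text)) >= self.min_matches

    def applies(self, user_groups, text):
        return self.in_scope(user_groups) and self.condition_met(text)


rule = DetectionRule(
    name="ktp-detection",
    pattern=r"\b\d{16}\b",          # illustrative pattern, see the assumption above
    min_matches=2,                  # "at least 2 KTP numbers in 1 file"
    include_groups={"all-staff"},   # hypothetical group names
    exclude_groups={"dlp-exempt"},
)

two_numbers = "3174010101900001 and 3275020202910002"
print(rule.applies(["all-staff"], two_numbers))
print(rule.applies(["all-staff"], "only 3174010101900001"))
```

A minimum of two matches is a trade-off: it reduces noise from files that mention a single number, at the cost of not flagging those files at all.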

Create Automatic Labeling:

19.  In the Actions menu, choose Apply Drive Labels and select the previously created label.

20.  For user changes, select Don’t Allow so that rules, labels, and field values are re-applied if the user changes them.

Create a New Rule to Block Sharing to External Domains:

21.  Return to the Data Protection menu and click Manage Rules.

22.  Click Add Rule and choose New Rule to start creating a DLP rule.

23.  Fill in the Rule Name and select the scope to which this rule applies. In the Scope section, you can apply this rule to users by selecting Include Organizational Unit or Include Group.

24.  If necessary, you can also whitelist users by selecting Exclude Organizational Unit or Exclude Group. Whitelisting applies to Google Drive and shared drive email addresses; file names cannot be whitelisted. Click Continue.

25.  Select Drive files as the location in Google Drive where you want to protect the data.

26.  In the Conditions menu, choose the Drive label condition and select the previously created label, so that the rule matches files carrying that label.

27.  To block sharing of confidential files with external parties, choose Block external sharing in the label section. Then, under alerting, click Add Recipients to receive email notifications.

28.  See the following example of the email notification sent to the recipients configured in the previous step.

Below is the result of the automatic labeling and detection. The label name appears on the file, as circled in red:

And this is the notification shown when a user tries to share confidential data with external parties:

Conclusion

The implementation of Google Drive Data Loss Prevention (DLP) has been instrumental for Halodoc in enhancing secure collaboration while protecting sensitive data. By leveraging Google Drive's native DLP capabilities, we have strengthened our data protection measures, ensuring that confidential information remains secure throughout its lifecycle. This implementation has allowed us to automate the detection and classification of sensitive data in Google Drive, reducing the risk of accidental exposure and unauthorized access. As a result, our teams can strike a balance between safeguarding sensitive data and enabling seamless collaboration across the organization.

Bug Bounty

Got what it takes to hack? Feel free to report a vulnerability in our assets and get yourself a reward through our bug bounty program. Find more details about policy and guidelines at https://www.halodoc.com/security

Join us

Scalability, reliability and maintainability are the three pillars that govern what we build at Halodoc Tech. We are actively looking for engineers at all levels, and if solving hard problems with challenging requirements is your forte, please reach out to us with your resumé at careers.india@halodoc.com.

About Halodoc

Halodoc is the number one all-around healthcare application in Indonesia. Our mission is to simplify and bring quality healthcare across Indonesia, from Sabang to Merauke. We connect 20,000+ doctors with patients in need through our Tele-consultation service. We partner with 3500+ pharmacies in 100+ cities to bring medicine to your doorstep. We've also partnered with Indonesia's largest lab provider to provide lab home services, and to top it off we have recently launched a premium appointment service that partners with 500+ hospitals and allows patients to book a doctor appointment inside our application. We are extremely fortunate to be trusted by our investors, such as the Bill & Melinda Gates Foundation, Singtel, UOB Ventures, Allianz, GoJek, Astra, Temasek, and many more. We recently closed our Series D round and in total have raised around USD 100+ million for our mission. Our team works tirelessly to make sure that we create the best healthcare solution personalized for all of our patients' needs, and we are continuously on a path to simplify healthcare for Indonesia.