Implementation of Privacy by Design and Technical and Organizational Security Measures: The Data Masking Solution
As the European Union works this year on a revised set of rules for its data protection framework, some of the proposed rules introduce new security requirements for companies, in particular the duty to notify the data breaches they experience to the data protection authority and even, in some cases, to their customers. We assess here whether and how data masking can be considered an effective data security measure that not only fulfills the current EU Data Protection Directive’s security requirements, but also limits the likelihood that companies, as data controllers, will have to notify their data breaches to the supervisory authority and their customers under the Proposal of General Data Protection Regulation. We also cover the concept and principles of “privacy by design” (PbD), which have been incorporated in the draft Regulation under discussion, and assess whether data masking fulfills those PbD principles.
For the last few years, the spate of security breaches that affect individuals’ personal data has caused growing alarm among businesses, government and the population in general. Corporate networks store and process a massive amount of sensitive information and can be subject to leaks, hacking or other security breaches.
The recent Ponemon Institute “Cost of a Data Breach” study found the average cost of a data breach to range from €1.39 million in Italy to €3.40 million in Germany, with an average cost per compromised account ranging from €78 in Italy to €146 in Germany. In the event of a breach, costs to your organization will quickly mount up. Your company may need to notify patients, clients and employees, or to take out insurance specifically covering data breaches, and it will most probably face an expensive effort to investigate the cause of the breach and ensure that it is brought under control. Potential litigation may follow, as well as fines from government or other regulatory bodies. There are also intangible costs to take into account: damage to your company’s brand and reputation, loss of customers or decline in value.
Unless they make security a part of the development process from beginning to end, companies will come under increasing pressure from regulators, shareholders, business partners, customers and business associates. Data masking appears to be one way to meet that objective.
What is data masking?
Wikipedia defines data masking as:
the process of obscuring specific data elements within data stores. It ensures that sensitive data is replaced with realistic but not real data. The goal is that sensitive customer information is not available outside of the authorized environment. Data masking is typically done while provisioning non-production environments so that copies created to support test and development processes are not exposing sensitive information and thus avoiding risks of leaking.
Data masking is a technique that can de-identify, anonymize, redact and obfuscate data at the same time. It replaces real data with fictitious but realistic data in test environments and in situations where data must be shared with third parties, so that the areas holding masked data no longer expose real personal data to cyber attacks, or to error or misuse by employees or business associates. While “basic” data masking solutions scramble data or replace text with “X” characters, “advanced” ones provide detection, automation, auditing and integration.
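To make the difference between the “basic” and the more realistic forms of masking concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the behaviour of any particular masking product: the function names, the small pool of fictitious names and the sample record are all hypothetical.

```python
import random

# Hypothetical pool of fictitious replacement names (illustration only).
FAKE_NAMES = ["Alex Martin", "Sam Keller", "Robin Duval", "Kim Larsen"]


def mask_with_x(value: str) -> str:
    """Basic masking: replace every letter or digit with 'X', keep punctuation."""
    return "".join("X" if ch.isalnum() else ch for ch in value)


def mask_with_fake_name(rng: random.Random) -> str:
    """Substitution masking: return a fictitious but realistic-looking name."""
    return rng.choice(FAKE_NAMES)


def mask_digits(value: str, rng: random.Random) -> str:
    """Replace each digit with a random digit so the format stays realistic."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)


rng = random.Random(0)  # seeded only so the demo is repeatable
record = {"name": "Maria Rossi", "phone": "+39 06 1234 5678"}  # made-up sample

basic = {field: mask_with_x(value) for field, value in record.items()}
realistic = {
    "name": mask_with_fake_name(rng),
    "phone": mask_digits(record["phone"], rng),
}
```

The basic variant is safe but produces values that no longer look like names or phone numbers, which can break tests that validate formats. The substitution variant keeps the shape of the data, which is what makes masked copies usable in test environments.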
Data masking is not encryption
Encryption is a method for securing communications against unauthorised eavesdropping, whereas data masking is a methodology intended to protect the content of data in non-production environments while maintaining the referential integrity of the original production data. The only purpose of masking is to protect the data, with no intention of ever reconstructing the original, which is why data masking should be irreversible, while encryption is not. Any encryption algorithm is reversible by design; for masking, reversibility is a weakness. Encrypted data can in principle be recovered by anyone who obtains or cracks the key, while adequately masked data cannot, since there is nothing to crack in the first place: the original data does not exist in any form in the masked copy, and therefore cannot be reproduced.
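The combination of irreversibility and referential integrity can be sketched as follows. This is a simplified Python illustration under our own assumptions (two in-memory tables joined on a hypothetical national_id column, with made-up identifiers), not a description of how a given product works. Each original identifier is replaced by a random surrogate, the same surrogate is used in both tables so that joins still work, and the mapping between originals and surrogates is never stored.

```python
import secrets


def mask_keys(customers, orders, key="national_id"):
    """Replace a sensitive key with random surrogates, consistently in both tables."""
    mapping = {}  # original -> surrogate, kept only while this function runs

    def surrogate(original):
        # Reuse the same surrogate for the same original so joins keep working.
        if original not in mapping:
            new_id = secrets.token_hex(8)
            while new_id in mapping.values():  # very unlikely, but keep IDs unique
                new_id = secrets.token_hex(8)
            mapping[original] = new_id
        return mapping[original]

    masked_customers = [{**row, key: surrogate(row[key])} for row in customers]
    masked_orders = [{**row, key: surrogate(row[key])} for row in orders]
    # The mapping is neither returned nor persisted: unlike an encryption key,
    # nothing is left over that could be used to reverse the masking.
    return masked_customers, masked_orders


# Made-up sample data (the identifiers are not real).
customers = [
    {"national_id": "AAA111", "segment": "retail"},
    {"national_id": "BBB222", "segment": "business"},
]
orders = [
    {"national_id": "AAA111", "amount": 120},
    {"national_id": "AAA111", "amount": 35},
    {"national_id": "BBB222", "amount": 990},
]
masked_customers, masked_orders = mask_keys(customers, orders)
```

After masking, every order still points to exactly one customer, but no row in the masked copy contains an original identifier, and there is no key whose theft would bring the originals back.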
Is data masking an “appropriate technological protection and organisational measure” under the meaning of the e-Privacy Directive and Future General Data Protection Regulation?
This question is important because it determines whether or not your company will have to notify the regulator of a data breach. It is, right now, of the highest interest to Internet service providers and telecommunications companies in countries that have already implemented the e-Privacy Directive of 2002 as amended by Directive 2009/136/EC. It will be of interest later, around 2015, to any company processing personal data as a data controller, once the General Data Protection Regulation is enacted. The answer also depends on whether or not a breach has occurred. Before any breach occurs, you, or the person responsible at your company, will have to demonstrate, upon inquiry or audit from a supervisory authority (a data protection or telecommunications regulatory authority), that the data masking you have implemented is “appropriate”. This will depend on whether masking is implemented in your company’s databases in a way that limits the likelihood of identity fraud or other forms of misuse of your clients’ or customers’ personal data (see Recital 69, Proposal of General Data Protection Regulation). After a breach, for your company to avoid having to notify customers, you would have to demonstrate that data masking had been applied to the personal data concerned before the breach occurred. In either case, data masking must have rendered the personal data “unintelligible” to any person not authorised to access it (see Art. 4(3) of Directive 2002/58/EC, as amended by Directive 2009/136/EC). (For more details about the amended e-Privacy Directive (2002/58/EC), you can refer to our previous posts: “ENISA Surveys Stakeholders of Upcoming EU Data Breach Notification Regime” and “European Data Protection Supervisor Supports General Obligation to Report Security Breaches”.)
Should you use data masking in line with these basic guidelines, there should be no need to notify data subjects in the event of a breach of their personal data, as the data subjects whose data is breached only have fictitious data attached to them. If the breach concerned only those fictitious data, the likelihood of identity fraud, or other forms of misuse, would be reduced to zero. The Proposal of General Data Protection Regulation currently uses the same language in Articles 31 and 32, indicating that what is mandatory today for telecommunications companies and ISPs will probably apply in the same way to all data controllers once the Regulation enters into force.
Several major data protection policy documents, resolutions and regulations already incorporate the principle of PbD, including the US Federal Trade Commission’s report “Protecting Consumer Privacy in an Era of Rapid Change: Recommendations for Businesses and Policymakers” of March 2012 (this report calls on businesses to implement privacy-by-design measures when collecting and using consumers’ personal data) as well as the European Data Protection Supervisor (EDPS)’s “Opinion (…) on Promoting Trust in the Information Society by Fostering Data Protection and Privacy” of March 2010, in which the EDPS argues that privacy-by-design is “a key tool to ensure citizens’ trust in ICTs”. There is also the draft proposal of the EU General Data Protection Regulation, whose Article 30, Section 1, provides that appropriate technical and organizational measures should be employed to ensure a level of security that is state-of-the-art and cost-effective. Section 3 brings in the concepts of “privacy by design” and “data protection by default”, both of which are concerned with building privacy protection “from the ground up” and having it permeate all aspects of a company’s operations.
The PbD principle profoundly affects the ways in which companies develop software: their privacy professionals have become crucial actors who participate in the design, development and change processes of systems that may affect patients’, employees’ or customers’ sensitive information. To do this, companies need to factor in an incremental upfront cost as part of the development cycle. However, this cost is more than offset once you bring the potentially large costs of handling breaches later on into the equation.
Data masking can fulfill all principles of privacy by design
Examining the “7 foundational principles” of PbD, we argue that the addition of masking goes a long way toward fulfilling the requirements of privacy by design:
- Proactive, not reactive; preventative, not remedial (“The Privacy by Design approach is characterized by proactive rather than reactive measures. It anticipates and prevents privacy-invasive events before they happen. PbD does not wait for privacy risks to materialize, nor does it offer remedies for resolving privacy infractions once they have occurred – it aims to prevent them from occurring. In short, Privacy by Design comes before-the-fact, not after”). Data masking comes before-the-fact, not after, and provides automated procedures of discovery and auditing in a proactive manner.
- PbD comes by default (“Privacy by Design seeks to deliver the maximum degree of privacy by ensuring that personal data are automatically protected in any given IT system or business practice. If an individual does nothing, their privacy still remains intact. No action is required on the part of the individual to protect their privacy – it is built into the system, by default.”) Data masking ensures that sensitive data are automatically protected, and the feature is built into the system by default. Masking also occurs as part of the general process of populating test systems with data: there is a standard process to provision data from production systems to test systems, and masking functions as a part of this process. You could think of it as hooking up a water filter to the pipes in your home: you turn on the faucet as you normally do, and the water now comes out filtered without your having to take any action.
- PbD as privacy embedded into the design (“Privacy by Design is embedded into the design and architecture of IT systems and business practices. (…) It is an essential component of the core functionality being delivered without diminishing functionality.”) The use of masking does not in any way impair the functionality of the system design.
- Full Functionality – Positive-sum, not zero-sum (“Privacy by Design seeks to accommodate all legitimate interests and objectives in a positive-sum “win-win” manner, not through a dated, zero-sum approach, where unnecessary trade-offs are made. Privacy by Design avoids the pretense of false dichotomies, such as privacy vs. security, demonstrating that it is possible to have both.”) Data masking accommodates full system functionality without trading off privacy or functionality for security. The use of masking does not require any lessening of security or functionality as it works on the data that is used and not the software of the system.
- End-to-end security – Lifecycle protection (“Privacy by Design, having been embedded into the system prior to the first element of information being collected, extends throughout the entire lifecycle of the data involved, from start to finish. This ensures that at the end of the process, all data are securely destroyed, in a timely fashion. Thus, PbD ensures cradle to grave, lifecycle management of information, end-to-end.”) Data masking provides protection for sensitive data from the beginning of analysis to completion of testing and implementation. Masked data can be used for much of the analysis as well as completely through the test and approval phases of the software development cycle. Also, utilizing data masking is consistent with a recommendation made in the Verizon Data Breach Report of 2012, which suggests that organizations eliminate unnecessary data in their environments.
- Visibility and transparency (“Privacy by Design seeks to assure all stakeholders that whatever the business practice or technology involved, it is in fact, operating according to the stated promises and objectives, subject to independent verification. Its component parts and operations remain visible and transparent, to users and providers alike. Remember, trust but verify.”) Data masking is visible and transparent, to users, developers and service providers. Reports generated from masking activities are available for all to see and understand the masking events that have taken place.
- Respect for users (“Above all, Privacy by Design requires architects and operators to keep the interests of the individual uppermost by offering such measures as strong privacy defaults, appropriate notice, and empowering user-friendly options. Keep it user-centric.”) Data masking enables personal identities to be restricted to only production environments. In this way, it respects users’ personal privacy.
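The “by default” and “visibility” principles above can be illustrated with a small Python sketch of a provisioning step. It assumes, purely for illustration, that production rows are dictionaries and that masking rules are configured per column. The function name, the rules and the sample rows are hypothetical. The point is that masking sits inside the copy process itself, like the water filter on the pipe, and that each run produces a simple report of what was masked.

```python
def provision_test_copy(production_rows, masking_rules):
    """Copy rows for a test environment, masking every configured column by default.

    Returns the masked rows and a report counting masked values per column.
    """
    test_rows = []
    report = {column: 0 for column in masking_rules}
    for row in production_rows:
        masked_row = dict(row)  # work on a copy; production data is left untouched
        for column, mask in masking_rules.items():
            if column in masked_row:
                masked_row[column] = mask(masked_row[column])
                report[column] += 1
        test_rows.append(masked_row)
    return test_rows, report


# Hypothetical masking rules and made-up production rows.
rules = {
    "name": lambda value: "X" * len(value),
    "email": lambda value: "masked@example.invalid",
}
production = [
    {"id": 1, "name": "Maria Rossi", "email": "maria.rossi@example.com", "plan": "gold"},
    {"id": 2, "name": "Hans Meyer", "email": "hans.meyer@example.com", "plan": "basic"},
]
test_rows, report = provision_test_copy(production, rules)
```

Whoever requests a test copy does not have to remember to mask anything: the only way to obtain the copy is through a step that already applies the rules. The report can be kept as evidence of which columns were masked in each run.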
How does data masking fulfill the proactive requirement of privacy by design?
To implement PbD measures, as we noted above, you need to be proactive and preventative, instead of reactive and remedial. Using the discovery feature of a masking product, which searches databases for sensitive personal data, and coupling it with the automatic generation of masked data and the auditing of previously masked databases, illustrates the type of proactive approach a company should use to head off potential data breaches. This approach also requires that you protect sensitive data automatically and that you embed security into the design and architecture of systems. The integration of masking into provisioning and scheduling systems, such as “Control-M” or “AppWorx”, provides automatic protection in which data is protected by default.
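As a rough idea of what a discovery step does, the following Python sketch scans sample rows and flags columns whose values mostly look like e-mail addresses or payment card numbers. It is a deliberately simplified assumption of how discovery might work, not the algorithm of any masking product: the patterns, the 80% threshold and all names are hypothetical, and real tools use far richer detection than two regular expressions.

```python
import re

# Hypothetical detection patterns (illustration only, far from exhaustive).
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "payment_card": re.compile(r"^(?:\d[ -]?){13,19}$"),
}


def discover_sensitive_columns(rows, threshold=0.8):
    """Return {column: pattern_name} for columns whose values mostly match a pattern."""
    findings = {}
    if not rows:
        return findings
    for column in rows[0]:
        values = [str(row[column]) for row in rows if row.get(column) is not None]
        if not values:
            continue
        for name, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.match(value))
            if hits / len(values) >= threshold:
                findings[column] = name
                break  # first matching pattern wins for this column
    return findings


# Made-up sample rows; the card numbers are well-known test values.
sample = [
    {"id": 1, "contact": "maria@example.com", "card": "4111 1111 1111 1111", "city": "Rome"},
    {"id": 2, "contact": "hans@example.com", "card": "5500-0000-0000-0004", "city": "Berlin"},
]
findings = discover_sensitive_columns(sample)
```

Running the same scan against a database that has already been masked is one simple way to audit it: columns that were masked with non-realistic values should no longer be flagged, while columns masked with realistic substitutes will still match and need to be checked against the masking report instead.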
The use of masking therefore enables you to use the full functionality of a system, from the beginning of the analysis to the completion of the testing and implementation phases, without trading off security for functionality.
Data masking is also visible and transparent, and it makes data protection a default measure that is applied throughout data processing.
The environments that we live in and manage are growing increasingly complex, while the growth of data is accelerating and new ways to access it are appearing. The legislative environment is also becoming more stringent. By making security a part of the development process from beginning to end (and data masking is one way to do it), we think companies respond adequately to the increasing pressure from regulators, shareholders, business partners, customers and business associates over the business risks of data breaches and how to tackle them.