A Definition of Data Classification

Data classification is the process of labeling data according to its type, sensitivity, and business value so that informed choices can be made about how it is managed, protected, and shared, both within and outside your organization.

Businesses create more data every day. Files are saved, employees move on, and data is forgotten or lost. Valuable information sits on file servers and in document stores, unprotected and unrecoverable, because no one knows it exists, let alone where to find it.

Organizing data into categories gives organizations more control and makes data easier to locate and retrieve, which is particularly important for risk management, compliance, and data security.

Data classification benefits your business by:

  • Identifying and protecting data wherever it is located
  • Aiding compliance with global data protection regulations including, but not limited to, GDPR, CCPA, HIPAA, CMMC, ITAR, CUI
  • Driving downstream data security solutions such as DRM, DLP, Encryption, and more

Download “Top Reasons for Data Classification” to learn the benefits of making data classification the foundation of your organization’s data security program.

The Need for Data Classification

In a world where data loss costs organizations millions of dollars in fines, no organization can afford to leave data protection off the top of its agenda. Your organization simply cannot afford to leave the sensitive data it creates and stores unprotected, which makes data classification an essential part of your data protection strategy.

The main (but not the only) reason organizations classify the data they create and handle is to ensure that sensitive information can be controlled. A big part of designing a classification policy is understanding what data is sensitive, what is less so, and what is not; who should have access to it; and whether you should be holding, archiving, or deleting it.

You can only adequately protect the data that you know you have, and that’s why data classification matters. It provides context for reporting, triggering the right policies at the right time to keep your organization from facing the ultimate risk, a data breach.

Read our white paper “5 Reasons Classification is the First Step to Successful Data Loss Prevention” to learn how your organization can kick-start a successful DLP project beginning with data classification.

What Are the 4 Common Levels of Data Classification?

Depending on its level of sensitivity or value to the organization, the type of classification given to data determines a number of things, including who has access to that data and how long it should be retained. Typically, there should be four base levels when it comes to initially categorizing data:

  • Public – Data/information that is freely used, reused, and redistributed with no restrictions on access or usage. Examples can include press releases, brochures, and published research.
  • Internal – Data that is strictly accessible to internal employees/personnel who are granted access. Examples can include company memos, internal communications, and marketing research.
  • Confidential – Data that requires granted access and/or authorization and should be contained within the business or shared only with specifically permitted third parties. Examples can include personally identifiable information (PII) and intellectual property (IP).
  • Restricted – Data that is highly sensitive with use limited on a need-to-know basis. If compromised or accessed without clearance, this could result in criminal charges, heavy legal fines, and irreparable company damage. Examples can include trade secrets, PII, health information, and data protected by federal regulations.
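As a rough illustration of how these four levels can drive access decisions, the sketch below models them as an ordered scale in Python. The names `Level`, `HandlingRule`, `HANDLING`, and `can_access` are hypothetical, and the retention periods are placeholder values, not recommendations; a real scheme would take its rules from your own policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """The four base levels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class HandlingRule:
    freely_shareable: bool  # may the data be redistributed without restriction?
    need_to_know: bool      # must access be granted to each person explicitly?
    retention_days: int     # placeholder retention period, for illustration only


# Assumed values; replace them with the rules from your own policy.
HANDLING = {
    Level.PUBLIC: HandlingRule(True, False, 3650),
    Level.INTERNAL: HandlingRule(False, False, 1825),
    Level.CONFIDENTIAL: HandlingRule(False, True, 2555),
    Level.RESTRICTED: HandlingRule(False, True, 2555),
}


def can_access(clearance: Level, label: Level, on_access_list: bool = False) -> bool:
    """Allow access when the clearance covers the label and, for
    need-to-know data, the user has also been granted access explicitly."""
    if clearance < label:
        return False
    if HANDLING[label].need_to_know:
        return on_access_list
    return True
```

With this model, `can_access(Level.INTERNAL, Level.PUBLIC)` is allowed, while `can_access(Level.RESTRICTED, Level.RESTRICTED)` is refused until the user is also placed on the access list. Ordering the levels keeps the clearance check to a single comparison, and the per-level table is where sub-categories or jurisdiction-specific rules could later be added.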

While these base categories might offer a place to start in your data classification journey, it’s highly likely it won’t be where you end up. There are a number of reasons why normal business practices within organizations will typically require a greater level of depth to the way data is classified to conform with data security policy, for example:

  • Broad classification labels will invariably need to be broken down into subcategories – ‘Confidential – PII,’ ‘Confidential – Financial,’ ‘Confidential – IP,’ etc.
  • New global data protection regulations.
  • Retention and reporting requirements vary when businesses operate in multiple jurisdictions.
  • Merger and acquisition activity demands changes in business structure.
  • Diversity in business operations drives a need for policy changes (e.g., new product, service, division etc.).
  • The ability to support changes in supply chain and business processes, for example, interoperability with partner classification schemes.
  • Ensuring adaptability for different end-user communities, for example, based on skill set or granted access.
  • Support interoperability with new systems and toolsets.

Data Sensitivity Levels

While we’ve looked at mapping data out by type, you should also look to segment your organization’s data in terms of the level of sensitivity – high, moderate, or low.
  • High sensitivity data (Confidential) – data that if compromised or destroyed would be expected to have a severe or catastrophic effect on organizational operations, assets, or individuals. Examples can include financial data, medical records, and intellectual property.
  • Moderate sensitivity data (Restricted) – data that if compromised or destroyed would be expected to have a serious effect on organizational operations, assets, or individuals. Examples can include unpublished research results, information strictly for internal use, and operational documents.
  • Low sensitivity data (Public) – data that if compromised or destroyed would be expected to have a limited effect on organizational operations, assets, or individuals. Examples can include press releases, job advertisements, and published research.

Types of Data Classification

Data classification involves the use of tags and labels to define the data type, its confidentiality, and its integrity. There are three main types of data classification that are considered the industry standard:
  • Content-based classification – inspects and interprets files, looking for sensitive information.
  • Context-based classification – looks to the application, location, metadata, or creator (among other variables) as indirect indicators of sensitive information.
  • User-based classification – requires a manual, end-user selection for each document. User-based classification takes advantage of the user’s knowledge of the document’s sensitivity, and can be applied or updated upon creation, edit, review, or dissemination.
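The three approaches can be combined. The Python sketch below is one possible arrangement, not a description of any particular product: a content check scans text with regular expressions, a context check looks at the folder a file lives in, and a user selection is accepted as a third signal. The patterns, folder names, default level, and function names are all assumptions made for illustration, and simple patterns like these will produce false positives.

```python
import re
from typing import Optional

LEVELS = ["Public", "Internal", "Confidential", "Restricted"]  # least to most sensitive

# Content-based: hypothetical patterns that hint at sensitive content.
CONTENT_PATTERNS = {
    "Confidential": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # US SSN-like number
        re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),  # 16-digit card-like number
    ],
}

# Context-based: hypothetical folder names used as indirect indicators.
CONTEXT_HINTS = {"hr": "Confidential", "finance": "Confidential", "press": "Public"}


def content_level(text: str) -> Optional[str]:
    """Return a level if any content pattern matches, otherwise None."""
    for level, patterns in CONTENT_PATTERNS.items():
        if any(p.search(text) for p in patterns):
            return level
    return None


def context_level(path: str) -> Optional[str]:
    """Return a level implied by a folder in the path, otherwise None."""
    parts = [p.lower() for p in path.replace("\\", "/").split("/")]
    for folder, level in CONTEXT_HINTS.items():
        if folder in parts:
            return level
    return None


def classify(text: str, path: str, user_choice: Optional[str] = None) -> str:
    """Combine content, context, and user signals; the most sensitive wins."""
    if user_choice is not None and user_choice not in LEVELS:
        raise ValueError(f"unknown level: {user_choice}")
    signals = [content_level(text), context_level(path), user_choice]
    found = [s for s in signals if s is not None]
    if not found:
        return "Internal"  # assumed default when no signal is available
    return max(found, key=LEVELS.index)
```

For example, `classify("Employee SSN 123-45-6789", "/shared/misc/note.txt")` returns `"Confidential"` on the strength of the content match alone. Taking the most sensitive signal is a deliberately cautious choice here; a policy could equally let the user's selection override the automated suggestions.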

Further Data Classification Examples

The following shows common examples of organizational data which may be classified into each sensitivity level:

High sensitivity data
  • Personally identifiable information (PII)
  • Credit card details (PCI)
  • Intellectual property (IP)
  • Protected healthcare information (including HIPAA regulated data)
  • Financial information
  • Employee records
  • ITAR materials
  • Internal correspondence including confidential data

Moderate sensitivity data
  • Student education records
  • Unpublished research data
  • Operational data
  • Information security information
  • Supplier contact information
  • Internal correspondence not containing confidential data

Low sensitivity data
  • Public websites
  • Public directory data
  • Publicly available research
  • Press releases
  • Job advertisements
  • Marketing materials

Data Classification and Compliance

The amount of data being generated by organizations globally is at an all-time high, and the trend shows every sign of continuing. While businesses generate, store, and share more data, and more types of data from multiple and disparate sources, they must also protect all that information from unauthorized outsiders, accidental loss, and internal bad actors, all while complying with a growing number of data protection regulations worldwide.

To adequately protect your organization’s information, you first need to identify what needs protecting. And you can’t do that without knowing your data at an intimate level: what it is, what it contains, where it lives, and so on.

To make things even more challenging, your data can now reside on more systems than ever – cloud, mobile devices, local machines, or company networks. And while some of that data may need no protection at all in terms of compliance or privacy issues, several other data types absolutely must be protected. Some examples are:

  • Controlled unclassified information (CUI)
  • Payment card information (PCI)
  • Personal health information (PHI)
  • Personally identifiable information (PII)

That’s not even counting company financial data, HR data, trade secrets, or even (for those who deal with government and military) classified information. Leaving this kind of data exposed risks the ire of regulators willing and able to hand out swift and harsh punishments. Failure to protect any of the above likely means a serious (and costly) breach of regulations like the EU’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), the International Traffic in Arms Regulations (ITAR), or NATO STANAGs, to name just a few. Depending on what data is exposed, the punishments range from fines and reputational damage to jail time.

There are many compelling reasons to find a solution that works, especially considering the list of global regulatory compliance legislation continues to grow. A flexible data classification solution can help organizations protect their sensitive data at any stage in their privacy journey to assist adherence to regulatory compliance requirements.

Learn more about how our solutions can help you meet data compliance regulations.

Data Classification Best Practices

Data classification tools not only help organizations to protect their data, they also help users understand how to treat different types of data with different levels of sensitivity. Automation plays a central role in data governance and helps to maintain the required balance between technology and people-focused operations to achieve an inclusive security culture.

An adequate data security backbone and a robust, enterprise-wide security culture have become central concerns for CISOs as a result of the pandemic, with new business demands, changing working environments, and the operational constraints that emerged in 2020 all taking hold.

As data volumes continue to grow, maintaining the confidentiality, integrity, and availability (the CIA triad) of data has become a priority for all security leaders. Managing an ever-evolving data footprint demands a solid data protection posture that includes investment in appropriate data classification tools. To support this, employee education programs should onboard staff and inform them about key data management and classification processes. Alongside tools and education, automation is the third critical ingredient for success.

Read our article “Data Security Best Practices Every CISO Should Know” to learn how to minimize risk through layered security initiatives and best practices.

A Combined Technology and People-Centric Approach is Essential

Now more than ever, strong data use and protection facilities are required to give employees appropriate and safe access to information and to inform and educate them about sensitive data and confidentiality. Automated protection that helps define, measure, and mark the status of data, and keeps that data within secure and authorized repositories, will be paramount as a central tenet of security posture. By combining people, process, and technology, CISOs can deliver on all key data protection and control requirements: ensuring that data is understood and appropriately managed, delivering the breadth of security coverage required locally and remotely, and ensuring that coverage suits all stakeholders. Combining good data protection technology with human expertise and processes provides considerable benefits, including:
  • The ability to integrate the rigor of technology-based automation alongside the contextual knowledge, use and control requirements of data creators.
  • The use of technology-based automation to assimilate knowledge about data and apply rule-based controls that fit the current and expected future needs of the organization without imposing additional operational overhead.
  • The delivery of a combined security approach that includes the user in the classification decision making process, improves awareness and enhances overall security posture.

Post-Pandemic Data Protection

At a foundational level, enterprise data protection must include in-depth knowledge of what data is held and where, and, accordingly, what levels of security control are needed to keep the various data categories safe. From a data protection perspective, businesses must first acknowledge that not all data is equal. Different controls are required to ensure that different types of data are not lost or accessed by unauthorized parties. Beyond the high-level requirement to protect confidential, business-critical, and sensitive data, businesses must also apply the data protection rules applicable to the other categories of data they gather, use, and store.

Maintaining a focus on business context and the ability to meet regulatory requirements will remain critical, as will ensuring an enterprise-wide understanding of data and risk. Further, priority must be given to smart data protection facilities that make the right decisions on data access and availability, delivering the technology-based efficiency and automation needed to support the ever-increasing data volumes of remote workforces.

Automatic Data Classification for Optimized Security

Businesses that adapt best to the post-pandemic era will use automation, data-driven digital access technologies and cloud to effect improved operations and efficiencies.

With the remote workforce here to stay, more data will be generated outside of the more traditional, secure, on-premises work environment than ever before, and enabling safe user and data access will be key. The sheer volumes of data involved will make it ever more difficult to protect sensitive information and will drive an urgent need for more inclusive and automated forms of data protection.

Automation will make a significant contribution to improved operational efficiency post-pandemic, delivering agile operations with safe user and data access at their center. Data classification tools will protect data by applying appropriate security labels, and will help educate users on how to treat different types of data according to the level of sensitivity assigned to each document.

Download our “Enhancing Security Automation” brochure to see how security automation can help resolve common data protection challenges.

The Importance of a Strong Security Culture and Employee Education Programs

We have seen how automation plays a key role in establishing a firm foundation for an organization’s security culture. But because employees play such a vital role in ensuring that the business maintains a strong data privacy posture, the ability to work with stakeholders and users to understand data protection requirements and policies is key. Security and data protection education must be conducted company-wide and at a level that is workable and sustainable. Regular security awareness training and an inclusive, company-wide security culture will ensure that data security becomes part of everyday working practice, embedded into all actions and at the very heart of the business.

A robust data protection protocol is critical for all organizations, particularly as we move beyond COVID-19 into the ‘new normal.’ Delivering optimal operational efficiencies, data management, and data classification provisions under post-pandemic budget constraints will be an ongoing business challenge. To do nothing, however, will set an organization up to fail, and we have already seen large fines incurred by those that have not made data security a top priority.

Data leaders, therefore, must be selective and identify the combination of technology, process, and people investments that will deliver the greatest security controls. Developing a combined technology and people-centric approach to data protection will be critical. Through a solid security culture, training, and the integrated use of technology and automation, data leaders can deliver the most fitting security culture for their organization. Beyond this, success will depend on the ability of CISOs to work with stakeholders and users to understand their data protection requirements and to deliver appropriate policies as a central component of overarching data protection strategies.

Using data classification software allows organizations to not only identify their data and where it resides, but to then proactively tag each email or file with metadata to ensure it stays identified and protected across various systems or when handled by partners. Effective identification and classification tools typically allow organizations to:
  • Find and identify sensitive data in emails, documents and systems based on various categories. Create and train your system to recognize critical data, such as financial information, proprietary information, or personal information.
  • Apply the right levels of protection through metadata embedded in emails or files, set up automated rules to protect it, and even remind employees to take extra care when handling this kind of data.
  • Build up your security posture in layers – Combine this rich metadata with encryption technology, digital rights management (DRM) software, enterprise rights management (ERM) software, cloud access security brokers (CASB), and next-gen firewalls for a cohesive, integrated approach to ensure strong data protection across global operations. These third-party solutions can be configured to automatically read and understand classification metadata and apply the appropriate controls.
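To make the hand-off between classification metadata and downstream controls concrete, here is a minimal Python sketch using the standard library's `email.message.EmailMessage`. It assumes the label travels in a custom header; the header name `X-Classification`, the domain `example.com`, the action names, and the functions `tag` and `outbound_action` are all hypothetical and do not reflect how any specific DLP or DRM product stores or reads labels.

```python
from email.message import EmailMessage

LABEL_HEADER = "X-Classification"  # hypothetical header name for the label
INTERNAL_DOMAIN = "example.com"    # placeholder for the organization's domain

# Assumed policy for mail leaving the organization, keyed by label.
EXTERNAL_ACTIONS = {
    "Public": "allow",
    "Internal": "warn",
    "Confidential": "encrypt",
    "Restricted": "block",
}


def tag(msg: EmailMessage, label: str) -> EmailMessage:
    """Embed the classification label as metadata on the message."""
    del msg[LABEL_HEADER]  # drop any earlier label; no error if none exists
    msg[LABEL_HEADER] = label
    return msg


def outbound_action(msg: EmailMessage) -> str:
    """Downstream control: read the label and pick an action for this message."""
    label = str(msg.get(LABEL_HEADER, "Unlabeled"))
    recipients = [a.strip() for a in str(msg.get("To", "")).split(",") if a.strip()]
    external = any(
        not a.lower().endswith("@" + INTERNAL_DOMAIN) for a in recipients
    )
    if not external:
        return "allow"
    # Unlabeled or unknown labels are treated as the most sensitive case.
    return EXTERNAL_ACTIONS.get(label, "block")
```

A message tagged `Confidential` and addressed to `partner@other.org` yields `"encrypt"`, while the same label on mail to `colleague@example.com` yields `"allow"`. Because the control only reads the label, the classification step and the enforcement step can be provided by different tools, which is the layering described above.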

Award-Winning Data Protection

2023 Cybersecurity Excellence Award Winner

In recognition of our enterprise data protection, Cybersecurity Excellence named Fortra the 2023 winner for:
  • Data Classification
  • Data Leakage Protection
  • Data Security Platform
  • Data-Centric Security
  • Digital Rights Management

How protected is your data?

Meet with one of our experts to assess your needs, and we'll walk you through our solution.