To get the most from your data classification efforts, learn to think differently
If you’re implementing a risk-based schema, for it to go smoothly and to avoid confusion you need to take into account how others will perceive your classification labels.
The classification of sensitive internal data may seem relatively straightforward. And in many cases, it is.
But not always.
Just as a Rorschach test reveals differing perceptions of the same ink blots, your classification schema and the words you use can mean very different things to different people, and any confusion on this point can spell disaster from a data protection standpoint.
Indeed, when thinking about the importance of perspective in data classification, I’m reminded of this line from Obi-Wan Kenobi in the 1983 classic, Return of the Jedi:
For many, this is a throwaway line in an old movie, used to dig the story out of a plot hole.
For me, though, it’s an epiphany.
I’ve been consulting in cybersecurity, with a focus on data protection and privacy, for the past several decades. In that time, one statement I’ve heard from customers more than almost any other is: “I don’t know, you tell me – you’re the expert!”
At first, it can be easy to shrug this statement off, or to believe your own hype (after all, “you’re the expert”) and jump straight into lecture or problem-solving mode, telling the client exactly what to do.
The problem, of course, is that they’re not actually asking you to tell them what to do. They’re asking for the benefit of your hard-earned understanding and experience, so they can make informed decisions about a subject they are much less familiar with than you are.
It can be humbling to realize that, unless you provide a framework for that expert knowledge, they are likely to take what you say as absolute truth, almost without question.
The recent Titus white paper entitled “The Titus standard: How Metadata will accelerate your data protection strategy” provides this kind of framework – a broader foundation and education around the importance of metadata, and how it can help empower businesses to make informed decisions.
But there are a few areas that deserve an even deeper dive.
Can you explain more about metadata & field clash?
Every so often I get a notification on my phone about photographs I took on the same day in years past, including information on where those photos were taken.
This is a great example of two phenomenal uses of metadata: Where and When.
As these tools get ever smarter, they’re also adding an element of Who through facial recognition technology.
Conveniently, all this valuable information is tucked neatly away within the metadata for further enrichment.
For me, this is a really cool feature. But it’s also easily put at risk when my camera’s metadata meets other systems, such as a storage platform.
To illustrate what I mean, let’s look at the metadata field names in play here: LOCATION and DATE. The LOCATION is where I took the photo, and the DATE is when.
Seems straightforward enough, right?
But consider this example from a different point of view:
I take some great photos and upload them to my long-term storage. Everything’s going along fine at this point, but what I don’t realize is that the storage platform also has fields called LOCATION and DATE – and for this system, those field names denote the location of the storage and the date I uploaded the photos.
Thanks to this confusion, the storage system overwrites the camera’s metadata and all that cool information I mentioned is lost forever.
This tragic situation, however, is easily avoided.
By having unique field names, we mitigate the risk of the data being overwritten by other systems or solutions. The end result? I don’t lose the valuable context I love so much.
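The clash described above can be sketched in a few lines of Python. This is only an illustration, not how any particular camera or storage product works: the sample values, the `merge_naive` and `namespaced` helpers, and the `camera.` / `storage.` prefixes are all hypothetical names chosen for the example.

```python
# Hypothetical metadata as written by each system (values are made up).
camera_meta = {"LOCATION": "Lisbon, Portugal", "DATE": "2019-06-14"}
storage_meta = {"LOCATION": "eu-west-archive-02", "DATE": "2024-01-09"}


def merge_naive(first, second):
    """Merge two metadata dicts; on a field-name clash the second system wins."""
    merged = dict(first)
    merged.update(second)
    return merged


def namespaced(prefix, meta):
    """Return a copy of meta with every field name prefixed by its owning system."""
    return {f"{prefix}.{name}": value for name, value in meta.items()}


# Shared field names: the storage platform silently overwrites the camera's values.
clashing = merge_naive(camera_meta, storage_meta)

# Unique field names: both systems keep their own LOCATION and DATE.
safe = merge_naive(namespaced("camera", camera_meta),
                   namespaced("storage", storage_meta))

print(clashing["LOCATION"])     # the storage location; the photo location is gone
print(safe["camera.LOCATION"])  # the photo location survives
print(safe["storage.DATE"])     # and so does the upload date
```

With shared names, two fields survive out of four. With prefixed names, all four do, and each one still says which system it belongs to.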
Can you explain classification a little more?
Folks who implement a classification solution are typically in highly regulated industries such as finance, healthcare, government, and defence.
Fittingly, conversations with people from these industries usually revolve around terms like classification; levels; and sensitivity labels like public, internal, confidential, and secret.
Again, however, it’s too easy to slip into thinking only of classification within these confines.
By its very definition, classification is about putting similar things into boxes. As such, classification schemas are limited only by the bounds of your imagination.
By looking at what we do, the information we gather, and what additional pieces of data could be captured, we also challenge our ingrained presumptions about what we need. These new ways of thinking can provide phenomenal value to an organization.
What should I call my classification levels, and in what order should they be?
The most popular classification levels by far are “public, internal, confidential, and secret”. These look like a pretty solid set – straightforward, to the point, and easy enough to understand. And on paper, that’s true.
Unfortunately, however, the minute people become involved, we begin to hit snags. Words, after all, often have power and meaning beyond what we tend to think.
Let’s consider a fictional company – RANOGA – and two documents: one confidential, the other internal. You read that there has been a data breach and documents have been lost. For our little test, we will imagine three different headlines:
Headline 1: RANOGA has had a data breach and documents have been lost
Headline 2: RANOGA has had a data breach and Internal documents have been lost
Headline 3: RANOGA has had a data breach and Confidential documents have been lost
Question 1: Based on the headlines above, do you think people will feel differently about the impact of the incident?
- Do you feel losing “documents” has less, the same, or more impact than losing “Internal” or “Confidential” documents?
- Do you feel losing “Confidential” information has less, the same, or more impact than losing “Internal” information?
Question 2: You need to put a dollar value on these. If you had to pick a value, what would it be?
- Document: $10, $10,000, or $1,000,000
- Internal: $10, $10,000, or $1,000,000
- Confidential: $10, $10,000, or $1,000,000
When I’ve posed similar scenarios to customers, especially to a group of people in a room, I’ve found that there are many different opinions and perspectives which usually lead to unpredictable results. These conversations usually last longer and go deeper than the team expects.
What makes this such a tricky game is that people’s brains tend to fool them into making word associations, subconsciously linking each term to a level of risk and presuming a context.
Did you notice I never gave definitions of “Internal” or “Confidential”, and what they mean to RANOGA?
The big lesson here is that if you’re implementing a risk-based schema, for it to go smoothly and to avoid confusion, you need to take into account how others will perceive your classification labels.
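One way to keep labels from being left open to interpretation is to write the order and the meaning down together. The sketch below is a minimal illustration of that idea. The definitions and examples are hypothetical ones invented for the fictional RANOGA, and `Level`, `LabelDefinition`, `SCHEMA`, `describe`, and `highest` are names made up for this example rather than part of any product.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Ordered levels: a higher number means a higher assumed impact if lost."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


@dataclass(frozen=True)
class LabelDefinition:
    meaning: str  # what the label means to this organization
    example: str  # a concrete document type, to anchor the meaning


# Hypothetical definitions for the fictional company RANOGA.
SCHEMA = {
    Level.PUBLIC: LabelDefinition(
        "Approved for release outside the company.", "Press release"),
    Level.INTERNAL: LabelDefinition(
        "For employees only; limited harm if disclosed.", "Staff newsletter"),
    Level.CONFIDENTIAL: LabelDefinition(
        "Restricted to named teams; serious harm if disclosed.", "Customer contract"),
    Level.SECRET: LabelDefinition(
        "Restricted to named individuals; severe harm if disclosed.", "Acquisition plan"),
}


def describe(level):
    """Return the label together with its written definition and example."""
    definition = SCHEMA[level]
    return f"{level.name.title()}: {definition.meaning} (e.g. {definition.example})"


def highest(levels):
    """Return the most sensitive level among those given."""
    return max(levels)


for level in Level:
    print(describe(level))
```

Because every label carries its own definition, a headline reader and a RANOGA employee would at least be arguing from the same text. The explicit ordering also settles which label wins when content of different levels is combined.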
As a rule, people’s brains stubbornly cling to certain points of view.
Luckily, with a little bit of extra information, education, and context we can empower people to rethink and re-evaluate how they approach what their business needs from classification and metadata, while being cognizant of their impacts and the necessary considerations to make it a success.
Here’s a personal challenge, from me to you: Look at things from a different point of view than the one you may already have.
Think about how a deeper understanding of classification and metadata can bring better results and a faster time to value.
In a world that’s always changing, sometimes rapidly and unpredictably, being agile in the way we think and being able to challenge what we have always held true can drive positive outcomes further and faster than before and, most of all, help us succeed.
I promise you it will be time well spent.