Imagine, for a minute, that you’re about to build a house. You wouldn’t just throw down some bricks on what looks like solid ground, right? I presume you’d want to make sure the foundation was stable first. Well, the same should apply to the data we use. If you don’t know where it’s from, how it’s been put together, or if it’s right for the job at hand, you might as well be building on shifting sand. It might seem alright for a while, but soon enough, everything’s going to fall apart.
It’s a bit like reality TV. At first glance, things usually seem pretty straightforward. But you quickly realise that appearances are often misleading. Relationships are messier than they seem, decisions are more complicated, and the truth is rarely black and white. It's the same with data. If we don’t ask the right questions about data and really dig into where it comes from and what’s been left out, we could end up causing more harm than good. And when the stakes are real people’s lives and livelihoods, that's something we should all take seriously. The truth is, there is only one way forward here. The only way is ethics.
You see, it's easy to get legality and ethics mixed up. But ticking the right legal box isn’t enough. Following the rules is easy. But acting responsibly? Not so much.
Legal compliance sets the bare minimum for handling data, but it doesn’t always cover the things that matter most. It won’t tell you whether what you’re doing is actually fair, right, or sustainable. And if you think about it, laws move slowly, so by the time they catch up with what’s actually going on, the damage has often already been done.
Think about those apps that collect vast amounts of personal information through pre-checked boxes. Technically, it’s all above board. It's all legal. But ethically? That’s where things get trickier. If people knew exactly how their data was being used, tracked, sold, or analysed - well, let’s just say, “I do not consent” would probably be getting a lot more clicks.
Ethics is about asking "If people could see the full picture, would they feel respected? Or would they feel like they’ve been used?" It’s a bit like what happens on reality TV. What’s allowed isn’t always what’s right. What's shown isn't always what's true. Things look one way on the surface, but you don’t know what’s really going on behind the scenes.
"If people could see the full picture, would they feel respected? Or would they feel like they’ve been used?"
Considering data ethics is even more important now that AI is at the heart of many decision-making processes. AI models are everywhere now, making decisions that affect us in ways we might not even realise. From job applications to insurance premiums, from credit scores to healthcare decisions - AI is driving it all. But many of these decisions come from "black box" systems - powerful algorithms that find patterns no human could pick out. The problem is, often no one can explain exactly how they do it.
Imagine you apply for a loan, and an AI system decides whether you’re eligible. It says no. Why? Who knows? It just does. But, if we can’t understand how these systems come to their conclusions, how can we trust them?
The real trouble is that these black box systems can inherit biases from the data they’ve been trained on. So, if past hiring practices or healthcare systems were biased against certain groups, AI will happily continue those trends, only faster and on a larger scale. It’s like reality TV where the script’s rigged, but no one knows it. That’s a problem, and we need more transparency, not less.
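To make that concrete, here’s a deliberately simplified sketch in Python. Everything in it is invented - the features, the numbers, the “hiring” decisions - but it shows how a model trained on biased historical outcomes will quite happily reproduce that bias for new cases.

```python
# Illustrative only: a toy "hiring" model trained on historically biased decisions.
# All data here is made up; the point is that the model learns the past, not what's fair.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# One feature is a proxy for group membership (e.g. postcode area), the other is actual skill.
group = rng.integers(0, 2, n)          # 0 or 1
skill = rng.normal(0, 1, n)

# Historical decisions: skill matters, but group 1 was systematically disadvantaged.
hired = (skill + rng.normal(0, 0.5, n) - 0.8 * group) > 0

model = LogisticRegression().fit(np.column_stack([group, skill]), hired)

# Two equally skilled candidates, differing only in the group proxy.
candidates = np.array([[0, 1.0], [1, 1.0]])
print(model.predict_proba(candidates)[:, 1])  # group 1 scores lower despite identical skill
```

Notice that the model never sees a column called “discrimination”. It just learns that the group proxy predicted the historical outcome, and carries that pattern forward - faster, and at scale.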
Bias in data isn’t usually deliberate. Most of the time it sneaks in just because no one thought to ask the tough questions. And poor data quality? Well, that can cause just as much harm. Missing data, outdated information, inconsistent coding - all those small flaws can snowball when they’re used to make decisions that affect millions of people.
And, unfortunately, the harm isn’t spread equally. It tends to hit hardest where things are already hardest. If you were excluded from the data in the first place, it’s likely you’re going to keep getting left out.
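For what it’s worth, many of those small flaws are easy to surface if you go looking for them. Here’s a small illustrative sketch in Python (pandas), with made-up column names and a made-up dataset, of the kind of basic checks that catch missing data, stale records and inconsistent coding before they snowball.

```python
# Illustrative only: a few basic data-quality checks before any analysis.
# The column names ("postcode", "date_of_birth", "last_updated") are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "postcode": ["AB1 2CD", "ab12cd", None, "EF3 4GH"],
    "date_of_birth": ["1980-05-01", "01/05/1980", "1980-05-01", None],
    "last_updated": pd.to_datetime(["2015-01-01", "2024-06-01", "2023-11-15", "2016-03-20"]),
})

# Missing data: how much of each column is simply absent?
print(df.isna().mean())

# Outdated information: how many records haven't been touched in years?
stale = df["last_updated"] < pd.Timestamp("2020-01-01")
print(f"{stale.mean():.0%} of records last updated before 2020")

# Inconsistent coding: the "same" value recorded in different formats.
print(df["postcode"].str.replace(" ", "").str.upper().nunique(), "distinct postcodes after normalising")
```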
Real-life examples aren’t hard to find:
Healthcare algorithms that don’t take into account the specific needs of different ethnic groups, because the data reflects unequal access to care.
Credit scoring systems that penalise people based on their postcode, reinforcing socioeconomic divides.
Hiring platforms that learn to favour certain universities or backgrounds, just because they were popular in the data, without questioning if they’re actually relevant to the job.
When we don’t question data, when we just accept it at face value, we risk making the same mistakes over and over again. It’s like watching a reality TV contestant being edited to look bad. But this isn’t TV. It’s not entertainment; it’s real life. And the consequences are a lot more serious.
And the context of our data matters. It’s not just about what the data shows; it’s about where it comes from, who collected it, and why. But here’s the thing: if the way we use the data changes, the context changes too, and we have to account for that.
Take the “collect once, use many times” approach. Sounds good in theory, doesn't it? But it only works if the context of the data stays the same. If data was collected for one reason, let’s say tracking health outcomes, but then repurposed for something else, like determining insurance premiums, then the context has shifted. And if you don’t adjust for that, you’re bound to get things wrong.
Imagine using health data collected during the Covid pandemic to inform long-term health policy. The pandemic threw all sorts of things out of kilter - health behaviours, treatment effectiveness, and more. If that data isn’t adjusted to reflect those anomalies, you’re making decisions based on a snapshot of an extraordinary situation, not the bigger picture.
You see, the context isn’t just about the data itself. It’s about how the data’s used. If it was gathered for one purpose, but gets used for something completely different, that shift could lead to decisions that aren’t fair or accurate.
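One way to keep that shift visible is embarrassingly simple: label the context explicitly instead of letting it blend in. Here’s a minimal sketch, assuming a hypothetical attendance dataset - the dates, columns and cut-offs are invented for illustration, not a prescription.

```python
# Illustrative only: flagging records from an exceptional period so they
# aren't silently folded into long-term baselines. Dates and columns are invented.
import pandas as pd

visits = pd.DataFrame({
    "date": pd.to_datetime(["2019-06-10", "2020-04-02", "2021-01-15", "2023-09-30"]),
    "a_and_e_attendances": [410, 120, 180, 430],
})

# Mark the pandemic window explicitly rather than letting it blend in.
pandemic = (visits["date"] >= "2020-03-01") & (visits["date"] < "2022-04-01")
visits["context"] = pandemic.map({True: "pandemic", False: "routine"})

# Report the two contexts separately rather than averaging across both.
print(visits.groupby("context")["a_and_e_attendances"].mean())
```

Whether you then exclude, down-weight or model the exceptional period separately is a judgement call. The point is that it becomes a visible decision, not a silent default.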
A healthy data culture isn’t about getting it perfect. It’s about being aware, staying alert, and asking questions. So what does it look like in practice?
Making sure the context is always part of the conversation, especially when it comes to data collection, analysis, and decision-making.
Questioning black box systems instead of just trusting them because they’re “smart.”
Owning up to bias and trying to fix it, instead of pretending it’s not there.
Treating data literacy as a fundamental skill for everyone, not just the people who "do" data.
Building trust by being transparent, not assuming people will just give it to you.
It’s about fostering a culture where everyone feels comfortable asking, “Wait, are we sure about this?” Where data analysts, developers, managers and leaders are all able, and encouraged, to challenge how data is being used.
Just like reality TV, data might look all shiny and impressive at first glance, but that doesn’t mean it’s the full story. A big dataset. A fancy algorithm. A polished project. But none of that matters unless we start asking: Where did this data come from? Who’s missing from the data? And who’s going to get hurt if we keep using it in this way?
Ethical data practice isn’t just something nice to do on the side. It’s the only way to build systems that are truly fair and transparent. It’s easy to get distracted by the flashy stuff. But when you really look closely, at the data, at the context, at the impact, that’s where the real work happens. Because if we care about making things better, the only way is ethics.
Prioritise fairness and transparency. Make it easy for people to understand how their data is being used. Give them real choices.
Treat data literacy as a core skill. Everyone working with data, from executives to analysts to frontline staff, should be able to spot when something does not look right.
Ethical data practice is not about saying no to technology. It is about saying yes to responsibility.
In the world of data, what looks shiny and impressive on the surface can hide deeper risks underneath. It's not enough to trust our instincts or assume that clever technology knows best. Ethics demands that we slow down, ask better questions, and stay uncomfortable with easy answers.
The real world is messy, complicated, full of hidden stories and unequal histories. The only way to use data well is to respect that complexity, not hide or ignore it.
Because in the end, the only way is ethics.