The Dark Side of Open Data

Open data offers benefits, but without ethical oversight, it can expose sensitive information and harm vulnerable communities. Learn why organizations must prioritize responsible data strategies.

This article, by Abby Clobridge and Eric Hinsdale, was published in Online Searcher, November/December 2018.

The environment around research data management and open data has become incredibly complex, and its evolution shows no sign of slowing. At the core of many of today’s challenges are machine learning, natural language processing (NLP), and predictive analytics: the methods used to process tremendous quantities of data for a variety of purposes.

The news is full of daily stories about private companies and government agencies mining massive, internally collected datasets for all sorts of outcomes. Technology is making it easier for organizations to respond proactively to patterns in data. For example, with early alert systems, universities can now identify students who might be on the cusp of dropping out early enough for an advisor to intervene. Companies want to mine their customer data to increase profitability and their inventory data to forecast product demand in good time.

But these types of methods aren’t restricted to closed data. In fact, one of the advantages of open data is that it allows data from disparate datasets to be combined—re-mixing or merging many “small data” sets to convert them into “big data.”

From the perspective of funding agencies, this type of re-use is one of the intended benefits of open data. If datasets use common variables, include well-structured and organized data elements, are deposited in interoperable repositories that harvesters can find, and carry a Creative Commons Attribution (CC BY) license (or a similar license allowing re-use), other researchers are encouraged to find, access, and re-use them without restrictions.

Re-use without restrictions is what sparks fear in many researchers. Once data has been published and is out in the world, you lose all control over your dataset. It can be used, combined, and repurposed in all sorts of ways, including ways you never considered, ways that could put someone else in harm’s way, and ways that serve morally ambiguous purposes.
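To make the mechanics of combining datasets concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two datasets, the field names (zip_code, birth_year, sex), and the helper functions are hypothetical and do not come from any real release. It shows how two independently published datasets that share common variables can be merged, and how a record with a unique combination of those variables ends up tied to a single person.

```python
from collections import Counter, defaultdict

# Hypothetical dataset A: an open survey release with names removed.
survey = [
    {"zip_code": "02138", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"zip_code": "02139", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
    {"zip_code": "02139", "birth_year": 1990, "sex": "M", "diagnosis": "none"},
]

# Hypothetical dataset B: a separate public roster that includes names.
roster = [
    {"name": "Resident A", "zip_code": "02138", "birth_year": 1984, "sex": "F"},
    {"name": "Resident B", "zip_code": "02139", "birth_year": 1990, "sex": "M"},
]

# The variables both datasets happen to share (an assumption of this sketch).
COMMON_VARIABLES = ("zip_code", "birth_year", "sex")


def key_of(record):
    """Build the join key from the shared variables."""
    return tuple(record[v] for v in COMMON_VARIABLES)


def merge_on_common_variables(left, right):
    """Inner-join two lists of records on the shared variables."""
    index = defaultdict(list)
    for row in right:
        index[key_of(row)].append(row)
    merged = []
    for row in left:
        for match in index.get(key_of(row), []):
            merged.append({**match, **row})
    return merged


def unambiguous_matches(left, right):
    """Keep only merged rows whose key occurs exactly once in each dataset."""
    left_counts = Counter(key_of(r) for r in left)
    right_counts = Counter(key_of(r) for r in right)
    return [
        row
        for row in merge_on_common_variables(left, right)
        if left_counts[key_of(row)] == 1 and right_counts[key_of(row)] == 1
    ]


if __name__ == "__main__":
    for row in unambiguous_matches(survey, roster):
        print(row["name"], "->", row["diagnosis"])
```

In this toy data, the two survey rows that share a key remain ambiguous, while the one row with a unique key links a name to a diagnosis. Neither dataset contains that link on its own.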

Although we’re proponents of open data, it’s useful to know about some incidents where open data has led to problems.

Read the full article in Online Searcher
