Unveiling the Hidden Cost of Data: How Businesses Can Illuminate Dark, Lost, and Hidden Data for Sustainability and Efficiency

By Thomas W. Jackson and Ian R. Hodgkinson

April 2025 – In today’s digital age, data is often referred to as the “new oil.” Unlike oil, however, data is not a finite resource: total global data creation is growing exponentially. Businesses are generating, storing, and processing vast amounts of data, but not all of it is useful or even visible, and the impact of this ‘bad’ data represents a significant cost to organisations and the economy. This is where the concepts of structured data, unstructured data, dark data, lost data, and hidden data come into play. Understanding these types of data and how to manage them effectively is crucial for businesses aiming to reduce their digital carbon footprint and improve operational efficiency.

The growing prevalence of AI compounds the problem. With the UK government’s ambitious goal to increase public-controlled AI computing power twentyfold by 2030, the demand for electricity is set to skyrocket. To illustrate, the amount of electricity required for global AI computation in 2026 is expected to be similar to the amount of electricity consumed annually by Finland. This raises critical questions about whether renewable energy can meet the demand while supporting broader electrification goals. In this article, we’ll explore the different types of data, the challenges of dark, lost, and hidden data, and how businesses can turn these data types into “light data” while contributing to sustainability efforts.

Understanding Structured and Unstructured Data

Structured Data
Structured data is the backbone of most business operations. It refers to data that is organised in a predefined format, typically stored in relational databases. Examples include customer information, transaction records, and inventory data. Structured data is easy to search, analyse, and process, making it highly valuable for decision-making and operational efficiency.

However, even structured data can become problematic if not managed properly. Redundant, obsolete, or trivial (ROT) data can accumulate, consuming storage space and energy without providing any value, resulting in a large amount of data waste. Regular audits and efficient storage solutions are essential to ensure that structured data remains relevant and useful.

Unstructured Data
Unstructured data, on the other hand, lacks a predefined format. It includes emails, videos, social media posts, sensor data, and more. According to the pioneering OASIS White Paper, the volume of unstructured data is growing rapidly, driven by the proliferation of IoT devices and digital communication tools. While unstructured data can hold valuable insights, extracting meaningful information from it requires advanced tools like Natural Language Processing (NLP) and AI.

The challenge with unstructured data is that it often goes unutilised, contributing to the growing problem of data waste. Without proper management, unstructured data can become a significant drain on resources, both in terms of storage and energy consumption.

The Growing Problem of Dark Data, Lost Data, and Hidden Data

Dark Data
Dark data refers to information that organisations collect, process, and store but fail to use effectively. Take sensor data generated by Internet of Things (IoT) devices: as much as 90% of it is never used. Splunk reported that, for one third of UK organisations, such unquantified and untapped data amounts to more than three quarters of the data they hold in storage. This unused data not only represents a missed opportunity for insights but also contributes significantly to energy consumption and carbon emissions. A use case presented in the World Economic Forum’s recent white paper on AI and energy suggests that around 10-20% of dark data can hold additional value, with the rest to be evaluated for deletion.

Lost Data
Lost data is a subset of dark data that is stored but effectively “lost” within data centres or devices due to poor labelling, organisation, or maintenance. This includes backup or archive data that hasn’t been properly maintained, as well as log files that could provide valuable insights into data usage patterns but are difficult to locate. Lost data is particularly problematic because it consumes storage resources and energy without providing any value, yet it often remains untouched due to the difficulty of retrieving it.

Hidden Data
Hidden data refers to information embedded within existing datasets that cannot be seen immediately and requires extraction or manipulation to become useful. David Hand offers a critical analysis of dark data types and emphasises how hidden data (data we cannot see) may bias data interpretation, resulting in sub-optimal decision-making. Examples of where hidden data may be found include documents, emails, videos, or audio files that need further processing to reveal their value. Hidden data often stays that way because it is not immediately accessible or understandable. However, with the right tools and techniques, hidden data can be transformed into valuable insights.

Types of Dark, Lost, and Hidden Data

The OASIS White Paper categorises dark data sources into four main types, which also apply to lost and hidden data:

1. Traditional Structured Data: Data that is manually input or generated but no longer actively used.
2. IoT Data Streams: Data captured from IoT devices that lack sufficient context or tagging.
3. Unstructured Data: Information embedded in formats like videos, emails, or documents that require extraction to be useful.
4. System-Generated Data: Log files and other data created by systems that are rarely, if ever, accessed.

Dark, lost, and hidden data are hidden costs for businesses. They consume energy, increase storage costs, and complicate data management. However, with the right strategies, businesses can turn these data types into “light data”: data that is actively used and provides value.

Turning Dark, Lost, and Hidden Data into Light Data

1. Data Classification and Metadata Management
The first step in addressing dark, lost, and hidden data is to classify data effectively. Metadata, data about data, plays a crucial role in this process. By tagging data with metadata, businesses can systematically identify its relevance and utility. For example, metadata can indicate when data was created, last modified, and how frequently it is accessed. This information helps organisations decide whether to retain, archive, or delete data.

Automated tools and AI-driven solutions can streamline metadata management, reducing the manual effort required to classify and tag data. These tools can also help identify patterns in data usage, enabling businesses to prioritise valuable data and eliminate unnecessary storage.
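
To make the retain, archive, or delete decision concrete, here is a minimal Python sketch of metadata-driven classification. The DataRecord fields, the classify function, and the three-year review threshold are all hypothetical and chosen purely for illustration; a real retention policy would depend on each organisation's legal and business requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataRecord:
    """Metadata for one stored item (all fields are illustrative)."""
    name: str
    created: datetime
    last_modified: datetime
    accesses_last_year: int


def classify(record: DataRecord, now: datetime,
             review_after: timedelta = timedelta(days=3 * 365)) -> str:
    """Return 'retain', 'archive', or 'review_for_deletion'.

    The rules and the three-year threshold are assumptions, not a standard.
    """
    # Anything accessed in the past year is treated as actively used.
    if record.accesses_last_year > 0:
        return "retain"
    # Unused and untouched for a long time: flag for a human deletion review.
    if now - record.last_modified >= review_after:
        return "review_for_deletion"
    # Unused but relatively recent: a candidate for cheaper archive storage.
    return "archive"
```

Flagging items for review rather than deleting them automatically keeps a person in the loop, which matters when the metadata itself may be incomplete.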

2. Regular Audits and Data Minimisation
Conducting regular audits of digital records is essential for maintaining an efficient data management system. Audits help identify redundant, obsolete, or trivial (ROT) data that can be safely deleted or archived. Data minimisation strategies, collecting only the data necessary for a specific purpose and retaining it only as long as needed, can further reduce the accumulation of dark, lost, and hidden data.
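
One small part of such an audit, finding redundant copies, can be sketched in Python by grouping items whose contents hash to the same value. The find_duplicates function and its input format (a mapping from item name to raw bytes) are assumptions made for this example; it addresses only the "redundant" part of ROT, since judging whether data is obsolete or trivial needs business context.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group item names whose contents are byte-for-byte identical.

    `files` maps a name to its raw content (a hypothetical input format).
    """
    by_digest = defaultdict(list)
    for name, content in files.items():
        # Identical content always produces the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Keep only digests shared by more than one item, in a stable order.
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)
```

Each group returned is a set of exact copies, of which all but one could be considered for deletion or archiving.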

3. Efficient Storage Solutions
Implementing efficient storage solutions is another critical step. Centralised data management systems and tiered storage strategies can help organisations optimise their storage use. For example, frequently accessed data can be stored on high-performance systems, while less critical data can be moved to lower-cost, energy-efficient storage. Cloud services with green certifications can also reduce the environmental impact of data storage and processing.
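
A tiering rule of this kind can be expressed in a few lines of Python. The tier names ("hot", "warm", "cold") and the access thresholds below are assumptions for illustration only; actual cut-offs would depend on the storage platform and its pricing.

```python
def choose_tier(accesses_per_month: int) -> str:
    """Map an access frequency to a storage tier (thresholds are assumed)."""
    if accesses_per_month >= 100:
        return "hot"    # high-performance storage for frequently used data
    if accesses_per_month >= 1:
        return "warm"   # standard storage for occasionally used data
    return "cold"       # lower-cost, energy-efficient storage for idle data


def plan_tiers(access_counts: dict[str, int]) -> dict[str, str]:
    """Assign a tier to every dataset in a name -> monthly-accesses mapping."""
    return {name: choose_tier(count) for name, count in access_counts.items()}
```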

4. Leveraging AI and Machine Learning
AI and machine learning can play a pivotal role in managing dark, lost, and hidden data. These technologies can analyse large datasets to identify patterns, extract insights, and automate data classification. For example, AI can help convert unstructured data into structured formats, making it easier to analyse and use. Additionally, AI-driven energy management systems can optimise the energy consumption of data centres, further reducing the carbon footprint of digital operations.

5. Extracting Value from Hidden Data
Hidden data, often embedded in unstructured formats, can be unlocked using advanced tools like Natural Language Processing (NLP) and AI. For example, NLP can extract keywords, entities, and sentiment from text documents, while AI can analyse video and audio files to uncover valuable insights. By investing in these technologies, businesses can transform hidden data into actionable information.
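
As a very rough illustration of keyword extraction, the Python sketch below counts the most frequent non-trivial words in a piece of text. This is a simple frequency count using only the standard library, not a full NLP pipeline: it does not detect entities or sentiment, and the STOPWORDS list and top_keywords function are hypothetical names introduced for this example.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, assumed for illustration only.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "was", "it"}


def top_keywords(text: str, n: int = 5) -> list[str]:
    """Return the n most frequent words in `text`, ignoring stopwords."""
    # Lowercase the text and split it into alphabetic tokens.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]
```

Applied to a maintenance note or an email, even this crude count can suggest tags for a document that would otherwise carry no searchable metadata.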

6. Recovering Lost Data
Lost data, often buried in poorly maintained archives or log files, can be recovered through systematic data audits and the use of advanced search tools. By identifying and cataloguing lost data, businesses can determine its value and decide whether to retain, archive, or delete it. This process not only frees up storage space but also ensures that valuable data is not overlooked.
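
Cataloguing can start with something as simple as an inventory of what an archive actually contains. The Python sketch below walks a directory tree and records each file's relative path, size, and last-modified time, largest first. The build_catalogue function and the fields it records are assumptions for this example; a production catalogue would typically add ownership, content type, and retention labels.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_catalogue(root: str) -> list[dict]:
    """List every file under `root` with basic metadata, largest first."""
    entries = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            stat = path.stat()
            entries.append({
                # Path relative to the archive root, with forward slashes.
                "path": path.relative_to(root).as_posix(),
                "size_bytes": stat.st_size,
                "last_modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
            })
    # Largest files first, since they consume the most storage.
    return sorted(entries, key=lambda entry: entry["size_bytes"], reverse=True)
```

Sorting by size puts the items that cost the most to store at the top of the list, which is where a retain, archive, or delete review is likely to pay off soonest.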

Can Renewable Energy Meet Future Data Demand?

While the UK has made significant progress in renewable energy production, scaling renewable energy at the required pace to meet increasing energy demands from the explosion of new data creation will be challenging. Renewable generation has grown at an average annual rate of 5-7%, but maintaining this trajectory through 2030 will require overcoming technical, financial, and spatial constraints.
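
To put these growth rates in perspective, the short Python sketch below compounds an annual rate over a number of years and, conversely, works out the annual rate implied by a given overall multiple. The five-year horizon (2025 to 2030) is an assumption made for this illustration, and the function names are hypothetical. Note that AI computing capacity and electricity generation are different quantities, so the comparison indicates scale only.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Overall multiple after growing at `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years


def required_annual_rate(multiple: float, years: int) -> float:
    """Annual growth rate needed to reach `multiple` in `years` years."""
    return multiple ** (1 / years) - 1


# Assumed horizon: five years, from 2025 to 2030.
low = compound_growth(0.05, 5)        # about 1.28 times today's level
high = compound_growth(0.07, 5)       # about 1.40 times today's level
implied = required_annual_rate(20, 5) # about 0.82, i.e. roughly 82% a year
```

Under these assumptions, 5-7% annual growth adds roughly 28-40% to renewable generation over five years, whereas a twentyfold increase over the same period corresponds to growth of roughly 82% a year.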

The upshot is that we face a data doomsday scenario by 2033, in which energy demand from data will outstrip global electricity production if current forecast trends continue. In the shorter term, the demands of digital infrastructure could outstrip renewable energy supply by the end of 2025. With the compounding impact of heating and transportation electrification, the consequence will be an increased reliance on fossil fuels during peak demand periods.

By turning dark, lost, and hidden data into light data, businesses can not only improve their bottom line but also contribute to the global effort to combat climate change. The time to act is now. Sustainable data management is not just a trend; it’s a necessity for the future.

For more information about digital decarbonisation, visit www.oasisgroup.com or contact digitaldecarbonisation@oasisgroup.com.

Article commissioned by OASIS Group.