Dark Data Steps Into the Infrastructure Spotlight

Sensors are everywhere – turning off highway lights when roads are empty, monitoring the health of bridges, and tracking the intricate dance of telecommunications networks and electrical grids. Every flicker of these sensors is a byte of data, meticulously logged and stored. With the costs of data storage plummeting over the past decade, we’re talking about an avalanche’s worth of data digitally warehoused.

Much of this data has been resting in the dark, unanalyzed and unseen. This is what experts call dark data. And now, as AI enters the infrastructure arena, this dormant data is about to step into the spotlight.

“Indeed, there appears to be an enormous amount of data collected on infrastructure operations that could be better used to improve their effectiveness,” said IEEE Life Senior Member Raul Colcher.

AI thrives on data – the more, the better. And when it comes to training sophisticated AI models, this dark data, collected over years from myriad sensors and systems, may be extremely valuable.

So, what’s the big deal about bringing this dark data to light? For starters, it’s a game-changer for infrastructure operations. With AI algorithms churning through mountains of previously unused data, we can expect leaps in efficiency and new ways to design and use our infrastructure for a future where data moves more frequently than people.


Dark data often goes unused because it isn’t properly tagged, which makes it difficult to analyze. Some research suggests that the machine learning algorithms that allocate resources within mobile phone networks could be greatly improved with the use of dark data. In another case, data scientists at an oil and gas plant were able to use dark data to improve a digital model of the plant without disrupting operations.


The benefits of analyzing and modeling this data are vast and varied. From planning to operations, maintenance, and beyond, every facet of infrastructure could see a transformation. Picture more accurate models, better automation, and a deeper understanding of how our systems truly work.


But it’s not all smooth sailing. Dark data, while abundant, isn’t always clean or error-free. Questions of bias, data provenance, and security loom large. How we address these challenges will be crucial in unlocking the full potential of AI in infrastructure.

“The surge in data quantity doesn’t guarantee better results,” said IEEE Member Qi Qi Wang. “Filtering out disruptive or poor-quality data presents a substantial challenge.”

Learn more: 2023 was a landmark year in AI, as broad swaths of the public became more aware of AI thanks to the power of generative AI tools. IEEE Spectrum covered developments in depth. Check out its rundown of the top AI stories in 2023.