Green Coding and IT Energy Consumption: Managing the Data Flood

In our ongoing exploration of Green Coding and IT Energy Consumption, we have thus far discussed various strategies and technologies to reduce the carbon footprint of the information technology sector. From optimizing code and adopting energy-efficient hardware to embracing renewable energy sources, these approaches are crucial steps towards a greener future. In this seventh installment of our series, we will address another critical aspect of the IT landscape: data, its exponential growth, and the energy implications associated with its management.

The Flood of Data

Data, often referred to as the lifeblood of the digital age, is at the core of every IT operation. It is generated, consumed, and analysed on an unprecedented scale. The sources of this data are diverse and include analytics, application instrumentation, social media activities, and user-generated content such as photos, videos, and text. This continuous influx of data is both a blessing and a challenge for companies in today’s data-driven world.

Data Collection and Management Challenges

While data can be a valuable asset, it can also be a source of waste if not handled correctly. Inefficient data collection practices, high collection frequencies, storage in inefficient formats, and unnecessary data retention can contribute to this waste. As applications evolve and device sensors become more accurate, the volume of data generated increases exponentially.
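
To make the effect of collection frequency and storage format concrete, here is a minimal Python sketch. The sensor reading, its field names, and the two sampling intervals are hypothetical values chosen for illustration, not figures from any real system.

```python
import json
import struct

# Hypothetical sensor reading; field names and values are illustrative assumptions.
reading = {"timestamp": 1700000000, "sensor_id": 42, "temperature_c": 21.5}


def encode_json(r):
    """Encode the reading as UTF-8 JSON text."""
    return json.dumps(r).encode("utf-8")


def encode_binary(r):
    """Pack the same fields as little-endian uint32, uint16, float32 (10 bytes)."""
    return struct.pack("<IHf", r["timestamp"], r["sensor_id"], r["temperature_c"])


def bytes_per_day(record_size, interval_seconds):
    """Daily volume for one sensor that is sampled every interval_seconds."""
    readings_per_day = 86_400 // interval_seconds
    return record_size * readings_per_day


json_size = len(encode_json(reading))
binary_size = len(encode_binary(reading))

# Compare one reading per second with one reading per minute.
for interval in (1, 60):
    print(
        f"every {interval:>2} s: "
        f"JSON {bytes_per_day(json_size, interval):,} B/day, "
        f"binary {bytes_per_day(binary_size, interval):,} B/day"
    )
```

In this sketch, sampling once per minute instead of once per second cuts the volume by a factor of 60 regardless of the encoding. Whether a lower frequency or a more compact format is acceptable depends on what the application actually needs from the data.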

User behaviour also plays a crucial role in data growth. Video content consumes more data per unit of information than text or images, and users increasingly expect personalized delivery, so data transmission needs have grown significantly. Personalization also means a shift from broadcast, where one transmission serves many recipients, to unicast, where each recipient receives a separate stream, and this has implications for energy consumption.

Data on a Global Scale

The scale of global data volumes is hard to grasp. According to IDC, the total data volume in 2018 was estimated at 33 zettabytes (one zettabyte equals 1 billion terabytes), and this figure is projected to reach 175 zettabytes by 2025. This growth is a testament to the digital transformation that the world is undergoing.
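
The IDC figures above imply a yearly growth rate that can be worked out with a few lines of Python. This is only a rough sketch: it assumes smooth compound growth between the two quoted data points, which is a simplification and not something the IDC estimate itself states.

```python
# Figures quoted from IDC in the text above.
volume_2018_zb = 33
volume_2025_zb = 175
years = 2025 - 2018

# One zettabyte is 10**21 bytes and one terabyte is 10**12 bytes.
TB_PER_ZB = 10**21 // 10**12

growth_factor = volume_2025_zb / volume_2018_zb
# Compound annual growth rate, assuming smooth growth between the two data points.
cagr = growth_factor ** (1 / years) - 1

print(f"2018 volume: {volume_2018_zb * TB_PER_ZB:,} TB")
print(f"Growth factor: {growth_factor:.1f}x over {years} years")
print(f"Implied annual growth: {cagr:.1%}")
```

Under that assumption the volume grows by roughly 27% per year, which corresponds to a doubling about every three years.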

Interestingly, while data storage is shifting towards centralised repositories like data centres and the cloud, the majority of data generation still occurs on endpoint devices. This dichotomy creates substantial data transfer requirements between endpoints and centralised solutions.

In 2017, consumer-generated data accounted for 47% of all data, but this is expected to decrease to 36% by 2025, with businesses generating the rest. However, there is limited statistical information available on the energy consumption of data storage, making it challenging to quantify its environmental impact accurately.

Unknown Energy Consumption

Reliable statistics on the energy consumption of data storage are hard to find. The estimates that exist are outdated and tend to combine storage and data transfer costs. Data replication must also be considered: cloud services and data centres in particular store multiple copies, or replicas, of data to keep it available when hardware fails, so every stored byte occupies several times its size in physical capacity.
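
The effect of replication on storage needs can be sketched with simple arithmetic. The replication factor of 3 and the 100 TB dataset below are assumptions for illustration only; real systems differ and may use other redundancy schemes, such as erasure coding, with lower overhead.

```python
def physical_storage_tb(logical_tb, replicas):
    """Raw capacity needed when every byte is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("at least one copy is required")
    return logical_tb * replicas


logical_tb = 100  # hypothetical dataset size, not a figure from the text
replicas = 3      # assumed replication factor, not a figure from the text

raw_tb = physical_storage_tb(logical_tb, replicas)
overhead_tb = raw_tb - logical_tb

print(f"{logical_tb} TB of data occupies {raw_tb} TB of raw capacity")
print(f"{overhead_tb} TB of that is redundant copies")
```

The point of the sketch is that deleting or never collecting a terabyte of unnecessary data frees several terabytes of physical storage once replicas are counted.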

Data transfer also poses a substantial energy challenge. According to a UNCTAD report, global monthly data transfer reached 230 exabytes in 2020 and is projected to more than triple to 780 exabytes by 2026. While this may seem small compared to IDC’s data volume estimates, it’s important to note that data transfer involves energy costs in addition to those of storage.
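
To put the monthly transfer figures on the same scale as the zettabyte volumes quoted earlier, they can be annualized. The short Python sketch below does only that unit conversion; it makes no claim about how transfer relates to stored volume, and the variable names are illustrative.

```python
# Monthly transfer figures quoted from the UNCTAD report in the text above.
monthly_2020_eb = 230
monthly_2026_eb = 780

EB_PER_ZB = 1000  # one zettabyte equals 1000 exabytes

# Twelve months of transfer, converted from exabytes to zettabytes.
annual_2020_zb = monthly_2020_eb * 12 / EB_PER_ZB
annual_2026_zb = monthly_2026_eb * 12 / EB_PER_ZB

growth_factor = monthly_2026_eb / monthly_2020_eb

print(f"2020: about {annual_2020_zb:.2f} ZB transferred per year")
print(f"2026: about {annual_2026_zb:.2f} ZB transferred per year (projected)")
print(f"Growth factor 2020 to 2026: {growth_factor:.2f}x")
```

Annualized, the 2020 figure comes to a little under 3 zettabytes, and the projected growth factor is closer to 3.4 than to 3.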

Consumer data transfer is expected to continue growing, driven by factors such as improved video resolutions, increased use of videos on social media, expansion of internet usage into new scenarios, and the use of AR, VR, and AI solutions. The primary driver of this growth is video consumption, which accounted for 60% of mobile data transfer in 2022 and is projected to increase to 72% by 2030.

Data-Intensive Technologies

The growth in data is not solely due to changing user behaviour. Data-intensive technologies such as artificial intelligence and online advertising rely heavily on vast datasets. Training data for AI models is massive and constantly expanding to meet the precision requirements of these systems. For example, the web crawl archive produced by Common Crawl, which contributed to the training of models like GPT-3, reached 380 terabytes in October 2022.

All these figures demonstrate the enormity of the challenge of managing data volumes in a sustainable manner. Data transfer, storage with replication, and processing collectively contribute significantly to the energy consumption in the IT industry, and by extension, its carbon emissions.

Conclusion

The data deluge is an undeniable reality of our digital age. As data volumes continue to grow, they bring energy and environmental challenges that must be addressed. While energy-efficient technologies and renewable energy sources are essential components of the green IT movement, managing data more intelligently and reducing unnecessary data replication are equally vital steps towards a more sustainable future.

As we move forward, it’s imperative that we strive for a harmonious balance between the benefits of data-driven insights and responsible data management, all while working towards greener IT solutions. Only through holistic efforts can we hope to mitigate the environmental impact of the digital age and make a meaningful contribution to a more sustainable world.

In our next and final blog post, we will present a comprehensive set of recommendations for key stakeholders in green software development. These insights will empower software developers, component developers, designers, testers, software companies, buyers, and users to take proactive steps towards a more energy-efficient and environmentally friendly software ecosystem. Stay tuned for valuable guidance on building a sustainable digital future.