
Data reduction for green artificial intelligence

September 26, 2024

Artificial intelligence (AI) has become a big part of our daily lives, supporting industries and simplifying many tasks, from personalized recommendations to self-driving systems. However, as the use of AI grows, so do concerns about its sustainability, because training AI models is resource-intensive. These models need large amounts of reliable data and often go through long, repetitive training processes. This demands substantial computing power, which leads to high energy consumption and significant CO2 emissions, and the long training times increase the environmental impact further. As AI continues to grow, it is increasingly important to find ways to reduce its carbon footprint and make it more efficient for sustainable development.

The USE team has been working on the development of “Greener AI”, with the goal of making AI more environmentally friendly and efficient. The key to our approach is the use of data reduction techniques, i.e. training models with only a carefully selected fraction of the available data. To achieve this, we apply a concept of data representativeness, which ensures that the chosen subset accurately reflects the entire dataset. This allows us to maintain model performance while significantly reducing computational costs in terms of both CO2 emissions and training time. By training with fewer data points, we aim to balance the effectiveness of AI with sustainability, moving towards a greener future in technology.
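To make the idea of a representative subset concrete, here is a minimal Python sketch. It is not the method used by the USE team: it assumes, purely for illustration, that "representative" means preserving the class proportions of the full dataset, and it uses stratified random sampling to do so. The function name `stratified_subset` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset that keeps each class's share of the data.

    Illustrative assumption: representativeness = preserved class proportions.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for indices in by_class.values():
        # Keep at least one example per class so no class disappears.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Hypothetical toy labels: 80 "person" images and 20 "wheelchair" images.
labels = ["person"] * 80 + ["wheelchair"] * 20
subset = stratified_subset(labels, fraction=0.25)
```

Training would then use only the images whose indices appear in `subset`. Richer notions of representativeness, for example ones based on image features rather than labels alone, would replace the sampling step while keeping the same overall workflow.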

We applied this approach to the detection of people and people in wheelchairs using the YOLO object detection model. We trained the neural network on both the full dataset and a smaller, representative subset, and achieved nearly the same level of performance with far lower CO2 emissions and much shorter computation time. This demonstrates the power of our Greener AI approach. In the video below, you can see both models detecting people and people in wheelchairs, showing that they deliver nearly identical results even when one is trained on the smaller dataset.

All results and more detailed information are discussed in our paper “An in-depth analysis of data reduction methods for sustainable deep learning”, which was accepted by the Open Research Europe platform. You can access it here: https://doi.org/10.12688/openreseurope.17554.1 A QR code for the paper also appears at the end of the video.