“Test-Time Scaling Is Going to Go Through the Roof”
Jensen Huang, NVIDIA, CES 2025
Artificial intelligence is rapidly transforming how we use technology, not just in our daily lives but also in addressing some of the world’s biggest challenges, from healthcare to climate change. NVIDIA is laying the foundation of the AI revolution, and its GPUs are the workhorse for training AI models. As models grow more capable and larger in parameter count, electricity consumption and energy demand rise with them. The energy used by AI systems varies with task complexity, but it is generally significant. Early applications were estimated to need about ten times the electricity of a conventional search to respond to a user query: roughly 0.3 watt-hours for a traditional Google search versus 2.9 watt-hours for a ChatGPT query. (1)
The compute and energy required for AI can be divided into two parts: training and inference. Energy consumption for training depends on the data center design, the energy mix at its location, the GPU hardware used, the neural network architecture (for example, transformers), the number of epochs and training runs, the parameter count, and the size of the dataset. Inference compute is what is consumed once the model is deployed in cloud environments for prompting and general use. With the most recent advances in reasoning and in agentic AI systems, where several models coordinate with one another, test-time scaling means the compute required at inference time is going to go “through the roof”. (2)
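The training-side factors listed above can be combined into a rough back-of-envelope estimate. The sketch below is illustrative only and is not Pebble-Falcon's method: it assumes data center design can be summarized by a power usage effectiveness (PUE) factor, that GPU hardware is represented by an average power draw, and that epochs and training runs are folded into total wall-clock hours. The function name and every number in the usage line are hypothetical.

```python
def training_energy_kwh(num_gpus, avg_gpu_power_watts, hours, pue=1.0):
    """Rough facility-level energy for one training run, in kWh.

    Assumptions (not from the article):
      - avg_gpu_power_watts is the mean draw per GPU over the run
      - hours is total wall-clock time across all epochs and runs
      - pue (>= 1.0) scales IT energy up to whole-facility energy
    """
    if num_gpus < 0 or avg_gpu_power_watts < 0 or hours < 0:
        raise ValueError("inputs must be non-negative")
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    # Energy drawn by the GPUs themselves, converted from Wh to kWh.
    it_energy_kwh = num_gpus * avg_gpu_power_watts * hours / 1000.0
    # Add cooling and other facility overhead via the PUE multiplier.
    return it_energy_kwh * pue


# Hypothetical run: 8 GPUs at 400 W average for 10 hours, PUE of 1.5.
estimate = training_energy_kwh(8, 400, 10, pue=1.5)
print(f"Estimated training energy: {estimate:.1f} kWh")
```

An estimate like this ignores CPU, memory, networking and storage power, so it is better read as a lower bound than as a measurement.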
Data Centers and their rising energy demands
Behind the AI boom is a massive network of data center infrastructure. Investment in data centers, both worldwide and in the United States, continues to soar. In 2023, overall capital investment by Google, Microsoft and Amazon, which are industry leaders in AI adoption and data center installation, was higher than that of the entire US oil and gas industry, totalling around 0.5% of US GDP. (3) At CES, Arm recently estimated that US data centers’ share of the country’s energy footprint would increase 13x, from 1.5% in 2021 to over 20% by 2030.
Figure: Data center investment in the United States from 2014 to 2024 (indexed so that December 2019 = 1)
Energy-aware computing using Pebble-Falcon
Leveraging GPU efficiency and the recently announced NVIDIA Blueprint ecosystem, Pebble-Falcon is positioned to take advantage of the massive opportunity in tracking and reducing the carbon and water footprint of workloads running in cloud and data center environments. Pebble-Falcon™ is a purpose-built energy management solution, designed to deliver the industry’s first multi-factor sustainability cloud solution. It uses agentic AI to optimize workloads running in cloud environments for performance, energy efficiency, security, redundancy, and carbon and water reduction. Cloud providers such as AWS (4) and GCP (5)(6) offer carbon monitoring and intelligent tools that reduce energy usage and use carbon-free energy as much as possible. Our objective is to offer a differentiated view: energy-efficiency optimization in a single pane of glass, at the application level and in near real time, for single-cloud, hybrid, or multi-cloud environments.
Optimizing AI Models for Energy Efficiency
For AI models, we measure the carbon footprint of both training, when information about the GPUs and data center is available, and inference. We can identify the cloud region a model runs in and use that region’s electricity mix to determine its carbon intensity, from which we calculate the carbon footprint of serving the model.
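The region-based calculation described above amounts to multiplying the energy consumed by the carbon intensity of the grid where the workload runs. The following is a minimal sketch under stated assumptions, not the product's implementation: the region names and intensity values are placeholders invented for illustration, and a real system would take them from a live carbon-intensity data source.

```python
# Hypothetical carbon intensities in grams of CO2-equivalent per kWh.
# These are placeholder values for illustration, not real grid data.
REGION_INTENSITY_G_PER_KWH = {
    "region-a": 100.0,  # imagined low-carbon grid
    "region-b": 500.0,  # imagined fossil-heavy grid
}


def carbon_footprint_kg(energy_kwh, region, intensity_table=None):
    """Return estimated emissions in kg CO2e for energy used in a region."""
    table = REGION_INTENSITY_G_PER_KWH if intensity_table is None else intensity_table
    if region not in table:
        raise KeyError(f"no carbon intensity known for region {region!r}")
    # kWh * (g/kWh) gives grams; divide by 1000 to report kilograms.
    return energy_kwh * table[region] / 1000.0


# The same 10 kWh of inference lands very differently by region.
for name in sorted(REGION_INTENSITY_G_PER_KWH):
    print(name, carbon_footprint_kg(10, name), "kg CO2e")
```

Because the footprint scales linearly with carbon intensity, the same workload can differ several-fold in emissions purely by where it runs, which is why region is treated as a first-class input.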
Not all tasks require the best and largest model to achieve good results. The compute needed to solve a tough math problem is vastly different from that needed to answer a simple factual question such as “What is the capital of France?” Research suggests that energy efficiency varies significantly with the model used for tasks such as text generation, image classification, and summarization. (7) We would have access to the user prompt, model response, model parameters, GPU, cloud metrics, host, and (edge) device, and could recommend which model, host, GPU, and (edge) device would be the most appropriate and energy efficient. Energy efficiency ratings for models are available on the Hugging Face AI Energy Score Leaderboard. (8)
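One simple way to picture such a recommendation is as a filter followed by a minimum: keep the models that can handle the task at an acceptable quality, then pick the one that uses the least energy. The sketch below assumes a hypothetical catalog; the model names, quality scores and energy figures are invented for illustration and are not taken from the AI Energy Score Leaderboard or from Pebble-Falcon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    task: str                 # e.g. "text-generation", "summarization"
    quality: float            # hypothetical score between 0 and 1
    wh_per_1k_queries: float  # hypothetical energy per 1,000 queries


def recommend_model(catalog, task, min_quality):
    """Return the lowest-energy model meeting the quality bar, or None."""
    candidates = [
        m for m in catalog
        if m.task == task and m.quality >= min_quality
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.wh_per_1k_queries)


# Hypothetical catalog with made-up numbers.
CATALOG = [
    ModelProfile("small-llm", "text-generation", 0.72, 40.0),
    ModelProfile("medium-llm", "text-generation", 0.85, 180.0),
    ModelProfile("large-llm", "text-generation", 0.93, 900.0),
    ModelProfile("tiny-summarizer", "summarization", 0.80, 15.0),
]

# A simple factual question tolerates a lower quality bar than hard math.
easy = recommend_model(CATALOG, "text-generation", min_quality=0.70)
hard = recommend_model(CATALOG, "text-generation", min_quality=0.90)
print(easy.name, hard.name)
```

A production recommender would also weigh latency, cost, host and device, but the structure stays the same: constrain on what the task needs, then minimize energy.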
NVIDIA Developer Program
At Pebble, we are integrated into the NVIDIA AI developer ecosystem as we build out our set of tools for energy management. For fine-tuning and model training, we plan to use the NVIDIA NeMo framework. To deploy models efficiently, we use NVIDIA NIM (NVIDIA Inference Microservices), which allows us to run AI models on NVIDIA GPUs on AWS managed services such as EKS and SageMaker, and on edge devices. We are excited about NVIDIA’s Blueprints for agentic AI as we go from prototyping agents to deploying them in production environments.
In closing, Pebble-Falcon™ empowers organizations to significantly reduce the energy consumption of their AI workloads, enabling a more sustainable approach to cloud computing. This leads to both cost savings and a measurable reduction in the carbon footprint of AI-driven operations.
1 https://www.epri.com/research/products/3002028905
2 https://www.forbes.com/sites/johnwerner/2025/01/09/jensen-huang-at-ces-agentic-ai-big-hardware-and-more/
3 https://www.iea.org/commentaries/what-the-data-centre-and-ai-boom-could-mean-for-the-energy-sector
4 https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/ccft-estimation.html
5 https://blog.google/outreach-initiatives/sustainability/carbon-aware-computing-location/
6 https://cloud.google.com/sustainability/region-carbon
7 https://www.nature.com/articles/d41586-024-02680-3
8 https://huggingface.co/spaces/AIEnergyScore/2024_Leaderboard