
Image by Emiliano Vittoriosi, from Unsplash
Which AI Is Greener? Study Reveals Differences In Emissions
New research shows how advanced AI models, which we deploy in our daily lives, are generating a substantial environmental impact.
In a rush? Here are the quick facts:
- Advanced AI models emit up to 50 times more CO₂ than simpler ones.
- Reasoning AIs like o3 and R1 use more energy for longer answers.
- Logic-based queries, like math or philosophy, spike emissions significantly.
Large language models (LLMs) designed for deep reasoning — such as OpenAI’s o3, Anthropic’s Claude, and DeepSeek’s R1 — can produce up to 50 times more carbon dioxide emissions than basic AI models when answering identical questions.
“The environmental impact of questioning trained LLMs is strongly determined by their reasoning approach,” said Maximilian Dauner, lead author of the study published June 19 in Frontiers in Communication. “We found that reasoning-enabled models produced up to 50 times more CO₂ emissions than concise response models,” he added.
The study attributes these emissions to the heavy computing power required to process advanced queries, particularly in logic-heavy areas such as algebra and philosophy.
The researchers explain that these reasoning models use a method called “chain-of-thought,” in which the AI breaks a problem down into logical steps, mirroring human problem-solving. Each extra reasoning step adds tokens, producing longer responses that in turn consume more energy.
The researchers conducted their analysis by running 1,000 questions through 14 LLMs. They measured energy consumption on an NVIDIA A100 GPU and assumed that each kilowatt-hour of electricity produced 480 grams of CO₂.
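The accounting described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' actual code: the only figure taken from the study is the assumed grid intensity of 480 grams of CO₂ per kilowatt-hour; the example energy value is hypothetical.

```python
# Illustrative sketch of the study's emissions accounting (not the authors' code).
# Measured GPU energy is converted to CO2 using a fixed grid-intensity factor.

GRID_INTENSITY_G_PER_KWH = 480  # grams of CO2 per kWh, as assumed in the study

def co2_grams(energy_kwh: float) -> float:
    """Convert measured GPU energy (kWh) into grams of CO2 equivalent."""
    return energy_kwh * GRID_INTENSITY_G_PER_KWH

# Hypothetical example: a benchmark run that draws 2.5 kWh on the A100
print(co2_grams(2.5))  # 1200.0 grams of CO2
```

The same linear conversion applies at any scale, which is why per-prompt figures multiply so directly across billions of queries.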
The analysis showed that on average, reasoning models produced 543.5 tokens as output in each response, whereas simpler models generated only 37.7 tokens. The most accurate model, Deep Cogito (with 72 billion parameters), also had one of the largest carbon footprints.
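If one assumes, as the article suggests, that energy use scales roughly with the number of output tokens, the reported averages alone imply a large gap between the two model classes. A minimal back-of-envelope sketch, using only the token counts from the study:

```python
# Back-of-envelope comparison using the study's reported averages.
# Assumption (not from the study): energy scales roughly with output tokens.

reasoning_tokens = 543.5  # average output tokens per response, reasoning models
concise_tokens = 37.7     # average output tokens per response, concise models

ratio = reasoning_tokens / concise_tokens
print(round(ratio, 1))  # 14.4 -> roughly 14x more tokens per answer
```

Under that assumption, reasoning models would use on the order of 14 times more energy per answer from token count alone; the full 50× emissions gap reported in the study also reflects differences in per-token compute.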
“Currently, we see a clear accuracy-sustainability trade-off inherent in LLM technologies,” Dauner explained. “None of the models that kept emissions below 500 grams of CO₂ equivalent achieved higher than 80% accuracy,” he added.
For instance, answering 60,000 questions with DeepSeek’s R1 model would emit as much CO₂ as a round-trip flight between New York and London. Meanwhile, Alibaba Cloud’s Qwen 2.5 model could provide similar accuracy at one-third the emissions.
This isn’t just about emissions per prompt; the researchers say the broader concern is scale. A single question might release only a few grams of CO₂, but multiply that across billions of users and the footprint becomes massive.
The New York Times reports that a 2024 U.S. Energy Department report predicted data centers could consume up to 12% of the national electricity supply by 2028, tripling their 2022 share, with AI as a driving factor.
So what can users do?
“Use AI when it makes sense to use it. Don’t use AI for everything,” said computer science professor Gudrun Socher, as reported by The Washington Post. For basic questions, search engines are usually faster and use much less power. A Google search uses around 10 times less energy than a ChatGPT prompt, according to Goldman Sachs.
Dauner agrees. “If users know the exact CO₂ cost of their AI-generated outputs, such as casually turning themselves into an action figure, they might be more selective and thoughtful about when and how they use these technologies.”
Experts recommend choosing smaller models for simple tasks and reserving larger, more powerful ones for when they are truly needed. Keeping prompts and answers concise also reduces energy use. In the end, the choice isn’t just about speed or accuracy; it’s also about responsibility.