
The Jevons Paradox: Why More Efficient AI Consumes More Energy

Google says the emissions behind a typical text prompt fell 98% in a year. That sounds like the kind of progress story Silicon Valley loves: the same magic, almost none of the guilt. If you stop at that number, the future looks oddly manageable.

It probably is not.

Kate Crawford, the USC professor behind Atlas of AI, has been pointing to an older logic that cuts through the celebration. In 1865, the economist William Stanley Jevons observed in The Coal Question that more efficient steam engines did not reduce coal consumption. They increased it. Once steam power became cheaper to run, people found more things worth powering. The total appetite rose faster than the per-unit savings.

AI fits that pattern with alarming precision. Every gain in efficiency lowers the cost of asking for more model output, embedding it in more products, and normalizing it in more moments of daily life. The climate math is driven by totals, not press releases.

Efficiency changes price, and price changes behavior

Jevons' original point is often flattened into a slogan. The deeper idea is about elasticity. If a resource becomes cheaper to use, demand often expands. Sometimes a little. Sometimes a lot. It depends on whether there were latent uses waiting for a price drop.

With AI, there are many.
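A toy calculation makes the elasticity point concrete. Treat energy cost per query as the effective price and let demand respond with a constant elasticity; every number below is illustrative, not a measurement.

    # Toy rebound arithmetic: per-query energy falls 98%, and demand
    # responds with a constant price elasticity. Illustrative only.
    base_energy = 1.0                 # energy per query, arbitrary units
    base_volume = 1.0                 # query volume, arbitrary units
    new_energy = base_energy * 0.02   # a 98% per-query reduction
    price_ratio = new_energy / base_energy

    for elasticity in [0.0, 0.5, 1.0, 1.5]:
        # Constant-elasticity demand: volume scales as price^(-elasticity).
        new_volume = base_volume * price_ratio ** -elasticity
        total = new_volume * new_energy
        print(f"elasticity {elasticity}: volume x{new_volume:.1f}, "
              f"total energy x{total:.2f}")

At elasticity zero, the capped case that returns later in this piece, the savings are absolute: total energy drops 98%. At elasticity 1 the gain is exactly swallowed by new volume, and above 1 total consumption rises. The argument of this essay is that AI sits uncomfortably high on that scale.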

When inference gets cheaper, companies do not simply provide the same number of responses with lower energy. They redesign products around constant model invocation. Search results get rewritten into generated summaries. Email apps offer to draft replies you never asked for. Customer support starts every interaction with a chatbot. Office suites add assistants to documents, slides, and spreadsheets. Messaging apps suggest rewrites. Note-taking tools transcribe and summarize by default. Cameras add generative cleanup. Phones turn simple settings pages into conversational interfaces because apparently every menu now needs a muse.

This is the part that matters. The efficiency gain changes the business case, not just the engineering chart. A feature that was too expensive at scale last year becomes viable this year. Once it is viable, product managers stop asking whether it is necessary. They ask how broadly it can be deployed.

The result is a classic rebound effect, but with software speed and venture incentives. Coal took time to spread through factories, railroads, and homes. AI can spread through an app update.

The headline number is narrower than it looks

Google's reported 98% reduction is not meaningless. Improvements in chips, model architecture, scheduling, and quantization are real. Smaller models can often do useful work with far less electricity than earlier generations. Better serving stacks reduce waste. A response generated on newer hardware can be dramatically more efficient than the same task on older infrastructure.

But that headline lives inside a carefully bounded frame.

First, it concerns text generation. Text is important, but it is also among the lighter workloads in the generative universe. Image generation is hungrier. Video is hungrier still. Multimodal systems that ingest audio, images, long documents, and live context shift the energy picture again. A company can celebrate a steep drop in the emissions of one narrow task while its broader AI portfolio gets materially heavier.

Second, per-query efficiency says little about total system demand. If one response becomes ten times cheaper, firms rarely stop at the old volume. They increase usage until the new cost curve fills with fresh demand. A model that once handled a few premium features can now sit behind every keystroke.

Third, these announcements usually center operational emissions for inference in a defined setting. They rarely capture the whole chain with equal clarity: training runs, repeated fine-tuning, failed experiments, hardware manufacturing, accelerated refresh cycles, transmission upgrades, water use for cooling, and the local strain of concentrated data center growth. Those are not footnotes. They are part of the bill.

Researchers such as Sasha Luccioni and Emma Strubell have spent years making this point in adjacent forms: efficiency metrics are useful, but only if paired with deployment scale. A cleaner transaction can still sit inside a dirtier system.

AI demand is unusually elastic

A lot of technologies get more efficient without causing disaster. Refrigerators became more efficient, and households did not respond by buying forty fridges. There is rebound in many sectors, but the magnitude varies.

AI is different because the supply creates the demand.

Most people were not wandering around in 2022 thinking, "I wish my notes app would generate three paraphrases of my grocery list." They did not ask for a language model in every inbox, every search bar, every coding tool, every image editor, every classroom portal, every HR workflow. Those uses are being manufactured by platforms because the cost has fallen enough to make omnipresence look feasible.

This matters more than the models themselves. The environmental risk is not only that each large model uses energy. It is that the industry is systematically changing the baseline expectation of what software should do. Once generation is cheap enough, interfaces start calling the model even when the user would have happily clicked a button, read a paragraph, or typed a sentence unaided.

That is a behavioral transformation disguised as technical progress.

Crawford's warning lands here. The issue is not just faster chips or cleaner data centers. It is ubiquity. If AI becomes ambient infrastructure, demand stops being occasional and becomes continuous. Your photo app pings a model. Your productivity suite pings a model. Your browser pings a model. Enterprise software pings a model behind workflows nobody notices, because procurement signed the contract and the feature shipped enabled by default.

The electricity meter notices.

Bigger models did not go away

There is a comforting story in the industry that smarter engineering will let us do more with less. Sometimes that is true. Distillation, sparse architectures, better caching, and quantization all help. But these savings coexist with another trend: the race toward larger, more capable, more multimodal systems and longer-running tasks.

The old benchmark for a model request was a text prompt and a short reply. That is not where the market is heading. The newer pattern includes long contexts, tool use, retrieval over large corpora, persistent memory, voice interaction, image analysis, code execution, and agentic loops that call the model repeatedly until a task is done. Even if each individual call gets cheaper, the task structure becomes denser.
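The density shift is easy to see in round numbers. A hypothetical sketch, assuming calls get ten times cheaper while a task grows from one call to a thirty-call agent loop:

    # Per-call savings vs. per-task energy as tasks become agent loops.
    # All figures are hypothetical.
    old_call, new_call = 1.0, 0.1    # energy per call, a 10x improvement
    old_calls, new_calls = 1, 30     # one reply vs. retrieval, tools, retries

    print(f"old task: {old_calls * old_call:.1f} units")  # -> 1.0
    print(f"new task: {new_calls * new_call:.1f} units")  # -> 3.0
    # Each call is 10x cheaper; the task costs 3x more energy.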

Then there is video. Text generation is relatively lightweight because language compresses intent into sparse symbols. Video is a flood. To generate or transform moving images at high quality requires far more computation. If generative video becomes a standard product feature rather than a niche tool, efficiency gains in text will not rescue the aggregate picture.

Model size tells a similar story. The public conversation moved from billions of parameters into trillion-parameter territory with a speed that would be funny if it were not attached to power contracts. Mixture-of-experts architectures soften the cost per inference by activating only part of the network, but they also enable larger systems and more ambitious products. Savings at one layer often bankroll expansion at another.

This is why the phrase "AI is getting greener" can be deeply misleading. It can mean a narrow task is becoming cheaper at the same moment the total landscape becomes much larger.

Grids experience totals, not efficiencies

Electric utilities do not plan around elegant ratios. They plan around load.

That is where the AI boom becomes concrete. Some forecasts place AI-related electricity demand on a scale comparable to large national systems by the end of the decade. Estimates vary, and they should be treated cautiously, but the direction is clear. Data centers are becoming a defining growth sector for power demand after years in which electricity consumption in many regions was relatively stable.

The range often cited for AI-driven growth in electricity demand over the next decade is wide, roughly 30% to 90% depending on assumptions. The uncertainty is real. The broad point survives it. Even the conservative scenarios imply a massive buildout of generation, substations, transmission, cooling, and backup infrastructure.
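To see what that spread means physically, apply it to a placeholder baseline. The 400 TWh figure below is a hypothetical stand-in, not a measurement:

    # Decade-growth scenarios for AI-related electricity demand.
    # The baseline is a hypothetical placeholder.
    baseline_twh = 400.0  # assumed AI/data-center load today, TWh per year

    for label, growth in [("conservative", 0.30), ("aggressive", 0.90)]:
        final = baseline_twh * (1 + growth)
        print(f"{label}: {final:.0f} TWh/yr, "
              f"{final - baseline_twh:.0f} TWh/yr of new supply to build")

Even the conservative line means finding on the order of a hundred terawatt-hours of new annual supply that someone has to site, permit, and build.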

Take Virginia, already one of the clearest cases. Data centers account for roughly a quarter of the state's electricity use, and projections for the coming years climb much higher. When that share rises, the impact is not abstract. Utilities seek new generation. Transmission corridors expand. Rate structures shift. Local communities face land use battles, water concerns, and diesel backup emissions. The cloud eventually turns into zoning meetings and contested power lines.

Or consider the scale of the campuses being discussed. A 10-gigawatt data center complex is not a metaphor. It is roughly the output of about ten large nuclear reactors, or a huge chunk of a regional grid. Build enough of those, and the question stops being whether an individual query is efficient. The question becomes what physical system must be built to sustain the aggregate load.
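The comparison is plain arithmetic, sketched here with round numbers (a large reactor is treated as roughly 1 gigawatt):

    # What a 10 GW campus means in round numbers.
    campus_gw, reactor_gw = 10.0, 1.0     # ~1 GW per large reactor
    annual_twh = campus_gw * 8760 / 1000  # hours per year, GWh -> TWh

    print(f"~{annual_twh:.0f} TWh/year if run flat out")    # -> ~88
    print(f"~{campus_gw / reactor_gw:.0f} large reactors")  # -> ~10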

That load will not always be met by clean power. Companies like to emphasize power purchase agreements and renewable matching. Those mechanisms help, but they do not erase temporal and geographic mismatch. When a data center needs power at a given hour, the grid serves it from the available mix. In many regions, especially under rapid demand growth, that means gas stays online longer, new fossil generation gets justified, or decarbonization timelines slip.

Efficiency helps at the margin. Total demand sets the direction.

The rebound is social before it is electrical

There is a temptation to treat this as a hardware problem with a hardware fix. Better chips. Better cooling. Better power procurement. Those matter, but they miss the social mechanism that drives the rebound.

We are making generated output absurdly cheap, then filling the world with low-value reasons to generate more of it.

Some of that output is useful. Code assistance can save real labor. Accessibility features based on speech and text models can be genuinely important. Translation can widen access. Scientific tools may accelerate work that deserves the energy budget. This is where easy cynicism fails. The point is not that all AI use is frivolous.

The point is that the deployment logic does not distinguish very well between high-value and low-value uses. Once a model is available, everything starts looking like a candidate for generation. A mildly annoying support queue gets a chatbot. A perfectly functional search result gets a summary layer. A work app invents synthetic cheerfulness where a menu would have done the job. The cheapness of inference erodes restraint.

That erosion has a cultural side. It shifts expectations about what software owes us. Reading becomes something the machine should pre-digest. Writing becomes something it should pre-compose. Searching becomes something it should answer instead of expose. The labor saved is often tiny, but the calls accumulate by the billions.

In climate terms, this is a strange trade. We are spending scarce clean electricity and massive industrial effort to automate a growing volume of optional cognition. Some of those uses are profound. Many are decorative.

Technical progress still matters, but it cannot carry the whole burden

None of this means efficiency is pointless. If you are going to run large models, they should be as efficient as possible. Waste is still waste. A less energy-intensive model doing the same useful work is an improvement by any sensible standard.

There are also cases where efficiency can produce absolute reductions. If demand is fixed by a budget, by regulation, or by a narrow enterprise workflow, lower per-task energy use can cut total consumption. If a hospital runs a diagnostic model a set number of times, efficiency helps directly. If a company replaces an older inference stack without expanding the product surface, the climate arithmetic can improve.

The problem is that the mainstream AI market does not look like a capped hospital workflow. It looks like a competitive platform race where every cost drop gets converted into feature expansion, broader distribution, or lower pricing to attract more usage. The rebound is not an accident. It is the business model expressing itself.

That is why treating efficiency as the solution is so attractive. It avoids the harder question: which uses deserve the energy, materials, land, and water they consume? As long as the answer is "nearly all of them, because the marginal cost is falling," total demand will keep outrunning per-unit gains.

The real policy lever is deployment

If climate risk comes from total usage, then governance has to address total usage. That sounds obvious, yet most public discussion stays focused on model capability and chip supply.

A more serious approach would force visibility into the full energy profile of AI services across modalities, not just the easiest benchmark to improve. It would make default-on generative features harder to justify when their value is marginal. It would push companies to disclose when a product change increases background model calls across millions of users. It would also connect data center approvals to local grid realities instead of treating every new campus as if power will materialize from corporate optimism.

There is room for market discipline too. When AI features are free at the point of use, demand becomes artificially frictionless. Pricing, quotas, and carbon-aware scheduling can all shape behavior, though none is a silver bullet. The larger point is that deployment choices are not natural phenomena. They are policy choices, product choices, and governance choices.
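Carbon-aware scheduling, for instance, can be almost trivially simple for deferrable batch work. A minimal sketch, assuming an hourly carbon-intensity forecast; the values are invented:

    # Run a deferrable batch inference job in the lowest-carbon window.
    # Forecast values are invented for illustration.
    forecast_gco2_per_kwh = {
        "00:00": 320,
        "06:00": 410,
        "12:00": 180,  # midday solar dip
        "18:00": 450,
    }

    def pick_window(forecast):
        """Return the start time with the lowest forecast carbon intensity."""
        return min(forecast, key=forecast.get)

    print(f"schedule batch job at {pick_window(forecast_gco2_per_kwh)}")
    # -> schedule batch job at 12:00

Scheduling like this shapes when the energy is drawn, not whether the feature ships, which is exactly why it cannot substitute for deployment decisions.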

That can sound less glamorous than another efficiency breakthrough. It is also where the actual climate outcome gets determined.

A cleaner query can still dirty the future

The seductive mistake is to think the problem shrinks with every technical gain. In AI, the opposite can happen. The moment a model becomes dramatically more efficient, the industry often interprets that as permission to put it everywhere.

Jevons understood the pattern before semiconductors, cloud regions, and trillion-parameter models existed. Cheaper access to a useful capability tends to widen the market for that capability. AI is useful enough, flexible enough, and aggressively distributed enough to make the rebound severe.

So yes, celebrate better chips, smarter inference, and lower emissions per text response. Then look at the total load being added to grids, the spread of generation into mundane software, and the strange ease with which optional features become default infrastructure. The climate question is no longer whether AI can be made more efficient. It is whether we are willing to say that some deployments should not exist at all, because efficiency made them possible without making them necessary.


Published April 2026