I tested GPT-4o image generation on 5 energy-sector use cases
These are the results, unfiltered
GPT-4o image generation was released last week and quickly went viral, thanks to its ability to accurately edit existing images and replicate artistic styles, including Studio Ghibli’s signature aesthetic.
ChatGPT gained 1 million new users in just 1 hour on Monday as the world became “Ghiblified”.
Setting aside the art plagiarism discussion, which I won’t address here, I believe this marks a breakthrough that will permanently change how humanity interacts with visual content. The ability to realistically edit existing images and iteratively improve results through a chat interface makes all the difference compared with previous versions. Before last week, image generation with ChatGPT was very dull and unusable in any practical sense. Now it has entered the realm of tools that can be useful in everyday personal and professional settings.
Another exciting (though almost unnoticed) release last week was Google’s Gemini 2.5. This model currently tops all benchmarks and might very well be the most intelligent model available to date. It is temporarily accessible for free on Google AI Studio. I recommend you check it out.
Amidst all the hype, I thought it would be fun to test both models, so I asked Gemini 2.5 to generate a few use cases for testing GPT-4o image generation in the “buildings, energy, and environment” field. Some of them were not that interesting or relevant; from the rest, I picked the top five:
Renewable Energy Project Site Conceptualization: Generate realistic concept images for proposed renewable energy projects (wind, solar, storage) in specific environments, incorporating multiple elements accurately.
Illustrating Building Energy Efficiency Retrofit Scenarios: Upload an image of an existing building (house, office). Prompt GPT-4o to generate "after" images showing the visual impact of specific energy efficiency upgrades.
Generating Annotated Diagrams of Complex Energy Systems: Create clear, visually appealing diagrams explaining complex energy systems or processes for educational or communication purposes.
Visualizing Energy Audit Recommendations: After a conceptual energy audit, use GPT-4o to create visuals for the report.
Component Identification and Installation Guides: Upload a picture of an energy system component (e.g., a specific type of solar inverter, a smart meter). Ask GPT-4o to generate a simplified diagram identifying key ports or indicators with labels, or create a visual showing the basic steps for mounting it.
1. Renewable Energy Project Site Conceptualization
This use case is especially interesting: a few months ago we tested the ability of Google's Solar API to detect the area of a rooftop suitable for solar panel installation. I decided to try this on the same roof that we analysed in our previous article: a building in Washington D.C.
Prompt: Edit this image highlighting the areas of the rooftop that are suitable for the installation of PV panels.
Output
Prompt: Can you show how the rooftop would look like with panels installed in those areas?
Output
The result is not great: the initial image was cropped, so part of the roof is no longer visible. The highlighted areas in the first image all fall on sections of the roof, but some areas are left out for no apparent reason. Most importantly, the panels in the second image do not correspond to the areas marked in the first image, so it looks like the model cannot accurately keep track of the areas that it marked itself.
The result is not that different from the one we got using Google Solar, but I wouldn’t trust it in a professional setting.
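The mismatch between the highlighted areas and the drawn panels could be put into a number. Below is a minimal sketch, assuming the two outputs have been aligned to the same crop and converted into binary masks of equal size (1 = marked or covered by a panel). I did not do this in the test; the function name and the toy masks are hypothetical. It computes the intersection over union (IoU) of the two masks, where 1.0 means a perfect match and 0.0 means no overlap.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks are treated as identical by convention.
    return intersection / union if union else 1.0


# Toy masks, invented for illustration only (not taken from the real images).
highlighted = [
    [1, 1, 0],
    [1, 1, 0],
]
panels = [
    [0, 1, 1],
    [0, 1, 1],
]

score = mask_iou(highlighted, panels)
print(f"IoU between highlighted areas and panels: {score:.2f}")
```

A low score on real masks would confirm what the eye already sees: the panels were not placed where the model said they should go.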
2. Illustrating Building Energy Efficiency Retrofit Scenarios
I thought it would be interesting to test this with an image of a real building. I have some old fixtures at home that are not that well insulated, so I decided to ask ChatGPT what a retrofit would look like.
Prompt: I'm writing a building energy retrofit proposal plan for a residential unit. Edit this image to show how it would look if triple glazing windows were installed.
Output
That’s quite good. I go on to ask for solar shading as well.
Prompt: Now add solar shading as well.
That’s not bad, although that’s not the type of shading that I was thinking about.
Prompt: Change the solar shading you added with an outside awning.
Pretty good. Next, I want it to add some notes on the image showing the different elements that were added, plus the estimated cost range.
Prompt: Now mark the improvements (triple glazing windows and solar shading) on the image with a colored pencil. Include the price range for both. Make the image brighter.
It didn’t nail the labels, although maybe they could be fixed with more specific prompting.
The final result in this case is quite impressive. I believe this could potentially be used in a professional setting.
3. Generating Annotated Diagrams of Complex Energy Systems
Here I wanted to test the creation of an image for educational purposes: a diagram showing energy flows for a house equipped with solar and batteries.
Prompt: Generate a diagram of a residential grid-tied solar PV system with battery storage. Include: solar panels, inverter, battery pack, smart meter, connection to the grid, and main electrical panel. Show arrows indicating energy flow during daytime charging and nighttime discharging.
That’s a decent start. If I wanted to use this in the newsletter, it would need to match my colors and branding though. Let's see if it can do this.
Prompt: Edit the image to use the colors of my brand.
Output
Now the colors are correct, but the flows are not. I could ask it to generate just the icons and then draw the arrows and text on top myself, but then there’s not much of an advantage over doing the design directly in Canva. My conclusion here is that the model is not yet able to understand the logic behind the image: the sun should be hitting the panels and not the inverter, the switchboard should be connected to the smart meter, etc. For this use case, I don’t think we’re ready for professional or educational use.
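To make "the logic behind the image" concrete, here is a minimal sketch of how the flows in a generated diagram could be checked against the ones I expected. The expected topology is my own simplified assumption for a grid-tied system with battery storage, not a wiring standard, and the "drawn" flows are a hypothetical hand transcription of a diagram with the two errors mentioned above.

```python
# Assumed, simplified energy-flow links (source, destination) for a
# grid-tied PV system with battery storage. Illustrative only.
EXPECTED_FLOWS = {
    ("sun", "solar_panels"),
    ("solar_panels", "inverter"),
    ("inverter", "battery"),        # daytime charging
    ("battery", "inverter"),        # nighttime discharging
    ("inverter", "main_panel"),
    ("main_panel", "smart_meter"),
    ("smart_meter", "grid"),
}


def check_flows(drawn_flows):
    """Compare the arrows found in a diagram with the expected ones."""
    drawn = set(drawn_flows)
    return {
        "missing": sorted(EXPECTED_FLOWS - drawn),
        "unexpected": sorted(drawn - EXPECTED_FLOWS),
    }


# Hypothetical transcription: the sun points at the inverter, and the
# main panel is not connected to the smart meter.
drawn = (EXPECTED_FLOWS
         - {("sun", "solar_panels"), ("main_panel", "smart_meter")}
         ) | {("sun", "inverter")}

report = check_flows(drawn)
print("Missing flows:", report["missing"])
print("Unexpected flows:", report["unexpected"])
```

A check like this still needs a human to read the arrows off the image, but it shows how small the set of rules is that the model failed to respect.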
4. Visualizing Energy Audit Recommendations
Here I decided to use a typical suggestion that might be included in an energy performance certificate.
Prompt: I'm drafting an energy audit for a commercial building. After inspecting the building, I identified several measures that can be taken to improve the energy efficiency of the building. I will send you the description of one of them. Create an image that I can add to the audit report to better explain the proposed action.
EEM: Internal Insulation
Description: This measure involves installing internal thermal insulation, specifically using 9cm thick calcium silicate panels finished with plasterboard, on the walls of the office area. The goal is to improve the thermal performance of the building envelope and reduce heat loss.
Output
Prompt: Make it photorealistic.
Output
There are a few issues with the sizes and distances, but apart from that the result is quite good. I think this could be used in a professional or educational setting.
5. Component Identification and Installation Guides
For this use case I sent an image of a common domestic boiler and asked it to identify all the components.
Prompt: Mark the main components of this boiler directly on the image.
Output
Not great. It correctly identifies some of the components, but many of the labels are off or point at random places. Probably not usable in an educational or professional setting for now, as it’s mostly hallucinating.
Conclusion
This was an interesting exercise to better understand what GPT-4o image generation can and cannot do. What stood out for me is its ability to add elements to an existing image, as in the second test case. It also became clear that the model struggles to understand the logic of the images.
From a content creation perspective, the possibility of quickly creating images that follow my brand guidelines looks promising, although the occasional hallucinations and not being able to expect a “correct” result can make the whole process slower.
If we consider how much ChatGPT has improved since its release, I think it’s only fair to expect image generation to improve as well, and that we’ll be using these tools more and more in our everyday lives. Many sectors have just been changed forever. Resilience and adaptability might very well be the most valuable skills to navigate this era.
🚀