Using A.I. for Research, Reports, Plans, etc.

What an A.I. creates in comparison to paying a consulting firm

Mar 19, 2025

Note: This is written in March 2025. Every comment here about the capabilities of each A.I. will be obsolete in 3 - 6 months, because each A.I. keeps making gigantic improvements. Today’s strongest A.I. is next month’s weakest.

If I were Ascend Analytics[1] or any similar company, I would be trying to figure out what I need to change to stay in business, because the reports below would take a good analyst, assisted by an A.I., 2 hours tops to create. The result is as good, arguably better.

I’m using the Colorado Clean by 2040 Report because it’s something I understand well enough to write a good prompt for and to review the generated reports for errors. This is not as simple as “write me a Colorado clean energy report.” The A.I. is a co-worker and your guidance is key.

Six months ago what you will see here would not have been possible. Back then the A.I. was an LLM, which was basically a very powerful means of stringing the appropriate sentences together. That was useful but limited.

The A.I. programs now include reasoning[2], which is a giant step forward. In many ways what the A.I. produces is superior to what a person can do because it has a much wider view of the problem space. No individual, or even team of individuals, can have the breadth and depth of knowledge an A.I. has.

That does not mean it’s perfect. Like a human, it can make errors of omission. And like a human, it can get things wrong. It’s rare to see hallucinations in a request like this. But it can make dumb mistakes, such as listing the retail cost of solar cells for a solar farm (too high) or listing the cost in China (about ½ the cost in the U.S.).

Fundamentally you are getting the accuracy you would get from a professional research company. But you’re getting it for the cost of 2 hours’ time. That means you can perform in-depth research on pretty much any topic in-house, quickly.

A.I. Generated Colorado Energy Report

I added two criteria that Ascend did not, and they have a major impact on the generated reports. First, I told it to create 1 model that was focused on price and reliability and did not need to reduce CO2. I added this not because I think it’s an avenue we should follow, but to give us the cost of this approach as a benchmark.

The second criterion was to not use any technology that is not presently commercially available.[3] This was a general prohibition but the result was that no models use SMRs or H2 and only some considered Geothermal.

Whether Geothermal counts as commercial today is arguable. It’s a judgement call, and I think both including and excluding Geothermal are reasonable decisions. In this case you can ask why Geothermal was or was not included, re-run the prompt explicitly telling it to include or exclude Geothermal, or both. This is a tool under your guidance.

Click here to read the prompt.

ChatGPT

At present ChatGPT is the weakest of all the A.I.s. Basically it’s 2 generations behind the others and the level of reasoning capability in it is minimal. Click here for the full report. Its models are:

Coal phase out replaced primarily with gas, and some wind/solar.
Renewable dominated with wind, solar, & batteries replacing coal.
Nuclear, wind, & solar. This estimates Nuclear will require 50-100 miles of new transmission lines - which is not likely.[4]
Regional connections for backup, wind, & solar.
Distributed generation & demand response.

Gemini

Gemini performs the deepest research. Expect it to give you what an experienced researcher who follows every avenue will give you.[5] Its models are:

Replace coal with gas.
Wind, solar, & batteries with gas backup.
Wind, solar, & batteries - no gas.
Nuclear, wind, & solar. This uses SMRs even though SMRs are not presently available.

Grok

Grok is the most aggressive A.I. Expect it to give you solutions that push the limits of what is possible.[6] Its models are:

Replace coal with gas. No additional wind or solar.
Wind & solar with gas backup.
Regional connections, add lots of wind, some solar.
Lots of solar, some wind & batteries.
Nuclear, wind, & solar. This uses SMRs even though SMRs are not presently available.

Perplexity

Perplexity delivers what I view as the lightest solution. It’s great for writing a blog entry but not as good for deep research like this question. I tend to use Perplexity as I work on improving my prompts because it needs the most guidance and so it’s great for improving a prompt. Its models are:

Replace coal with gas, wind, & solar.
Wind & solar with gas backup.
Everything bagel - wind, solar, geothermal, H2, & batteries.
Distributed energy resources.
Added transmission lines, wind, & solar.

Qwen

Qwen is a middle of the road A.I. It does great if given clear guidance. But when told to make fundamental decisions itself, it is not great. Its models are:

Coal phase out replaced primarily with gas, and some wind/solar.
Renewable dominated with wind, solar, & batteries replacing coal.
Nuclear, wind, & solar. Note that even though the prompt said no SMRs, this incorporates SMRs. As I said, Qwen is weak when given general guidance.
Regional connections for backup, wind, & solar.
It has one paragraph lumping in H2, Geothermal, & demand-side management as three more models, with no details. As with SMRs, H2 is definitely not commercial.[7]

The Report Contents

If you go read them you will find a report similar to the one from Ascend. They walk through the present state of our grid and what we need to pay attention to as we build it for the needs of tomorrow.

For each model, the report describes the model and why it was selected. It lists what needs to be built & when, the CAPEX & OPEX, and the reliability, then finishes up with the advantages & disadvantages.

You will find citations in each for the CAPEX & OPEX costs. Just as any valid report will show, the costs are in a range because these are not costs that can be predicted to the dollar.

And each concludes with their recommendation and why they recommend that model.

Follow-up questions

I asked Grok why it specified SMRs. The answer was long but the key argument was:[8]

I suggested SMRs for their potential to solve Colorado’s reliability and emissions challenges with a proven design (NuScale), assuming commercial readiness by the early part of the 2025–2040 period. However, I acknowledge they aren’t shipping today, and my interpretation stretched the guideline slightly based on NRC approval and industry momentum.

I asked Qwen why it included H2. It apologized and re-wrote that part:

In my initial draft, I considered hydrogen as part of a hybrid system because it has been widely discussed in academic and policy circles as a potential solution for decarbonizing sectors that are difficult to electrify (e.g., heavy industry, aviation, and long-duration energy storage). However, upon reflection, I realize that including hydrogen in this analysis is premature, as electrolyzer systems for producing green hydrogen are still in pilot or early commercial stages, and the infrastructure for transporting, storing, and utilizing hydrogen at scale does not yet exist.

Conclusion

The first thing to take away from this is that the prompt is everything. I could have explicitly said don’t consider H2 but I was curious to see what the A.I.s did with the more general limitation. I also could have said “if you select nuclear, use the APR-1400 plan” but again, I was curious to see what it did with the “must be commercial.”

These A.I.s have opinions. Some strongly held (Grok with SMRs), some weakly held (Qwen with H2). They have biases. For the cheapest model, ChatGPT had significant wind & solar while Grok was pure gas.

If I were a state employee tasked with generating the state energy roadmap this way, I would save off the above Grok report to pass on. But I would then revise the prompt further, getting more specific.

I would tell it explicitly to create the following models, based on what was created above:[9]

Replace coal with gas.
Primarily wind & solar, use batteries & transmission lines for backup, gas for 5%.
Just renewables & batteries, no gas. (Just to prove how impossibly expensive this is.)
Assume geothermal is commercially available, build an optimal geothermal centric model.
Use APR-1400 nuclear plants for baseload, build an optimal model around this.
Microgrids with the grid as backup. Detail out how this will work.
Provide 2 or more additional models.
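As a rough illustration of how the revised prompt above could be assembled, here is a minimal Python sketch. The function name `build_prompt`, the topic string, and the exact wording of the prompt are hypothetical; this is not the actual prompt used for the reports. Only the list of models comes from the text above.

```python
# Hypothetical helper for assembling a focused prompt.
# The wording is illustrative only, not the author's actual prompt.

REQUIRED_MODELS = [
    "Replace coal with gas.",
    "Primarily wind & solar, use batteries & transmission lines for backup, gas for 5%.",
    "Just renewables & batteries, no gas.",
    "Assume geothermal is commercially available, build an optimal geothermal centric model.",
    "Use APR-1400 nuclear plants for baseload, build an optimal model around this.",
    "Microgrids with the grid as backup. Detail out how this will work.",
]

# Assumed constraints, paraphrasing the limitations discussed in the post.
CONSTRAINTS = [
    "Use only technology that is commercially available today.",
    "Do not use H2 or SMRs.",
]


def build_prompt(topic, constraints, models, extra_models=2):
    """Return a plain-text prompt listing constraints and numbered models."""
    lines = [f"Create a report on {topic}.", "", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines += ["", "Create the following models:"]
    lines += [f"{i}. {m}" for i, m in enumerate(models, start=1)]
    # Leave room for the A.I. to propose its own models as well.
    lines.append(
        f"{len(models) + 1}. Provide {extra_models} or more additional models."
    )
    return "\n".join(lines)


prompt = build_prompt(
    "Colorado's electric grid through 2040", CONSTRAINTS, REQUIRED_MODELS
)
print(prompt)
```

Keeping the models and constraints in lists makes it easy to go wide first (few constraints, no required models) and then tighten the prompt one item at a time.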

I would then run this revised prompt through Grok & Gemini, selecting the one that I think worked it through the best. And save off this second report to pass on with the first Grok report. Here it is run through Grok.

Basically, figuring out prompts is go wide, then focus where you need to. At times I’ll even add possibly unnecessary items to the focused prompt, such as explicitly saying no H2. Fundamentally you’re in a negotiation with the A.I. They’re your co-worker.

And keep in mind the costs in these reports are estimates. Same with a report from a company like Ascend. I’ve been in large arguments on Reddit discussing the cost to build an APR-1400 and arguments about the cost of solar & batteries in 5 years. Those costs are fundamental to the approach we take and yet we don’t know the answers, just a probable range. Coming from an A.I. makes them no more certain.

As to whether this is better than hiring someone like Ascend,[10] I think so. First off, you can now get deeply researched studies quickly, answering any question your curiosity takes you to. Second, you can play around with criteria, limitations, etc. to see what that then gives you. Third, as you get good at this you’ll learn how to discuss with the A.I. what different criteria lead to, why they do, and what should be investigated. Fourth, I think it will deliver better reports: research that pulls from a broader and more detailed knowledge of the pertinent information.

Keep in mind that this provides information. It is still on you to think it all through to make a decision. For example, in this case, a fundamental question is how much it costs and how long it takes to build a nuclear plant. No one knows the answer for sure. A range of 5 - 10 years and $5B - $15B means there’s a giant judgement call here. The A.I. is no better at this than an individual is.
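To make the size of that judgement call concrete, here is a small Python sketch that converts the $5B - $15B range into a cost per kW. The 1,400 MW capacity is my assumption (roughly the nameplate size of an APR-1400), not a figure from the reports, and `cost_per_kw` is a hypothetical helper.

```python
# Sketch: how wide is a $5B - $15B range in per-kW terms?
# CAPACITY_MW is an assumption (roughly one APR-1400), not from the reports.

CAPACITY_MW = 1400


def cost_per_kw(capex_usd, capacity_mw):
    """Capital cost in dollars per kW of capacity."""
    return capex_usd / (capacity_mw * 1000)


low = cost_per_kw(5e9, CAPACITY_MW)
high = cost_per_kw(15e9, CAPACITY_MW)

print(f"Low estimate:  ${low:,.0f}/kW")
print(f"High estimate: ${high:,.0f}/kW")
print(f"Spread: {high / low:.1f}x")
```

Whatever capacity you assume, the high end is three times the low end, and that ratio carries straight through to any cost comparison against gas, wind, or solar.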

This is a game changer. And we’re all at step 1 in learning how best to use A.I.[11]

[1] The author of the Colorado Clean by 2040 report.

[2] The intelligence explosion

[3] The CEO is welcome to run this without those prohibitions but I wanted to deliver something realistic.

[4] The smart approach is to build the nuclear plants next to the coal plants and move the existing transmission lines over to the new plant.

[5] If you want to overload people with exhaustive details, Gemini is for you.

[6] It tends to be my favorite.

[7] I doubt it ever will be.

[8] As I said, the most aggressive A.I.

[9] Be careful not to favor your preferences. I could easily have written the prompt so that nuclear was the suggested model, and done so in a way that was not obvious. Don’t do that.

[10] For reports based on existing knowledge. For research into something new, A.I. can be of gigantic help there too, but in a different way.

[11] If anyone does this for state water policy, please share with me both your prompts and the reports you generated. I’m very curious to see what it comes up with.

John McKiernan

Mar 25

And as you use the AI options, it may be worth noting what other costs are a part of the package. This morning, one of the privacy enthusiasts on Daily Kos pointed to

Thomas Claburn, at The Register (UK), warns us about the inescapable devouring/dissolution: “You know that generative AI browser assistant extension is probably beaming everything to the cloud, right?”

https://www.theregister.com/2025/03/25/generative_ai_browser_extensions_privacy/

He shared:

The researchers' main findings are:

ChatGPT for Google and Wiseone store context across page navigation.

Two browser assistants, namely Harpa and Copilot, collect the full DOMs of user-visited pages. Others collect varying levels of private data from webpages.

Harpa and MaxAI share page locations and referrers with third-party tracking services.

Merlin was found to collect the contents of web forms, such as social security numbers entered into financial websites.

"Overall, we observed Perplexity to be the most privacy-friendly while extensions such as Harpa, MaxAI, and Merlin were amongst the least," the researchers said.


Liberal and Loving It
