My Thoughts on GPT-5 in the First Week
GPT-5 was released a week ago, and here are some of my thoughts (collected from my X/Twitter posts).
Published: 2025-08-14
In short, it is the best model I have ever used, especially GPT-5 Thinking and gpt-5-high. If you are a ChatGPT Plus subscriber, use GPT-5 Thinking as your default model.

Before release
Ideal: GPT-4 -> GPT-5?
Realistic: GPT-4 -> GPT-4 Turbo -> GPT-4o/4o-mini -> o1-preview/o1-mini -> o1 -> o1-Pro -> o3-mini -> GPT-4.5 -> o3/o4-mini -> GPT-4.1/GPT-4.1-mini/GPT-4.1-nano -> o3-Pro -> gpt-oss-120b/20b -> GPT-5
(By the way, OpenAI's valuation has jumped from $20B to $500B since November 2022.)
(So what's the roadmap from GPT-5 to GPT-6? 🤣 🤣 🤣 )
(And we're only two-thirds of the way through 2025!)
My first impressions of GPT-5
1. Its coding skills seem comparable to Claude Sonnet's; I didn't notice a big improvement (you still need detailed prompts and guidance).
2. For most general users who rely on the default model, the leap is huge. The jump from GPT-4o to GPT-5 is significant!
3. Voice mode feels more natural, and it finally supports GPTs.
4. The price and speed are impressive, and its greater accessibility is a major win.
In short, GPT-5 is a fantastic model with the right harness, and I believe we will see it fundamentally change products.
Trip itinerary comparison
I’m planning a day trip to Niagara Falls. I tested the same prompt in ChatGPT (GPT-5), Claude (Opus 4.1), Gemini (2.5 Pro Deep Research), and Perplexity (Lab).
In my tests, GPT-5 “Think longer” performed best, followed by GPT-5 Thinking. You can see the results below.

- GPT-5 with “Think longer”:
- GPT-5 Thinking:
- Perplexity (Lab):
- Claude (Opus 4.1):
- Gemini (2.5 Pro Deep Research):
Pro tip for using GPT-5 in ChatGPT: choose GPT-5 (the default), click “+,” and select “Think longer.” In my experience, it outperforms GPT-5 Thinking.

GPT-5 Version
OMG, only 7% of ChatGPT Plus users have tried reasoning models? Way lower than I expected!

This reinforced my sense that places outside the Bay Area are closer to the real world, where distribution matters and real people experience tech's impact.
--
Update to yesterday's post: “Think longer” and “GPT-5 Thinking” mean the same thing. (Confirmed by Dylan Hunn, MTS at OpenAI.)
1. GPT-5 Thinking = “Think longer” = gpt-5 with medium reasoning effort. (Counts toward the model usage limit.)
2. Normal GPT-5 with automatic thinking = gpt-5 with low reasoning effort. (Doesn't count toward the model usage limit.)
3. Normal GPT-5 = gpt-5 without thinking
Pro tip: use GPT-5 Thinking by default.
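The three-way mapping above can be sketched as API requests. This is a hypothetical illustration only: it assumes gpt-5 accepts a `reasoning_effort` parameter in the Chat Completions API (as OpenAI's o-series reasoning models do), and the `"minimal"` value for the no-thinking case is my guess, not something confirmed in the post.

```python
def build_request(mode: str) -> dict:
    """Return an illustrative Chat Completions payload for a ChatGPT mode.

    Assumes gpt-5 takes a `reasoning_effort` parameter; values below
    mirror the mapping in the post ("minimal" is an assumption).
    """
    payloads = {
        # GPT-5 Thinking = "Think longer": medium reasoning effort
        "thinking": {"model": "gpt-5", "reasoning_effort": "medium"},
        # Normal GPT-5 when it decides to think on its own: low effort
        "auto": {"model": "gpt-5", "reasoning_effort": "low"},
        # Normal GPT-5 without thinking (value assumed)
        "fast": {"model": "gpt-5", "reasoning_effort": "minimal"},
    }
    return payloads[mode]
```

Seen this way, the three ChatGPT options are one model at different effort settings, which is why the distinction is so hard to explain to non-technical users.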
--
This makes it harder to explain which model we use to non-technical people. We were promised an all-in-one model, yet it turned out like this. It reminds me of when I first heard about “wireless charging.” 😂 😂 😂
Also, this is the power of default settings...

Prompting GPT-5 works differently
A friend asked me this afternoon why a previously reliable prompt performed poorly after switching to GPT-5.
I noticed the same issue with my 3D Avatar Icon Marker and IELTS Speaking Simulator. Performance can decline if you do not refine the prompt, and even when you select a previous model like GPT-4o, the results may differ from before.
Here are two ways I fix it:
1. Add a preamble that guides the model to present a clear, upfront plan. For example:
“Begin with a concise checklist (3–7 bullets) of what you will do; keep items conceptual, not implementation-level.”
2. After writing your prompt, run it through OpenAI’s prompt optimizer tool for GPT-5 to improve it based on best practices.
For more techniques, see OpenAI’s GPT-5 prompting guide.
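Fix #1 above can be sketched as a tiny helper that folds the preamble into an existing system prompt before sending the messages off. Only the preamble wording comes from my post; the helper name, structure, and the sample prompts are illustrative.

```python
# The checklist preamble from fix #1, quoted from the post.
PREAMBLE = (
    "Begin with a concise checklist (3-7 bullets) of what you will do; "
    "keep items conceptual, not implementation-level."
)

def with_preamble(system_prompt: str, user_prompt: str) -> list[dict]:
    """Build a Chat Completions-style messages list with the preamble
    prepended to the system prompt (helper is a hypothetical sketch)."""
    return [
        {"role": "system", "content": f"{PREAMBLE}\n\n{system_prompt}"},
        {"role": "user", "content": user_prompt},
    ]
```

The point is that the preamble lives in your prompt template, not in each request, so an existing GPT (like my IELTS Speaking Simulator) gets the fix in one place.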
P.S. Although these tools and documents help, I still hope future updates will notify builders in advance, or at least maintain the same defaults. 😂