OpenAI gpt-5-codex

Today I experienced another “Jagged AGI Moment” with gpt-5-codex.

Published: 2025-09-17

I built a Literature Screening Assistant with the Codex CLI in a single shot, and used the low-cost gpt-oss-120b to screen papers one by one against my inclusion/exclusion criteria. The results look promising!
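For readers wondering what "screening one by one" looks like in practice, here is a minimal sketch. It is not the code Codex generated for me; the names `Paper`, `build_prompt`, `parse_decision`, `screen`, and `fake_model` are all hypothetical, and the model call is a stub so the example runs offline. In a real tool, `ask_model` would wrap a call to whichever model you use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Paper:
    title: str
    abstract: str


def build_prompt(paper: Paper, include: list[str], exclude: list[str]) -> str:
    """Build one prompt per paper: criteria first, the abstract last."""
    lines = [
        "You are screening papers for a literature review.",
        "Answer with INCLUDE or EXCLUDE as the first word, then one sentence of reasoning.",
        "Inclusion criteria:",
        *[f"- {c}" for c in include],
        "Exclusion criteria:",
        *[f"- {c}" for c in exclude],
        f"Title: {paper.title}",
        f"Abstract: {paper.abstract}",
    ]
    return "\n".join(lines)


def parse_decision(reply: str) -> str:
    """Read the verdict from the first word; anything unexpected becomes UNSURE."""
    words = reply.strip().split()
    verdict = words[0].strip(".,:;*").upper() if words else ""
    return verdict if verdict in ("INCLUDE", "EXCLUDE") else "UNSURE"


def screen(
    papers: list[Paper],
    include: list[str],
    exclude: list[str],
    ask_model: Callable[[str], str],
) -> list[tuple[str, str]]:
    """Screen papers one at a time and return (title, decision) pairs."""
    return [
        (p.title, parse_decision(ask_model(build_prompt(p, include, exclude))))
        for p in papers
    ]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption: keyword match on the abstract)."""
    abstract = prompt.split("Abstract:", 1)[1].lower()
    if "classroom" in abstract:
        return "INCLUDE: reports an empirical classroom study."
    return "EXCLUDE: no empirical study described."


papers = [
    Paper("Paper A", "We ran a classroom experiment with 120 students."),
    Paper("Paper B", "An opinion piece on education policy."),
]
results = screen(
    papers,
    include=["Empirical study with student participants"],
    exclude=["Opinion pieces and editorials"],
    ask_model=fake_model,
)
print(results)
```

Keeping an UNSURE bucket is a deliberate choice in this sketch: replies that do not start with a clear verdict go to manual review instead of being silently counted as included or excluded.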

Background: I am running a traditional literature review. I collected over 1,000 papers and needed to screen them. I tried Rayyan first and used Comet to manually screen 50 papers to enable its AI screening feature, but it did not help. I also tried ASReview, but it looked too complex.

Last night I talked with a friend who suggested Vibe Coding a tool. In the shower I mapped out what I needed. This morning, I wrote my PRD, then fed it to Codex. Surprisingly, in 30 minutes, without starting a new conversation, I had a usable prototype.

I tested it this afternoon and deployed it online. It helped me narrow 1,426 papers down to 128. The total cost, including my testing, was under $1! This was also the first time I enjoyed a smooth coding experience without too many arguments with the AI.

What excites me is not the tool itself but the future it signals. If no tool matches my needs, I can build one easily. That feels like exponential capability! Bro, it’s just September 2025! Pro tip: try gpt-5-codex! Try the tool here:

https://literature-screening.vercel.app

While I was on my way to support my friend’s half marathon, Codex helped me integrate the free Grok-4-fast model remotely. You can use this screening tool for free right now!

Thanks to @grok 4 fast, I could run multiple screening rounds and cross-check the results for free~!
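One simple way to compare several rounds is a majority vote per paper, flagging any paper where the rounds disagree. This is only a sketch of the idea, not how my tool does it; the function name `consolidate` and the paper IDs are made up for illustration.

```python
from collections import Counter


def consolidate(rounds: list[dict[str, str]]) -> dict[str, tuple[str, bool]]:
    """Combine several screening rounds.

    Each round maps paper ID -> decision. The result maps
    paper ID -> (majority decision, whether all rounds agreed).
    """
    combined = {}
    for paper_id in rounds[0]:
        votes = Counter(r[paper_id] for r in rounds)
        decision, count = votes.most_common(1)[0]
        combined[paper_id] = (decision, count == len(rounds))
    return combined


# Three hypothetical rounds over the same two papers.
rounds = [
    {"p1": "INCLUDE", "p2": "INCLUDE"},
    {"p1": "INCLUDE", "p2": "EXCLUDE"},
    {"p1": "INCLUDE", "p2": "EXCLUDE"},
]
summary = consolidate(rounds)
needs_review = [pid for pid, (_, unanimous) in summary.items() if not unanimous]
print(summary)
print(needs_review)
```

An odd number of rounds avoids ties between two decisions, and the papers in `needs_review` are the ones worth checking by hand.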

I showed Codex-like CLI tools to 8 of my friends this weekend. Over half still didn't get it. Some said it wasn't until they saw how I used it that they realized it was an army of intelligence (not for coding, but for finishing assignments in a few shots).

I am now reading several of the papers that passed screening in my tool. Although AI could help me finish this task in one shot, I’d like to read and annotate them myself and then compare my notes with the AI’s.

Claude Sonnet 4.5 failed for me, at least in a few tries. I ran the same prompt in Claude Code and in Cursor with Sonnet 4.5, and neither worked. In contrast, gpt-5-codex-medium produced a usable prototype in just a few prompts.