How the 5-hour rolling window works
Codex meters subscription usage on a 5-hour rolling window. The window opens when you send your first message and runs for five hours. Inside that window, every prompt, file read, tool call, and model response counts toward your allowance. When you hit the cap, you are blocked from sending more messages until the window rolls over.
There is also a separate weekly cap that prevents very heavy users from running constantly. Most people hit the 5-hour limit first.
Since April 2026, Plus, Pro, and Business plans meter Codex by tokens rather than message count, so heavier requests now drain the window faster. If you run out before the window resets, OpenAI lets you buy additional credits or switch to a smaller model to keep going. Both work, but the first costs money and the second costs capability.
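To make the mechanics concrete, here is a minimal sketch of a token-metered 5-hour window as described above. It is an illustrative model only, not OpenAI's actual accounting: the `UsageWindow` class, the `try_spend` method, and the token cap are all hypothetical, and the weekly cap is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(hours=5)  # window length taken from the description above


@dataclass
class UsageWindow:
    token_cap: int                       # hypothetical allowance, not a real plan figure
    started_at: Optional[datetime] = None
    used: int = 0

    def try_spend(self, tokens: int, now: datetime) -> bool:
        """Record a request's tokens; return False if it would exceed the cap."""
        # A window opens on the first message, or once the previous one has expired.
        if self.started_at is None or now - self.started_at >= WINDOW:
            self.started_at = now
            self.used = 0
        if self.used + tokens > self.token_cap:
            return False                 # blocked until the window rolls over
        self.used += tokens
        return True


window = UsageWindow(token_cap=1000)
start = datetime(2026, 5, 1, 9, 0)
print(window.try_spend(600, start))                        # fits in the window
print(window.try_spend(300, start + timedelta(hours=1)))   # still fits
print(window.try_spend(200, start + timedelta(hours=2)))   # would exceed the cap
print(window.try_spend(200, start + timedelta(hours=5)))   # a new window has opened
```

Because the meter counts tokens rather than messages, one large request can use up the same share of the window as many small ones.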
What each plan tier covers
- Free and Go include only limited Codex access, enough for the occasional prompt but not for sustained coding work.
- Plus is the entry tier for real Codex work and covers light to moderate usage, such as short coding sessions or a few hours of focused work per day. Heavy debugging or codebase exploration will hit the limit quickly.
- Pro raises your allowance substantially and comes in x5 and x20 usage levels, so you can match it to how much you code. It is comfortable for full-time individual developers who use Codex most of their workday.
- Business and Enterprise are aimed at teams running long, context-heavy sessions across multiple projects, with the highest allowances and pooled seats.
The right tier depends less on your seniority and more on your workflow style. Someone who runs many short, narrow prompts uses far less than someone who does long debugging sessions with lots of file reads.
Why you hit limits sooner than expected
Most usage gets eaten by content you did not write yourself: tool output, file content Codex reads on its own, repeated conversation history, and large search results. A single failing build or a few large file reads can use as much of your budget as a long thoughtful prompt.
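A rough tally shows how lopsided a single turn can be. The sketch below assumes about four characters per token, which is a common rule of thumb rather than a real tokenizer, and the source labels (`prompt`, `tool_output`, `file_read`) are hypothetical names chosen for illustration.

```python
from collections import Counter


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


def usage_by_source(turn_items):
    """Sum estimated tokens per source for a list of (source, text) pairs."""
    totals = Counter()
    for source, text in turn_items:
        totals[source] += estimate_tokens(text)
    return dict(totals)


turn = [
    ("prompt", "x" * 400),          # a few sentences typed by hand
    ("tool_output", "y" * 40_000),  # a failing build log
    ("file_read", "z" * 8_000),     # one mid-sized source file
]
print(usage_by_source(turn))
```

With these made-up sizes, the handwritten prompt is a small fraction of the turn, and the build log dominates it.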
For a fuller breakdown of what burns usage fastest, read our Codex usage guide. If you want to understand why the bill (rather than the limit) feels high, our "Why is Codex so expensive?" page covers that angle.
Stretch your plan instead of upgrading
The fastest way to avoid the next limit message is to reduce how much context Codex sends per turn. Headroom intercepts your prompts locally, compresses repetitive logs and boilerplate, and forwards a leaner version to the model. The workflow stays the same, but with ~50% fewer tokens per turn your plan effectively lasts about twice as long. There is no account migration and no measurable quality loss.
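As an illustration of the general idea, and not Headroom's actual implementation, the sketch below collapses runs of identical log lines and works out how far a fixed token budget stretches for a given reduction. Both function names are hypothetical.

```python
from itertools import groupby


def collapse_repeats(log: str) -> str:
    """Replace each run of identical consecutive lines with one annotated line."""
    out = []
    for line, group in groupby(log.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)


def budget_multiplier(token_reduction: float) -> float:
    """How many times longer a fixed budget lasts if each turn shrinks by this fraction."""
    if not 0 <= token_reduction < 1:
        raise ValueError("token_reduction must be in [0, 1)")
    return 1 / (1 - token_reduction)


log = "start\nretrying...\nretrying...\nretrying...\ndone"
print(collapse_repeats(log))
print(budget_multiplier(0.5))
```

The second function is the arithmetic behind the "twice as long" figure: if each turn uses half the tokens, the same allowance covers twice as many turns.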
For the full set of tools and tactics, see our Codex cost guide.