Headroom

Codex usage limits and the 5-hour window

If you keep seeing the "you've reached your usage limit" message in Codex, you are not alone. The way OpenAI meters Codex on a ChatGPT subscription is different from how a typical API bill works, and the constraints can show up in places you do not expect.

Here is how the limits actually work, what each plan covers, and how to keep coding without immediately upgrading.

How the 5-hour rolling window works

Codex meters subscription usage on a 5-hour rolling window. Once you start sending messages, a session begins and continues for five hours. Inside that window, every prompt, file read, tool call, and model response counts toward your allowance. When you hit the cap, you are blocked from sending more messages until the window rolls over.
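To make the mechanics concrete, here is a minimal sketch of the window as described above. It is a toy model, not OpenAI's implementation: the `UsageWindow` class, the `cap` value, and the unit of "cost" are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

WINDOW_SECONDS = 5 * 60 * 60  # five hours, as described above


@dataclass
class UsageWindow:
    """Toy model of a 5-hour usage window. The cap is a made-up number."""
    cap: int                            # allowance per window, arbitrary units
    started_at: Optional[float] = None  # time of the first message in the window
    used: int = 0

    def try_spend(self, cost: int, now: float) -> bool:
        """Record `cost` units at time `now`; return False if blocked."""
        # The window opens on the first message and resets after five hours.
        if self.started_at is None or now - self.started_at >= WINDOW_SECONDS:
            self.started_at = now
            self.used = 0
        if self.used + cost > self.cap:
            return False  # blocked until the window rolls over
        self.used += cost
        return True


window = UsageWindow(cap=100)
print(window.try_spend(60, now=0))         # True: first message opens the window
print(window.try_spend(60, now=3600))      # False: would exceed the cap
print(window.try_spend(60, now=5 * 3600))  # True: the window has rolled over
```

The point of the sketch is that everything sent inside the same five hours draws from one shared allowance, so a single expensive turn early on leaves less room for the rest of the session.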

There is also a separate weekly cap that prevents very heavy users from running constantly. Most people hit the 5-hour limit first.

Since April 2026, Plus, Pro, and Business plans meter Codex by tokens rather than message count, so heavier requests now drain the window faster. If you run out before it resets, OpenAI lets you buy additional credits or switch to a smaller model to keep going. Both options help, but one costs money and the other costs capability.
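A quick back-of-the-envelope calculation shows why token metering changes the picture. The budget and per-request sizes below are invented for illustration and are not real plan limits.

```python
def requests_until_cap(token_budget: int, tokens_per_request: int) -> int:
    """How many equally sized requests fit in one window's token budget."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return token_budget // tokens_per_request


BUDGET = 1_000_000  # hypothetical window allowance, not a published figure

print(requests_until_cap(BUDGET, 2_000))   # short, narrow prompts: prints 500
print(requests_until_cap(BUDGET, 50_000))  # context-heavy turns: prints 20
```

Under message-count metering both workflows would have been limited to the same number of turns. Under token metering, the context-heavy workflow runs out 25 times sooner in this example.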

What each plan tier covers

  • Free and Go include only limited Codex access, enough for the occasional prompt but not for sustained coding work.
  • Plus is the entry tier for real Codex work and covers light to moderate usage: short coding sessions or a few hours of focused work per day. Heavy debugging or codebase exploration will hit the limit quickly.
  • Pro raises your allowance substantially and comes in x5 and x20 usage levels, so you can match it to how much you code. It is comfortable for full-time individual developers who use Codex for most of their workday.
  • Business and Enterprise are aimed at teams running long, context-heavy sessions across multiple projects, with the highest allowances and pooled seats.

The right tier depends less on your seniority and more on your workflow style. Someone who runs many short, narrow prompts uses far less than someone who does long debugging sessions with lots of file reads.

Why you hit limits sooner than expected

Most usage gets eaten by content you did not write yourself: tool output, file content Codex reads on its own, repeated conversation history, and large search results. A single failing build or a few large file reads can use as much of your budget as a long thoughtful prompt.
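If you want a rough feel for where a session's budget goes, you can tally estimated tokens by source. The sketch below assumes a crude heuristic of about four characters per token, which is only an approximation, and the event categories and sample sizes are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def usage_share(events: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return each source's fraction of the total estimated tokens."""
    totals: Dict[str, int] = defaultdict(int)
    for source, text in events:
        totals[source] += estimate_tokens(text)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {source: count / grand_total for source, count in totals.items()}


# Hypothetical session: one short prompt plus content Codex pulled in itself.
events = [
    ("user_prompt", "Fix the failing test in parser.py"),
    ("file_read", "x" * 40_000),    # a few large source files
    ("tool_output", "E" * 60_000),  # a failing build log
    ("history", "h" * 20_000),      # earlier turns resent as context
]

for source, share in sorted(usage_share(events).items(), key=lambda kv: -kv[1]):
    print(f"{source:12s} {share:6.1%}")
```

In this invented session the prompt you typed accounts for well under one percent of the total, while the build log and file reads dominate.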

For a fuller breakdown of what burns usage fastest, read our Codex usage guide. If you want to understand why the bill (rather than the limit) feels high, our "Why is Codex so expensive?" page covers that angle.

Stretch your plan instead of upgrading

The fastest way to avoid the next limit message is to reduce how much context Codex sends per turn. Headroom intercepts your prompts locally, compresses repetitive logs and boilerplate, and forwards a leaner version to the model. Your workflow stays the same, but with roughly 50% fewer tokens per turn the same allowance covers about twice as many turns, so your plan effectively lasts twice as long. There is no account migration and no measurable quality loss.
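To illustrate the general idea of compressing repetitive context, here is a deliberately simple sketch that collapses consecutive duplicate log lines, plus the arithmetic behind "half the tokens, twice the runway". This is not Headroom's actual algorithm; `collapse_repeats` and `window_multiplier` are hypothetical helpers written for this example.

```python
from itertools import groupby


def collapse_repeats(log: str) -> str:
    """Replace runs of identical consecutive lines with one annotated line."""
    out = []
    for line, group in groupby(log.splitlines()):
        count = sum(1 for _ in group)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)


def window_multiplier(reduction: float) -> float:
    """How much further a fixed token budget stretches after a reduction.

    `reduction` is the fraction of tokens removed, e.g. 0.5 for 50%.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1 / (1 - reduction)


log = "\n".join(["warning: unused import"] * 4 + ["error: build failed"])
print(collapse_repeats(log))
print(window_multiplier(0.5))  # prints 2.0
```

The multiplier is nonlinear: a 50% reduction doubles how far the allowance goes, while a 25% reduction stretches it by only about a third.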

For the full set of tools and tactics, see our Codex cost guide.

Make your current plan last longer

Install Headroom, run one of your typical Codex sessions, and compare how far you get before the 5-hour limit kicks in.