How do Claude Code usage limits work?

Claude Code meters usage on a 5-hour rolling window that starts with your first message, with a separate weekly cap on top for sustained heavy use. When you hit the cap you are blocked until the window rolls over.

What counts toward the Claude Code usage limit?

Everything in a turn counts: your prompts, the files Claude Code reads, tool and terminal output, and the model's responses. Machine-generated context like logs and search results usually consumes far more of the window than your own prompts.

How do I stop hitting Claude Code limits without upgrading?

Reduce how much context Claude Code sends per turn. Headroom compresses repetitive logs and boilerplate locally before they reach the model, cutting token usage by about 50% so the same plan effectively lasts twice as long.

Claude Code Usage Limits: Hit the 5-Hour Limit?

How the 5-hour rolling window works

Claude Code measures usage on a 5-hour rolling window. The window opens with your first message and runs for five hours. Inside that window, every prompt, file read, tool call, and model response counts toward your allowance. When you hit the cap, you are blocked from sending more messages until the window rolls over.

There is also a separate weekly cap that prevents very heavy users from running constantly. Most people hit the 5-hour limit first.
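To make the window mechanics concrete, here is a minimal sketch of the behaviour described above. It is a toy model, not Anthropic's implementation: the `UsageWindow` class, the `cap_tokens` value, and the choice to count in tokens are all assumptions made for illustration, and the weekly cap is left out.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # window length stated in the text


class UsageWindow:
    """Toy model of a usage window that opens on the first message."""

    def __init__(self, cap_tokens):
        self.cap_tokens = cap_tokens  # hypothetical allowance, not a published figure
        self.started_at = None
        self.used = 0

    def try_send(self, now, tokens):
        """Return True if the message fits, False if the cap blocks it."""
        # Open a new window on the first message, or once the old one has expired.
        if self.started_at is None or now >= self.started_at + WINDOW:
            self.started_at = now
            self.used = 0
        if self.used + tokens > self.cap_tokens:
            return False
        self.used += tokens
        return True


window = UsageWindow(cap_tokens=1000)
start = datetime(2025, 1, 1, 9, 0)
print(window.try_send(start, 600))                       # True: fits in a fresh window
print(window.try_send(start + timedelta(hours=1), 600))  # False: would exceed the cap
print(window.try_send(start + timedelta(hours=5), 600))  # True: window has rolled over
```

The second call is refused because the window opened at 09:00 is still active; the third succeeds because five hours have passed and the counter resets.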

One Claude-specific wrinkle works in your favour: Anthropic's prompt cache bills a stable conversation prefix at a fraction of the normal input rate on repeat turns, so replayed history costs less than the raw token count suggests. Compression stacks on top of that — a smaller prefix is cheaper to carry whether or not it is cached.
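The arithmetic behind that paragraph can be sketched as follows. The function name, the token counts, and the `cached_rate` of 0.1 are illustrative assumptions; substitute the cache-read rate published for the model you use.

```python
def effective_input_tokens(prefix_tokens, new_tokens, cached_rate=0.1):
    """Input cost of one repeat turn, in full-price-token equivalents.

    cached_rate is an ASSUMED fraction of the normal input rate charged
    for the cached prefix; it is a parameter, not a quoted price.
    """
    return prefix_tokens * cached_rate + new_tokens


raw = 40_000 + 2_000                                # tokens actually sent on the turn
cached = effective_input_tokens(40_000, 2_000)      # billed equivalent with a cache hit
compressed = effective_input_tokens(20_000, 2_000)  # same turn with the prefix halved
uncached_small = effective_input_tokens(20_000, 2_000, cached_rate=1.0)

print(raw, cached, compressed, uncached_small)
```

Under these assumptions the cached turn costs far less than its raw token count, and halving the prefix lowers the cost both with a cache hit (`compressed`) and without one (`uncached_small`).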

What each plan tier covers

Plan	Best for	Relative capacity	You'll hit the limit when…
Pro	Light to moderate use — short sessions, a few focused hours a day.	Entry allowance.	Heavy debugging or codebase exploration.
Max x5	Full-time individual developers using Claude Code most of the workday.	~5× Pro.	Long, multi-project days.
Max x20	Long, context-heavy sessions across multiple projects.	Highest tier — also the biggest price jump.	Rarely, for most workflows.

The right tier depends less on your seniority and more on your workflow style. Someone who runs many short, narrow prompts uses far less than someone who does long debugging sessions with lots of file reads.

Why you hit limits sooner than expected

Most usage gets eaten by content you did not write yourself: tool output, file content Claude Code reads on its own, repeated conversation history, and large search results. A single failing build or a few large file reads can use as much of your budget as a long thoughtful prompt.
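A rough way to see this split is to tally estimated tokens by source. The sketch below assumes about four characters per token, which is a crude rule of thumb rather than a real tokenizer, and the `events` list is a hypothetical turn, not measured data.

```python
from collections import Counter

CHARS_PER_TOKEN = 4  # rough rule of thumb (assumption), not a real tokenizer


def estimate_tokens(text):
    """Crude token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def usage_by_source(events):
    """Sum estimated tokens per source for (source, text) pairs."""
    totals = Counter()
    for source, text in events:
        totals[source] += estimate_tokens(text)
    return totals


# Hypothetical turn: one short prompt, one file read, one failing build log.
events = [
    ("prompt", "Why does the build fail on CI but not locally?"),
    ("file_read", "x = 1\n" * 500),
    ("tool_output", "ERROR: step failed, retrying\n" * 400),
]

for source, tokens in usage_by_source(events).most_common():
    print(f"{source:12} {tokens}")
```

In this made-up turn the build log and the file read dwarf the prompt, which is the pattern described above.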

For a fuller breakdown of what burns usage fastest, read our Claude Code usage guide. If you want to understand why the bill (rather than the limit) feels high, our "Why is Claude Code so expensive?" page covers that angle.

Stretch your plan instead of upgrading

The fastest way to avoid the next limit message is to reduce how much context Claude Code sends per turn. Headroom intercepts your prompts locally, compresses repetitive logs and boilerplate, and forwards a leaner version to the model. You get the same workflow, about 50% fewer tokens, and a plan that effectively lasts twice as long. There is no account migration and no measurable quality loss.
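As an illustration of the general idea only, the sketch below collapses runs of identical log lines and works out how far a fixed allowance stretches for a given token reduction. It is not Headroom's actual method, which this page does not describe; `collapse_repeats` and `plan_multiplier` are hypothetical helpers.

```python
from itertools import groupby


def collapse_repeats(log_text):
    """Replace each run of identical consecutive lines with one line and a count."""
    out = []
    for line, run in groupby(log_text.splitlines()):
        count = sum(1 for _ in run)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    return "\n".join(out)


def plan_multiplier(token_reduction):
    """How many times longer a fixed allowance lasts after a fractional reduction."""
    return 1 / (1 - token_reduction)


# Hypothetical build log with one warning repeated 50 times.
log = "\n".join(
    ["Compiling..."] + ["warning: unused variable"] * 50 + ["error: build failed"]
)
short = collapse_repeats(log)
print(short)
print(f"{len(short) / len(log):.0%} of original size")
print(plan_multiplier(0.5))  # 2.0: half the tokens per turn, twice the turns
```

The multiplier shows where "lasts twice as long" comes from: sending half as many tokens per turn means a fixed allowance covers twice as many turns.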

For the full set of tools and tactics, see our Claude Code cost guide.

