Headroom

How to install Headroom for Claude Code

Headroom cuts Claude Code token usage by around 50%, so the plan you already pay for lasts about twice as long. There are two ways to get it running: install the open-source Headroom CLI yourself, or install the Headroom app and let it run in the background. Both use the same compression engine — the app is built directly on the open-source CLI, with the maintainer's endorsement.

This guide walks through both paths, shows what running the open-source CLI asks of you day to day, and helps you pick the right one.

The two ways to run Headroom

The open-source Headroom CLI (chopratejas/headroom) is a powerful context-compression layer you install and operate from the terminal. The Headroom app wraps that same engine in a one-click installer and keeps it running for you. Same savings — a very different amount of setup.

Installing the open-source Headroom CLI

If you are comfortable on the command line, the CLI is free and flexible. On a clean machine, a typical setup for Claude Code looks like this:

  1. Install the package into a Python 3.10+ environment: pip install "headroom-ai[all]" (Node and Docker images are also available).
  2. Choose how it runs with Claude Code: wrap Claude Code with headroom wrap claude, start the drop-in proxy with headroom proxy --port 8787, or register it as an MCP server with headroom mcp install.
  3. Keep that process alive for the whole session, and repeat the wrap or proxy step every time you open a new terminal.
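
Step 3 depends on the proxy staying up, which is easy to lose track of. As a minimal sketch, assuming the proxy from step 2 accepts TCP connections on 127.0.0.1 at port 8787 (the port comes from the command above; the loopback address and the helper name is_listening are assumptions for illustration, not part of Headroom), a short Python check can tell you whether anything is listening there. It only shows that the port is open, not that the listener is Headroom.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8787, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Assumption: the proxy was started with --port 8787 on this machine.
    This is a hypothetical helper, not a Headroom command or API.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "something is listening" if is_listening() else "nothing is listening"
    print(f"port 8787: {state}")
```

Running this before you start a Claude Code session is a cheap way to notice that the terminal holding the proxy was closed.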

On a well-behaved machine that is genuinely all it takes, and plenty of developers are happy here.

What you have to do yourself

The commands above are the easy part. Running the CLI yourself means a handful of standing chores that are entirely on you:

  • Remember to start it. Close the terminal and the proxy stops — your next Claude Code session runs at full token cost until you start it back up.
  • Remember to update it. There is no auto-update. You run pip install -U headroom-ai yourself to pick up new compression improvements and fixes, and it is easy to quietly drift a few versions behind.
  • Keep your Python setup happy. Headroom needs Python 3.10+, so it can clash with another project's Python version or virtualenv, and a "headroom: command not found" error after install is a common PATH snag. What works in one shell may not work in the next.
  • Check that it is actually working. On its own, the CLI has no dashboard, so it is hard to tell at a glance whether it is running or how many tokens you have saved.
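
Several of these chores can be checked from a script. The sketch below is one hedged way to do that in Python, using only the standard library: it compares the interpreter against the 3.10 minimum, looks for a headroom command on the PATH, and reports which version of the headroom-ai package is installed in the current environment. The function names preflight and installed_version are hypothetical, and the command and package names are taken from the install steps earlier in this guide. It does not contact any server, so it cannot tell you whether a newer release exists.

```python
import shutil
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum version stated in this guide


def installed_version(dist_name="headroom-ai"):
    """Return the installed version string, or None if the package
    is not installed in the environment running this script."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def preflight(version_info=sys.version_info, which=shutil.which, command="headroom"):
    """Return a list of human-readable problems; an empty list means none found.

    version_info and which are parameters so the check can be exercised
    without changing the real interpreter or PATH.
    """
    problems = []
    major, minor = version_info[0], version_info[1]
    if (major, minor) < MIN_PYTHON:
        problems.append(f"Python {major}.{minor} is older than 3.10")
    if which(command) is None:
        problems.append(f"'{command}' was not found on PATH in this shell")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("problem:", problem)
    version = installed_version()
    print("headroom-ai:", version or "not installed in this environment")
```

Because PATH and the active virtualenv differ between shells, the result applies only to the shell and environment you run it from.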

On locked-down or corporate networks it gets harder still: SSL inspection can break the install, and the model assets it downloads may be blocked. Most individual developers never hit that, though; the standing chores above are what you actually feel. None of this is a knock on the project. It is an excellent open-source tool, and this is simply the cost of running the infrastructure yourself.

Installing the Headroom app

The app removes every step above. There is no Python, no pip, and no proxy to start:

  1. Download the macOS app.
  2. Drag it into your Applications folder.
  3. Open it once — it sets itself up and lives in your menu bar.

That is the whole installation. No terminal, no dependency resolution, nothing to keep alive.

It runs in the background, continuously

This is the real difference. The CLI compresses only while it is running, which in practice means only when you remember to start it. The Headroom app runs continuously as a menu bar app: once it is installed, every Claude Code session is optimized automatically, in the background, without you starting a command. Restart your Mac and it is still there. You get the ~50% savings on the first session and every session after, with nothing to re-enable.

What you get with the app

  • It updates itself. The app pulls new versions in the background — including updates to the Headroom engine it runs — so you are always on the latest without typing a single command.
  • Two tools, one installer. The app bundles Headroom and RTK together, so noisy terminal output gets trimmed alongside prompt bloat.
  • A savings dashboard. See tokens saved over time instead of guessing.
  • Local and private. Optimization runs on your machine, so your prompts do not need to be shipped to a Headroom server.
  • A free tier. Install it, run a task you already do, and compare the token count before and after.

The net effect is the same as the CLI — around 50% fewer tokens, roughly 2x the usage on the Claude plan you already pay for. Our benchmarks show where the savings come from.
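
The link between "50% fewer tokens" and "2x the usage" is simple arithmetic: if a task costs a fraction (1 - s) of what it used to, the same token budget covers 1 / (1 - s) times as many tasks. The Python sketch below works that through for a before-and-after comparison. The token counts are made-up illustrative numbers, not benchmark results, and the function names are hypothetical.

```python
def savings_fraction(tokens_before: float, tokens_after: float) -> float:
    """Fraction of tokens saved, e.g. 0.5 for a 50% reduction."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    return 1 - tokens_after / tokens_before


def usage_multiplier(fraction_saved: float) -> float:
    """How many times further the same token budget stretches."""
    if not 0 <= fraction_saved < 1:
        raise ValueError("fraction_saved must be in [0, 1)")
    return 1 / (1 - fraction_saved)


# Hypothetical counts for the same task, without and with compression.
saved = savings_fraction(tokens_before=120_000, tokens_after=60_000)
print(f"{saved:.0%} saved, {usage_multiplier(saved):.1f}x usage")  # prints "50% saved, 2.0x usage"
```

The multiplier is sensitive to the savings rate: a 40% reduction gives about 1.67x, not 2x, so it is worth measuring on a task you actually run.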

Which should you choose?

Choose the open-source CLI if you want maximum control, run on Linux or Windows, or want to script Headroom into your own pipeline and do not mind the setup. Choose the Headroom app if you are on macOS and want zero-setup, always-on savings you never have to think about. Both are honest paths to the same result; the app simply removes the parts people get stuck on.

For the wider set of tactics, see our Claude Code cost guide. The FAQ covers privacy, the prompt cache, and how the app relates to the open-source CLI.

Skip the setup

Download the Headroom app, open it once, and let it cut your Claude Code tokens in the background from the very next session.