Question 1

What is a token?

Accepted Answer

Large language models don't read characters or words. They read tokens: common chunks of text such as a short word, part of a word, or a piece of punctuation. One token is roughly 4 characters of English, though the ratio varies by model and language. Pricing and context limits are both measured in tokens, which is why counting them matters.
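As a rough illustration of the 4-characters-per-token rule of thumb, here is a minimal Python sketch. The function names and the fixed ratio are assumptions made for this example. It is not how this tool counts, and a real tokenizer will give different numbers, especially for non-English text.

```python
import math

# Assumption: ~4 characters per token, the rough English average quoted above.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count alone."""
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return max(1, math.ceil(len(text) / chars_per_token))


def fits_context(text: str, context_limit: int) -> bool:
    """Check the rough estimate against a context limit given in tokens."""
    return estimate_tokens(text) <= context_limit
```

A heuristic like this is only useful for quick sanity checks, such as deciding whether a document is anywhere near a context limit. For billing or hard limits, use a real tokenizer count.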

Question 2

Which models give an exact count?

Accepted Answer

OpenAI models are exact. GPT-4o, GPT-4.1 and the o-series use the o200k_base tokenizer; GPT-4, GPT-4 Turbo and GPT-3.5 use cl100k_base. Both are the real tokenizers OpenAI publishes, bundled here and run entirely in your browser, so the numbers match the API.
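The model-to-tokenizer mapping described above can be sketched as a simple prefix lookup. This is an illustrative Python sketch, not the tool's actual code: the table name, the function name and the exact prefixes (for example "o1", "o3", "o4" standing in for the o-series) are assumptions.

```python
from typing import Optional

# Model-name prefixes mapped to encodings, following the answer above.
# Order matters: "gpt-4o" and "gpt-4.1" must be checked before plain "gpt-4".
ENCODING_BY_PREFIX = [
    ("gpt-4o", "o200k_base"),
    ("gpt-4.1", "o200k_base"),
    ("o1", "o200k_base"),   # o-series prefixes are illustrative assumptions
    ("o3", "o200k_base"),
    ("o4", "o200k_base"),
    ("gpt-4", "cl100k_base"),   # also matches "gpt-4-turbo"
    ("gpt-3.5", "cl100k_base"),
]


def encoding_for(model: str) -> Optional[str]:
    """Return the encoding name for a model, or None if no exact tokenizer is known."""
    name = model.strip().lower()
    for prefix, encoding in ENCODING_BY_PREFIX:
        if name.startswith(prefix):
            return encoding
    return None
```

Returning None for unknown models lets the caller fall back to an estimate instead of silently using the wrong tokenizer. In Python, OpenAI's tiktoken library offers a comparable lookup via `tiktoken.encoding_for_model`.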

Question 3

Why are Claude and Gemini shown as an estimate?

Accepted Answer

Neither Anthropic nor Google publishes an offline tokenizer; the only way to get an exact count is to call their API with a key. To keep this tool fully private and client-side, Claude and Gemini counts are estimated by scaling the o200k_base count by a per-model factor (Claude tends to run ~15–20% denser than tiktoken, Gemini is close to o200k_base). Treat them as a close guide, not a billing-exact figure.
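The scaling approach can be sketched in a few lines of Python. The factor values below are assumptions for illustration: the tool's real per-model factors are not stated, and "15–20% denser" is read here as "15–20% more tokens for the same text". The names `FACTOR_RANGES` and `estimate_range` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical (low, high) multipliers applied to an o200k_base count.
# Assumption: "denser" means more tokens for the same text.
FACTOR_RANGES: Dict[str, Tuple[float, float]] = {
    "claude": (1.15, 1.20),  # ~15-20% above the o200k_base count
    "gemini": (1.00, 1.00),  # treated as equal to the o200k_base count
}


def estimate_range(o200k_count: int, family: str) -> Tuple[int, int]:
    """Scale an exact o200k_base count into a (low, high) estimated range."""
    low, high = FACTOR_RANGES[family]
    return round(o200k_count * low), round(o200k_count * high)
```

For a text that is 1,000 tokens under o200k_base, this sketch gives a Claude estimate of 1,150 to 1,200 tokens. Reporting a range rather than a single number makes it clearer that the figure is a guide rather than a billing-exact count.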

Question 4

Is my text private?

Accepted Answer

Yes. Every count happens locally in JavaScript. Nothing is uploaded, logged or stored — including for the estimated models, which never contact any API.

Question 5

What is the token visualization?

Accepted Answer

For the exact OpenAI models, each token is shown as a separate highlighted chip so you can see exactly how your text is split. Spaces are shown as · and line breaks as ↵ so token boundaries are easy to read.
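A minimal Python sketch of the chip rendering follows, assuming the text has already been split into token strings. The bracket notation stands in for the highlighted chips, and the token pieces in the usage example are hand-written for illustration, not real tokenizer output.

```python
from typing import List


def render_chips(tokens: List[str]) -> str:
    """Render each token as a bracketed chip with whitespace made visible."""
    chips = []
    for tok in tokens:
        # Show spaces as a middle dot and line breaks as a return arrow.
        visible = tok.replace(" ", "·").replace("\n", "↵")
        chips.append(f"[{visible}]")
    return "".join(chips)


# Hand-written token pieces, used only to demonstrate the output format.
print(render_chips(["Hello", " world", "!", "\n"]))  # prints [Hello][·world][!][↵]
```

Making whitespace visible matters because tokenizers commonly attach a leading space to the following word, which is otherwise hard to see at a chip boundary.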
