Token

By Paul Brock · Updated on 22-04-2026
TL;DR

A token is the basic unit in which an LLM processes text: typically a sub-word fragment of roughly 4 characters, or about 0.75 words, in English.

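LLMs operate on tokens rather than on words or characters. Tokens are sub-word units produced by a tokenizer; the word 'volatility' might split into ['vola', 'til', 'ity'], though the exact split depends on the tokenizer. The ratio of tokens to text varies by language: roughly 1.33 tokens per word in English, 1.5-2 tokens per word in Dutch, and about 1.3 tokens per character in Chinese. LLM usage is typically priced per million tokens, usually with separate rates for input and output.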
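To make the per-language ratios concrete, here is a minimal sketch that turns a word count into a rough token estimate. The function name `estimate_tokens` and the `TOKENS_PER_WORD` table are hypothetical helpers for illustration; the ratios are the approximate figures quoted above, not measurements from any specific tokenizer.

```python
# Rough tokens-per-word ratios quoted in the text above (assumptions, not
# measurements). Real values depend on the tokenizer and the text itself.
TOKENS_PER_WORD = {
    "english": (1.33, 1.33),  # single point estimate
    "dutch": (1.5, 2.0),      # quoted as a range
}


def estimate_tokens(word_count: int, language: str) -> tuple[int, int]:
    """Return a (low, high) token estimate for a text of `word_count` words."""
    low_ratio, high_ratio = TOKENS_PER_WORD[language.lower()]
    return round(word_count * low_ratio), round(word_count * high_ratio)


print(estimate_tokens(10_000, "english"))  # (13300, 13300)
print(estimate_tokens(10_000, "dutch"))    # (15000, 20000)
```

For exact counts, run the text through the tokenizer of the model you actually use; an estimate like this is only suitable for early budgeting.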

Example

A 10,000-word English report is about 13,300 tokens. With Claude Sonnet (at $3 per 1M input tokens), analysing it costs about $0.04 in input tokens. The same report in Dutch comes to about 17,500 tokens, or roughly $0.05.
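The arithmetic behind this example can be sketched as follows. The helper `input_cost_usd` is a hypothetical name; the token counts and the $3 per 1M input tokens price are the figures from the example, and output tokens are not included.

```python
def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of sending `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# Figures from the example above: $3 per 1M input tokens.
english_cost = input_cost_usd(13_300, 3.0)
dutch_cost = input_cost_usd(17_500, 3.0)

print(f"English: ${english_cost:.4f}")  # English: $0.0399
print(f"Dutch:   ${dutch_cost:.4f}")    # Dutch:   $0.0525
```

Rounded to the cent, these give the $0.04 and $0.05 quoted in the example.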

Frequently asked questions

Is Dutch more expensive than English?

Yes. Dutch typically needs about 25-40% more tokens than English for the same content, and cost scales with token count. For production workloads, factor this overhead into cost modelling.
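One way to factor this into a cost model is to scale an English baseline by the overhead range. The sketch below uses hypothetical helper names (`token_overhead`, `scaled_cost`) and an illustrative $100 baseline that does not come from the article; the token counts are those from the report example.

```python
def token_overhead(baseline_tokens: int, other_tokens: int) -> float:
    """Fractional increase of `other_tokens` over `baseline_tokens`."""
    return other_tokens / baseline_tokens - 1


def scaled_cost(baseline_cost: float, overhead: float) -> float:
    """Baseline cost adjusted for a fractional token overhead."""
    return baseline_cost * (1 + overhead)


# Report example: 13,300 English tokens versus 17,500 Dutch tokens.
print(f"{token_overhead(13_300, 17_500):.0%}")  # 32%

# Illustrative $100 English baseline, scaled by the 25-40% range.
low, high = scaled_cost(100.0, 0.25), scaled_cost(100.0, 0.40)
print(f"Dutch budget: ${low:.2f} to ${high:.2f}")
```

Budgeting against the upper end of the range is the more conservative choice when the real Dutch-to-English ratio for your content has not been measured.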
