AI & agents
Context Window Planner
Estimate prompt token usage, reserve output space, and split long documents into agent-ready context chunks locally in your browser.
How to use this tool
Paste the long document, logs, notes, or repo context you want to give to an AI assistant.
Set the model context window, reserved output budget, instruction budget, target chunk size, and overlap.
Copy the chunk manifest when the content is too large to fit comfortably in one prompt.
Why context budgeting matters
Large context windows are not free space. You still need room for system instructions, task instructions, tool outputs, reasoning overhead, and the final answer.
If a prompt fills the entire window, the assistant may have too little room to respond or may lose important details during truncation.
Planning token budgets before a long agent run helps keep source material organized and easier to cite.
Chunking for AI agents
Good chunks preserve headings, nearby paragraphs, and enough overlap for continuity.
Chunk manifests help an agent know which piece it is reading and how much context it contains.
For code or logs, prefer natural boundaries such as files, stack traces, sections, or timestamps instead of arbitrary cuts.
Token estimates are approximate
Different models and tokenizers count text differently.
This planner uses a local approximation that is good enough for budgeting, not exact billing or hard context-limit enforcement.
When precision matters, use the tokenizer for the exact model you plan to call.
Examples
Reserve output space
The available context budget is the remaining space after reserving output and instruction tokens.
InputContext window: 128000, output reserve: 8000, instructions: 2500
Agent chunk header
Output## Chunk 1/4 Estimated tokens: 5800 Paste the first section of source material here.
FAQ
Is the token count exact?
No. It is an approximation based on text length and word count. Exact counts require the tokenizer for a specific model.
Does Lumio upload my document?
No. The token estimate and chunking run locally in your browser and document contents are not stored for analytics.
How much output space should I reserve?
Reserve more space for code patches, long analysis, structured reports, or multi-step agent work. A few thousand tokens is often a practical minimum.
Why use overlap between chunks?
Overlap keeps nearby context available when a section crosses a chunk boundary, which can reduce lost references and abrupt transitions.