
Context Window Management

Part of: Effective AI Utilization — Table of Contents

Every AI model has a finite context window. How you fill that window determines the quality of the output. Stuff it with irrelevant context and you get diluted responses. Trim too aggressively and the model lacks the information it needs.

BrianBot's Approach: Rolling Window

BrianBot's only context management strategy is a 30-day rolling window over the memory summaries loaded into the transcript prompt. Memories older than 30 days are dropped. This is time-based filtering: simple and predictable, but not semantic. A memory from 45 days ago might be more relevant than one from yesterday.
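The rolling window can be sketched in a few lines. This is a minimal illustration, not BrianBot's actual code; the memory record shape (`text`, `created_at`) is assumed for the example.

```python
from datetime import datetime, timedelta, timezone

MEMORY_WINDOW_DAYS = 30  # the 30-day cutoff described above

def within_window(memories, now=None, window_days=MEMORY_WINDOW_DAYS):
    """Keep only memories newer than the rolling-window cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=window_days)
    return [m for m in memories if m["created_at"] >= cutoff]

now = datetime(2026, 3, 1, tzinfo=timezone.utc)
memories = [
    {"text": "recent note", "created_at": now - timedelta(days=2)},
    {"text": "stale note", "created_at": now - timedelta(days=45)},
]
kept = within_window(memories, now=now)
# The 45-day-old memory is dropped regardless of how relevant it is,
# which is exactly the weakness noted above.
```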

Strategies by Sophistication

Time-based filtering (BrianBot today): Drop context older than N days. Cheap, fast, lossy.

Summarization chains: Instead of dropping old context, summarize it. A month of memories becomes a paragraph. Two months becomes a sentence. Recursive summarization preserves information at decreasing fidelity.
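A tiered version of this idea might look like the sketch below. The `summarize` function here is a stand-in (plain truncation); in a real pipeline it would be an LLM summarization call. The tier boundaries and budgets are illustrative, not prescribed.

```python
def summarize(text, max_chars):
    """Stand-in summarizer: truncate. Real code would call a model."""
    return text if len(text) <= max_chars else text[:max_chars].rstrip() + "…"

def compress_history(entries, tiers=((30, None), (60, 200), (9999, 60))):
    """Apply decreasing fidelity by age.

    entries: list of (age_in_days, text)
    tiers:   (max_age_days, char_budget) pairs; None = keep verbatim.
    Recent entries survive intact; older ones shrink to a paragraph,
    then to a sentence-sized summary.
    """
    out = []
    for age_days, text in entries:
        for max_age, budget in tiers:
            if age_days <= max_age:
                out.append(text if budget is None else summarize(text, budget))
                break
    return out
```

The key design choice is that information is never fully discarded, only compressed, so a 45-day-old memory still contributes a trace of itself to the prompt.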

Relevance-based selection: Embed the current query and the available context, select by semantic similarity. This is what RAG systems do (see the Vector Search plan for MythOS). More compute upfront, but dramatically better context quality.
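A minimal sketch of relevance-based selection, using a toy bag-of-words "embedding" and cosine similarity so the example stays self-contained. A real system would call an embedding model and usually a vector index instead.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word-count vector. Stand-in for a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query, chunks, k=2):
    """Return the k context chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "notes on api server deployment",
    "grocery list",
    "api latency report",
]
selected = top_k("deploy the api server", chunks, k=2)
# The grocery list never makes it into the prompt.
```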

Priority-based budgeting: Allocate portions of the context window to different categories: 40% for the primary content, 30% for relevant history, 20% for system instructions, 10% for examples. Enforce the budget by truncating the lowest-priority category first.
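The budgeting scheme above can be enforced with per-category caps, as in this sketch. Sections are represented as token lists and the 40/30/20/10 split from the text is hard-coded; a fuller version would also reallocate unused budget upward and cut the lowest-priority category first when reallocating.

```python
# Nominal share of the context window per category, highest priority first.
SHARES = {"primary": 0.40, "history": 0.30, "system": 0.20, "examples": 0.10}

def fit_to_budget(sections, window_tokens):
    """Trim each category's token list to its share of the window.

    sections: {category_name: list_of_tokens}
    Because the shares sum to 1.0, the trimmed total can never
    exceed window_tokens.
    """
    out = {}
    for name, share in SHARES.items():
        cap = int(window_tokens * share)
        out[name] = sections.get(name, [])[:cap]
    return out

sections = {
    "primary": ["t"] * 60,   # over budget: will be cut to 40% of window
    "history": ["t"] * 10,   # under budget: kept whole
    "system": ["t"] * 30,    # over budget: cut to 20% of window
    "examples": ["t"] * 20,  # over budget: cut to 10% of window
}
trimmed = fit_to_budget(sections, window_tokens=100)
```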

Pre-flight Token Counting

BrianBot doesn't count tokens before submission — a gap that risks silent truncation or API errors on oversized prompts. The fix: use a tokenizer (tiktoken for OpenAI, Anthropic's token counter) to measure the assembled prompt before sending, and apply your context management strategy if it exceeds the model's window minus your maxTokens reservation.
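The pre-flight check itself is small. This sketch uses a crude chars/4 approximation so it stays dependency-free; real code should swap in an exact tokenizer (tiktoken for OpenAI models, or Anthropic's token-counting API) behind the same function.

```python
def count_tokens(text):
    """Rough approximation (~4 chars per token). Replace with an
    exact tokenizer such as tiktoken in production."""
    return max(1, len(text) // 4)

def fits_context(prompt, context_window, max_tokens):
    """True if the prompt plus the reserved completion budget fits."""
    return count_tokens(prompt) + max_tokens <= context_window

prompt = "the fully assembled prompt string"
if not fits_context(prompt, context_window=200_000, max_tokens=4_096):
    # Oversized: apply a context management strategy (trim, summarize,
    # re-rank) before sending, instead of risking silent truncation.
    pass
```

Note the `max_tokens` reservation: the window must hold both the prompt and the response, so the check subtracts the completion budget rather than comparing the prompt against the raw window size.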

The Context Quality Principle

More context is not better context. A focused 2000-token prompt with exactly the right information outperforms a 50000-token prompt with everything possibly relevant. The goal isn't to fill the window — it's to give the model exactly what it needs and nothing more.

Related: Token Optimization Playbook, Prompt Architecture, AI Pipeline Design

🏷️#ai 🏷️#context-window 🏷️#optimization 🏷️#brianbot

Created with 💜 by One Inc | Copyright 2026