Automatic Context Summarization

When your conversation approaches the model's context window limit, PostQode automatically summarizes it to free up space and keep working.

How It Works

Automatic Summarization

PostQode monitors token usage during your conversation. When you're getting close to the limit, it:

  1. Creates a comprehensive summary of everything that's happened
  2. Preserves all the technical details, code changes, and decisions
  3. Replaces the conversation history with the summary
  4. Continues exactly where it left off

When this happens, a summarization tool call appears in the chat view, showing its total cost like any other API call.
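
The four steps above can be sketched as a small loop. This is only an illustrative model, not PostQode's actual implementation: the `Conversation` class, the 75% trigger `threshold`, the word-count tokenizer, and `toy_summarizer` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Conversation:
    context_limit: int                 # model's context window, in tokens
    threshold: float = 0.75            # hypothetical trigger fraction, not a documented value
    messages: List[str] = field(default_factory=list)

    def token_count(self) -> int:
        # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)

    def near_limit(self) -> bool:
        return self.token_count() >= self.threshold * self.context_limit

    def add(self, message: str, summarize: Callable[[List[str]], str]) -> bool:
        """Append a message; if usage is near the limit, replace the history
        with a summary. Returns True when a summarization happened."""
        self.messages.append(message)
        if self.near_limit():
            summary = summarize(self.messages)   # step 1-2: build the summary
            self.messages = [summary]            # step 3: replace the history
            return True
        return False                             # step 4: keep going either way


def toy_summarizer(messages: List[str]) -> str:
    # Placeholder for the LLM-generated summary.
    return f"[summary of {len(messages)} messages]"


convo = Conversation(context_limit=20)
events = [convo.add("edit file a.py now", toy_summarizer) for _ in range(5)]
print(events)          # which turns triggered a summarization
print(convo.messages)  # history after compaction
```

In this toy run each message counts as four tokens, so the fourth message crosses the trigger point and the history collapses to a single summary entry before the fifth message is appended.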

Manual Compacting

In addition to automatic summarization, you can also manually trigger compacting at any time during your conversation. This is useful when:

  • You want to free up context space before starting a new complex task
  • You've completed a major milestone and want to consolidate the conversation
  • You're approaching the context limit and want to control when summarization happens
  • You want to reduce token usage for subsequent requests

To manually compact your conversation:

  1. Click the "Compact Task" button in the chat interface (shown when context usage is high)
  2. Confirm the action in the dialog that appears
  3. PostQode will immediately summarize the conversation and free up context space

When to Use Manual Compacting

  • Before complex tasks: Free up space before asking PostQode to work on large features
  • After milestones: Consolidate context after completing major work
  • Proactive management: Compact before hitting the automatic threshold
  • Cost optimization: Reduce context size to lower token costs on subsequent requests

Why This Matters

Previously, PostQode would truncate older messages when hitting context limits. This meant losing important context from earlier in the conversation.

Now with summarization:

  • All technical decisions and code patterns are preserved
  • File changes and project context remain intact
  • PostQode remembers everything it has done
  • You can work on much larger projects without interruption

tip

Context Summarization works well with Focus Chain. When Focus Chain is enabled, todo lists persist across summarizations, so PostQode can work on long-horizon tasks that span multiple context windows, with the todo list keeping it on track through each reset.

Cost Considerations

Summarization leverages your existing prompt cache from the conversation, so it costs about the same as any other tool call.

Since most input tokens are already cached, you're primarily paying for the summary generation (output tokens), making it very cost-effective.
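
A quick back-of-the-envelope calculation shows why cached input keeps the cost low. The token counts and per-million-token prices below are made-up illustrative numbers, not PostQode's or any provider's actual rates, and `summarization_cost` is a hypothetical helper.

```python
def summarization_cost(
    cached_input_tokens: int,
    uncached_input_tokens: int,
    output_tokens: int,
    cached_price: float,   # USD per million cached input tokens (assumed)
    input_price: float,    # USD per million uncached input tokens (assumed)
    output_price: float,   # USD per million output tokens (assumed)
) -> float:
    """Estimate the USD cost of one summarization call."""
    total = (
        cached_input_tokens * cached_price
        + uncached_input_tokens * input_price
        + output_tokens * output_price
    )
    return total / 1_000_000


# Hypothetical conversation: 150k tokens already cached, 2k new, 4k-token summary.
with_cache = summarization_cost(150_000, 2_000, 4_000, 0.30, 3.00, 15.00)

# Same call if nothing were cached: all 152k input tokens billed at the full rate.
without_cache = summarization_cost(0, 152_000, 4_000, 0.30, 3.00, 15.00)

print(f"with cache:    ${with_cache:.3f}")
print(f"without cache: ${without_cache:.3f}")
```

Under these assumed prices the cached call comes to roughly $0.11, and more than half of that is the output tokens for the summary itself; without the cache the same call would cost roughly $0.52.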

Restoring Context with Checkpoints

You can use checkpoints to restore your task state from before a summarization occurred, so context is never permanently lost: you can always roll back to an earlier version of your conversation.

note

Editing a message before a summarization tool call will work similarly to a checkpoint, allowing you to restore the conversation to that point.
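
Conceptually, restoring from a checkpoint means the pre-summarization history was saved before it was replaced. The sketch below illustrates that idea only; `CheckpointedHistory` and its methods are hypothetical names and do not describe how PostQode stores checkpoints internally.

```python
from typing import List


class CheckpointedHistory:
    """Toy history that snapshots itself before every compaction."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self._checkpoints: List[List[str]] = []

    def checkpoint(self) -> int:
        # Store a copy so later edits to the live history don't alter the snapshot.
        self._checkpoints.append(list(self.messages))
        return len(self._checkpoints) - 1

    def compact(self, summary: str) -> int:
        # Snapshot first, then replace the history with the summary.
        checkpoint_id = self.checkpoint()
        self.messages = [summary]
        return checkpoint_id

    def restore(self, checkpoint_id: int) -> None:
        # Roll the live history back to the saved state.
        self.messages = list(self._checkpoints[checkpoint_id])


history = CheckpointedHistory()
history.messages += ["plan the refactor", "rename module", "fix failing test"]

cp = history.compact("[summary of 3 messages]")
compacted = list(history.messages)   # what the model sees after summarization

history.restore(cp)                  # roll back to the full pre-summary history
print(compacted)
print(history.messages)
```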

Next Generation Model Support

Auto Compact uses LLM-based summarization, which we've found works significantly better with next-generation models. The feature is currently supported for the following models:

  • Claude 4 series
  • Gemini 2.5 series
  • GPT-5
  • Grok 4

note

When using other models, PostQode automatically falls back to the standard rule-based context truncation method, even if Auto Compact is enabled in settings.
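
The fallback rule can be pictured as a simple check on the model family. The identifier prefixes and the `context_strategy` function below are hypothetical, chosen for illustration; real model IDs and PostQode's actual matching logic may differ. The sketch also assumes truncation is used when Auto Compact is turned off.

```python
# Hypothetical ID prefixes standing in for the supported model families.
SUPPORTED_PREFIXES = ("claude-4", "gemini-2.5", "gpt-5", "grok-4")


def context_strategy(model_id: str, auto_compact_enabled: bool) -> str:
    """Return "summarize" only when Auto Compact is on AND the model is supported;
    otherwise fall back to rule-based truncation."""
    supported = model_id.lower().startswith(SUPPORTED_PREFIXES)
    if auto_compact_enabled and supported:
        return "summarize"
    return "truncate"


for model in ("gpt-5", "gemini-2.5-pro", "gpt-4o"):
    print(model, "->", context_strategy(model, auto_compact_enabled=True))
```

The key point is that the setting alone is not enough: an unsupported model still gets truncation even with Auto Compact enabled.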