Compression Strategies
memtomem-stm automatically compresses MCP tool responses by content type to save tokens. It provides 10 strategies that reduce response size while preserving the information the agent needs.
10 Compression Strategies
| Strategy | Target content | Behavior |
|---|---|---|
| truncate | Small text | Length-limited truncation (default fallback) |
| hybrid | Markdown | Preserve structure + abbreviate non-essential sections |
| selective | General text | Keep only query-relevant portions |
| progressive | Large content | Cursor-based sequential delivery (zero information loss) |
| extract_fields | JSON dictionaries | Extract key fields only |
| schema_pruning | JSON arrays | Preserve schema + reduce samples |
| skeleton | API docs | Preserve structural skeleton only |
| llm_summary | Complex text | LLM-based summarization (Ollama/OpenAI) |
| auto | All types | Analyze content and auto-select optimal strategy |
| none | — | Pass through original without compression |
Auto-Selection Logic
The auto strategy (default) analyzes content to pick the optimal strategy:
| Content type | Selected strategy |
|---|---|
| JSON dictionary | extract_fields |
| Large JSON array | schema_pruning |
| Markdown document | hybrid |
| API documentation | skeleton |
| Small text (< threshold) | truncate |
| Other large text | selective |
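
The mapping above can be pictured as a small dispatch function. The following is a minimal sketch, not memtomem-stm's actual detector: the function name `select_strategy`, both thresholds, the order of the checks, and the heuristics for recognizing API docs and Markdown are all assumptions made for illustration.

```python
import json

# Hypothetical thresholds; the real defaults are not stated in this page.
SMALL_TEXT_CHARS = 2_000
LARGE_ARRAY_ITEMS = 20
HTTP_VERBS = ("GET ", "POST ", "PUT ", "PATCH ", "DELETE ")


def select_strategy(content: str) -> str:
    """Return a strategy name following the table above (illustrative heuristics)."""
    try:
        parsed = json.loads(content)
    except ValueError:  # not JSON
        parsed = None

    if isinstance(parsed, dict):
        return "extract_fields"
    if isinstance(parsed, list) and len(parsed) >= LARGE_ARRAY_ITEMS:
        return "schema_pruning"
    if len(content) < SMALL_TEXT_CHARS:
        return "truncate"

    lines = content.splitlines()
    # Crude API-doc check: at least three lines mention an HTTP verb.
    if sum(any(verb in line for verb in HTTP_VERBS) for line in lines) >= 3:
        return "skeleton"
    # Crude Markdown check: at least one heading line.
    if any(line.startswith("#") for line in lines):
        return "hybrid"
    return "selective"
```

In this sketch the size check runs before the Markdown and API-doc checks, so a short Markdown note is truncated rather than sent through hybrid. The table does not say which rule wins in that case, so treat the ordering as a guess.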
Query-Aware Budget Allocation
During compression, the agent’s current query is taken into account: sections relevant to the query receive a larger share of the token budget. For example, if the agent is asking about the “authentication module” while API documentation is being compressed, auth-related endpoints get more space.
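
One way to picture query-aware allocation is a proportional split weighted by term overlap. This is a sketch under stated assumptions: `allocate_budget`, the word-overlap relevance score, and the `floor` parameter are hypothetical, and the scoring memtomem-stm actually uses is not described here.

```python
def allocate_budget(
    sections: dict[str, str],
    query: str,
    total_tokens: int,
    floor: float = 1.0,
) -> dict[str, int]:
    """Split total_tokens across sections in proportion to query-term overlap."""
    terms = set(query.lower().split())
    weights = {}
    for name, text in sections.items():
        words = set(text.lower().split())
        # `floor` keeps sections with no matching terms from getting zero budget.
        weights[name] = floor + len(terms & words)
    total_weight = sum(weights.values())
    return {
        name: int(total_tokens * weight / total_weight)
        for name, weight in weights.items()
    }


sections = {
    "auth": "POST /login authentication module token",
    "billing": "GET /invoices list invoices",
}
budgets = allocate_budget(sections, "authentication module", total_tokens=900)
# budgets == {"auth": 675, "billing": 225}
```

The auth section matches both query terms, so its weight is 3 against 1 for billing, and it receives three quarters of the 900-token budget.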
Zero Information Loss: Progressive Delivery
The progressive strategy delivers large content without any information loss:
- First response delivers a table of contents (TOC) and the first chunk
- Agent requests more → cursor-based delivery of subsequent chunks
- Full content can be inspected sequentially
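
The steps above can be sketched as a cursor over the original text. This is an illustration only: the `ProgressiveResponse` shape, the `deliver` function, the character-offset cursor, and the heading-based TOC are assumptions, not the tool's real response format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressiveResponse:
    toc: list[str]              # headings of the full content (first response only)
    chunk: str                  # the slice delivered in this response
    next_cursor: Optional[int]  # offset to request next, or None when finished


def deliver(content: str, cursor: int = 0, chunk_chars: int = 1000) -> ProgressiveResponse:
    """Return the chunk starting at `cursor`, plus a TOC on the first call."""
    toc = []
    if cursor == 0:
        toc = [line for line in content.splitlines() if line.startswith("#")]
    end = min(cursor + chunk_chars, len(content))
    next_cursor = end if end < len(content) else None
    return ProgressiveResponse(toc, content[cursor:end], next_cursor)
```

An agent would call `deliver(content)` once, then keep calling `deliver(content, cursor=resp.next_cursor)` until `next_cursor` is `None`. Concatenating the chunks reproduces the original content exactly, which is what makes the delivery lossless.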
Fallback Ladder
When the compression ratio guardrail (default 65% preservation) is violated, a 3-tier fallback activates automatically:
progressive → hybrid → truncate

Each tier checks the guardrail; if it is satisfied, that strategy’s output is used.
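
The ladder can be sketched as a loop over tiers that stops at the first output passing the guardrail. Several things here are assumptions: that "65% preservation" means the output keeps at least 65% of the original length, that length is measured in characters, and that the last tier's output is used when no tier passes. The names `run_ladder` and `preserved_ratio` are hypothetical.

```python
from typing import Callable

Compressor = Callable[[str], str]


def preserved_ratio(original: str, compressed: str) -> float:
    """Fraction of the original length that survives compression."""
    return len(compressed) / len(original) if original else 1.0


def run_ladder(
    original: str,
    ladder: list[tuple[str, Compressor]],
    min_preserved: float = 0.65,
) -> tuple[str, str]:
    """Return (strategy_name, output) from the first tier that meets the guardrail."""
    name, output = "none", original
    for name, compress in ladder:
        output = compress(original)
        if preserved_ratio(original, output) >= min_preserved:
            return name, output
    # No tier satisfied the guardrail: keep the last tier's output (assumption).
    return name, output
```

The ladder is passed in as `(name, function)` pairs, in the order progressive, hybrid, truncate, so the guardrail logic stays independent of how each strategy is implemented.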
Compression Budget Tuning
Agent feedback automatically adjusts per-tool compression budgets:
- Agent reports information loss → Increase preservation ratio for that tool
- Agent reports response too long → Decrease preservation ratio
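
The two feedback rules above amount to a per-tool ratio that moves up or down in small steps. The following is a minimal sketch: the `BudgetTuner` class, the feedback labels `"information_loss"` and `"too_long"`, the 0.05 step, and the clamping bounds are all hypothetical, and the 0.65 starting ratio simply reuses the guardrail default mentioned earlier.

```python
from collections import defaultdict


class BudgetTuner:
    """Track a preservation ratio per tool and nudge it on agent feedback."""

    def __init__(self, default_ratio: float = 0.65, step: float = 0.05,
                 lo: float = 0.1, hi: float = 1.0) -> None:
        self.step, self.lo, self.hi = step, lo, hi
        self.ratios: dict[str, float] = defaultdict(lambda: default_ratio)

    def report(self, tool: str, feedback: str) -> float:
        ratio = self.ratios[tool]
        if feedback == "information_loss":
            ratio += self.step   # preserve more next time
        elif feedback == "too_long":
            ratio -= self.step   # compress harder next time
        else:
            raise ValueError(f"unknown feedback: {feedback!r}")
        # Clamp so repeated feedback cannot push the ratio out of range.
        self.ratios[tool] = min(self.hi, max(self.lo, ratio))
        return self.ratios[tool]
```

Each tool keeps its own ratio, so feedback about one noisy tool does not change how responses from other tools are compressed.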