Hybrid Search
New to memtomem? Start with the Memory Persistence Across Sessions tutorial to see the basic search flow in action first.
memtomem’s hybrid search combines keyword search and semantic search, leveraging both exact term matching and meaning-based similarity in a single query.
Why both?
Keyword search finds exact names like mem_search or FastAPI, which a vector model often misses because their meaning isn’t distributed across the embedding space. Semantic search finds ideas like “how do I deploy?” that match documents using different wording. Running both and merging the rankings covers both cases.
Search Architecture
Hybrid search runs two search engines in parallel, then fuses their rankings in a third stage:
| Engine | Based on | Strength |
|---|---|---|
| BM25 | SQLite FTS5 | Exact keyword/term matching. Strong for unique identifiers like “FastAPI”, “mem_search” |
| Vector search | sqlite-vec + ONNX/Ollama/OpenAI embeddings | Semantic similarity. Can match “how to deploy” → “deployment checklist” |
| RRF fusion | Reciprocal Rank Fusion | Combines rankings from both engines into a final score |
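The fusion step can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not memtomem’s implementation: the function name rrf_fuse, the document IDs, and the constant k=60 (a commonly used RRF default) are assumptions for illustration only.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two engines
bm25_hits = ["doc_fastapi", "doc_deploy", "doc_notes"]
vector_hits = ["doc_deploy", "doc_checklist", "doc_fastapi"]

fused = rrf_fuse([bm25_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Because RRF uses only rank positions, the BM25 and vector scores never need to be normalized against each other. A document that ranks well in both lists (doc_deploy here) rises above one that tops only a single list.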
Reranker Pool Tuning
When reranking is enabled, the reranker sees a candidate pool of size max(min_pool, min(max_pool, int(oversample * response_top_k))). The defaults (oversample 2.0, min_pool 20, max_pool 200) give the classic 2× oversample at top_k=10 and scale up with larger top_k requests. Tune with:
| Key | Env var | Default | Notes |
|---|---|---|---|
| rerank.oversample | MEMTOMEM_RERANK__OVERSAMPLE | 2.0 | Pool multiplier over response_top_k |
| rerank.min_pool | MEMTOMEM_RERANK__MIN_POOL | 20 | Floor: the reranker never sees fewer candidates |
| rerank.max_pool | MEMTOMEM_RERANK__MAX_POOL | 200 | Cap: prevents runaway cost on large top_k |
These keys are tunable at runtime, for example with mm config set rerank.oversample 3.0. rerank.top_k is deprecated; use rerank.min_pool instead.
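To see how the three settings interact, the pool-size formula above can be evaluated directly. The function name rerank_pool_size is hypothetical; only the formula and the default values come from this page.

```python
def rerank_pool_size(response_top_k, oversample=2.0, min_pool=20, max_pool=200):
    """Candidate pool size the reranker sees, per the formula on this page."""
    return max(min_pool, min(max_pool, int(oversample * response_top_k)))


# Defaults: small requests hit the floor, large requests hit the cap
pool_sizes = {top_k: rerank_pool_size(top_k) for top_k in (5, 10, 50, 500)}
print(pool_sizes)

# Raising oversample to 3.0 widens the pool for the same top_k
print(rerank_pool_size(10, oversample=3.0))
```

With the defaults, top_k values of 5, 10, 50, and 500 give pools of 20, 20, 100, and 200. Setting oversample to 3.0 at top_k=10 raises the pool from 20 to 30.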
Semantic Chunking
During indexing, supported documents are split into meaningful units by structure before the short-section merge pass runs.
| Chunker | Target | Behavior |
|---|---|---|
| Markdown | .md files | Split by heading level, preserving hierarchy |
| Structured data | .json, .yaml, .yml, .toml files | Top-level key splitting, with recursive mode available via config |
| Code | .py, .js, .ts, .tsx, .jsx files | Function / class-aware splitting when code chunking extras are installed |
Very short sections are greedily packed with adjacent siblings up to indexing.target_chunk_tokens (default 384) to keep each chunk informative enough to retrieve. Set target_chunk_tokens=0 to disable the pass and keep every small section as its own chunk.
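The merge pass can be pictured with a simplified sketch. It assumes all sections are siblings, joins them with blank lines, and counts tokens by splitting on whitespace. The real tokenizer, sibling rules, and the name pack_short_sections are not taken from memtomem.

```python
def pack_short_sections(sections, target_chunk_tokens=384,
                        count_tokens=lambda text: len(text.split())):
    """Greedily pack adjacent sections until the token target would be exceeded.

    Whitespace token counting is a stand-in for a real tokenizer.
    target_chunk_tokens=0 disables packing and keeps every section separate.
    """
    if target_chunk_tokens == 0:
        return list(sections)

    chunks = []
    buffer, buffer_tokens = [], 0
    for section in sections:
        n = count_tokens(section)
        # Flush the buffer if adding this section would overshoot the target
        if buffer and buffer_tokens + n > target_chunk_tokens:
            chunks.append("\n\n".join(buffer))
            buffer, buffer_tokens = [], 0
        buffer.append(section)
        buffer_tokens += n
    if buffer:
        chunks.append("\n\n".join(buffer))
    return chunks


sections = ["a b c", "d e", "f g h i", "j"]
packed = pack_short_sections(sections, target_chunk_tokens=5)
unpacked = pack_short_sections(sections, target_chunk_tokens=0)
print(packed)
print(len(unpacked))
```

In this sketch a section that is already larger than the target is never split; it simply becomes its own chunk.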
Directory indexing is extension-filtered. If a file type is not chunked by the active registry, it is skipped rather than indexed as plain text.
Incremental Indexing
Instead of full re-indexing, only changed chunks are updated:
- Store SHA-256 hash for each chunk
- On re-index, compare hashes to detect changes
- Only re-embed changed chunks
This minimizes indexing cost even for large document sets.
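The hash comparison can be sketched as follows. The helper names and the assumption that chunks are keyed by a stable chunk ID are illustrative; how memtomem identifies chunks and stores hashes is not described on this page.

```python
import hashlib


def chunk_hash(text):
    """SHA-256 hex digest of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(stored_hashes, current_chunks):
    """Decide which chunks need re-embedding.

    stored_hashes: {chunk_id: sha256 hex} from the previous index run (assumed layout).
    current_chunks: {chunk_id: text} produced by chunking the files now.
    Returns (to_embed, to_delete, new_hashes).
    """
    new_hashes = {cid: chunk_hash(text) for cid, text in current_chunks.items()}
    # New or modified chunks: hash missing or different
    to_embed = [cid for cid, h in new_hashes.items() if stored_hashes.get(cid) != h]
    # Chunks that no longer exist in the source
    to_delete = [cid for cid in stored_hashes if cid not in new_hashes]
    return to_embed, to_delete, new_hashes


stored = {
    "notes.md#intro": chunk_hash("alpha"),
    "notes.md#setup": chunk_hash("beta"),
    "notes.md#old": chunk_hash("obsolete"),
}
current = {
    "notes.md#intro": "alpha",       # unchanged, skipped
    "notes.md#setup": "beta v2",     # modified, re-embedded
    "notes.md#deploy": "gamma",      # new, embedded
}

to_embed, to_delete, new_hashes = plan_reindex(stored, current)
print(to_embed, to_delete)
```

Only the modified and new chunks reach the embedding model, which is usually the most expensive part of indexing.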
Namespaces
Organize memories into scoped groups:
- Namespace auto-derived from folder names
- Filter by namespace when searching
- Support agent-level isolation and sharing in multi-agent environments
Maintenance Features
- Near-duplicate detection: automatically identify memories with nearly identical content
- Time-based decay: gradually decrease search weight for older memories
- TTL expiration: automatically delete memories past their configured lifespan
- Auto-tagging: automatically assign tags based on content analysis