
Hybrid Search

New to memtomem? Start with the Memory Persistence Across Sessions tutorial to see the basic search flow in action first.

memtomem’s hybrid search combines keyword search and semantic search, leveraging both exact term matching and meaning-based similarity in a single query.

Keyword search finds exact names like mem_search or FastAPI, which a vector model often misses because a rare identifier carries little meaning that an embedding can capture. Semantic search finds ideas like “how do I deploy?” that match documents using different wording. Running both and merging the rankings covers both cases.

Hybrid search runs two search engines in parallel and then fuses their rankings:

| Engine | Based on | Strength |
| --- | --- | --- |
| BM25 | SQLite FTS5 | Exact keyword/term matching. Strong for unique identifiers like “FastAPI”, “mem_search” |
| Vector search | sqlite-vec + ONNX/Ollama/OpenAI embeddings | Semantic similarity. Can match “how to deploy” → “deployment checklist” |
| RRF fusion | Reciprocal Rank Fusion | Combines rankings from both engines into a final score |
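
As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion in its textbook form. It is not memtomem's internal code: the function name `rrf_fuse`, the document IDs, and the smoothing constant `k=60` (a commonly used default for RRF) are assumptions made for the example.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs (best first) with RRF.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is a commonly used constant, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical result lists from the two engines.
bm25_hits = ["api.md", "deploy.md", "notes.md"]
vector_hits = ["deploy.md", "checklist.md", "api.md"]

fused = rrf_fuse([bm25_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)  # ['deploy.md', 'api.md', 'checklist.md', 'notes.md']
```

Documents ranked well by both engines rise to the top, while a document found by only one engine still gets a nonzero score. RRF uses only ranks, so the BM25 and vector scores never need to be put on a common scale.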

When reranking is enabled, the reranker sees a candidate pool of size max(min_pool, min(max_pool, int(oversample * response_top_k))). The defaults (oversample 2.0, min_pool 20, max_pool 200) give the classic 2× oversample at top_k=10 and scale up with larger top_k requests. Tune with:

| Key | Env var | Default | Notes |
| --- | --- | --- | --- |
| rerank.oversample | MEMTOMEM_RERANK__OVERSAMPLE | 2.0 | Pool multiplier over response_top_k |
| rerank.min_pool | MEMTOMEM_RERANK__MIN_POOL | 20 | Floor: the reranker never sees fewer candidates |
| rerank.max_pool | MEMTOMEM_RERANK__MAX_POOL | 200 | Cap: prevents runaway cost on large top_k |

These settings can be changed at runtime, for example with mm config set rerank.oversample 3.0. The rerank.top_k setting is deprecated; use rerank.min_pool instead.
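
To see how the three settings interact, the pool-size formula above can be written as a small function. This is a sketch of the documented formula only; the function name `rerank_pool_size` is hypothetical and not part of memtomem's API.

```python
def rerank_pool_size(response_top_k, oversample=2.0, min_pool=20, max_pool=200):
    """Candidate pool size given to the reranker, per the documented formula."""
    return max(min_pool, min(max_pool, int(oversample * response_top_k)))

# Defaults: 2x oversample at top_k=10.
print(rerank_pool_size(10))                   # 20
# Small requests are lifted to the min_pool floor.
print(rerank_pool_size(5))                    # 20
# Larger requests scale with oversample.
print(rerank_pool_size(50))                   # 100
# Very large requests are capped at max_pool.
print(rerank_pool_size(500))                  # 200
# Raising oversample to 3.0 widens the pool.
print(rerank_pool_size(10, oversample=3.0))   # 30
```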

During indexing, supported documents are split into meaningful units by structure before the short-section merge pass runs.

| Chunker | Target | Behavior |
| --- | --- | --- |
| Markdown | .md files | Split by heading level, preserving hierarchy |
| Structured data | .json, .yaml, .yml, .toml files | Top-level key splitting, with recursive mode available via config |
| Code | .py, .js, .ts, .tsx, .jsx files | Function / class-aware splitting when code chunking extras are installed |

Very short sections are greedily packed with adjacent siblings up to indexing.target_chunk_tokens (default 384) to keep each chunk informative enough to retrieve. Set target_chunk_tokens=0 to disable the pass and keep every small section as its own chunk.
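
The merge pass can be pictured as a simple greedy loop. The sketch below rests on several assumptions: sections arrive as a flat list of sibling strings, tokens are counted by whitespace splitting (the real tokenizer will differ), and merged sections are joined with a blank line. The name `pack_short_sections` is hypothetical.

```python
def pack_short_sections(sections, target_chunk_tokens=384,
                        count_tokens=lambda text: len(text.split())):
    """Greedily pack adjacent sections into chunks of at most target_chunk_tokens.

    A section that is already larger than the target becomes its own chunk.
    target_chunk_tokens=0 disables packing, so every section stays separate.
    """
    if target_chunk_tokens <= 0:
        return list(sections)

    chunks, current, current_tokens = [], [], 0
    for section in sections:
        n = count_tokens(section)
        # Close the running chunk if adding this section would exceed the target.
        if current and current_tokens + n > target_chunk_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(section)
        current_tokens += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

sections = ["a b", "c d", "e f g"]
packed = pack_short_sections(sections, target_chunk_tokens=4)
print(packed)  # ['a b\n\nc d', 'e f g']
unpacked = pack_short_sections(sections, target_chunk_tokens=0)
print(unpacked)  # ['a b', 'c d', 'e f g']
```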

Directory indexing is extension-filtered. If a file type is not chunked by the active registry, it is skipped rather than indexed as plain text.

Instead of full re-indexing, only changed chunks are updated:

  1. Store SHA-256 hash for each chunk
  2. On re-index, compare hashes to detect changes
  3. Only re-embed changed chunks

This minimizes indexing cost even for large document sets.
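
The three steps above can be sketched with Python's standard hashlib. The data layout (a dict mapping chunk IDs to stored hashes) and the names `chunk_hash` and `plan_reindex` are assumptions for illustration; memtomem's actual storage schema is not shown here.

```python
import hashlib

def chunk_hash(text):
    """SHA-256 hex digest of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_reindex(stored_hashes, chunks):
    """Compare stored hashes with current chunk content.

    stored_hashes: dict of chunk_id -> hash from the previous index run.
    chunks: dict of chunk_id -> current chunk text.
    Returns (ids to re-embed, ids to delete, new hash table).
    """
    new_hashes = {cid: chunk_hash(text) for cid, text in chunks.items()}
    # New or modified chunks need a fresh embedding.
    to_embed = sorted(cid for cid, h in new_hashes.items()
                      if stored_hashes.get(cid) != h)
    # Chunks that no longer exist can be dropped from the index.
    to_delete = sorted(cid for cid in stored_hashes if cid not in new_hashes)
    return to_embed, to_delete, new_hashes

stored = {"a": chunk_hash("hello"), "b": chunk_hash("world"), "c": chunk_hash("gone")}
current = {"a": "hello", "b": "world!", "d": "new"}

to_embed, to_delete, new_hashes = plan_reindex(stored, current)
print(to_embed)   # ['b', 'd']
print(to_delete)  # ['c']
```

Only chunks "b" (edited) and "d" (new) would be sent to the embedding model; the unchanged chunk "a" keeps its existing vector.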

Organize memories into scoped groups:

  • Namespace auto-derived from folder names
  • Filter by namespace when searching
  • Support agent-level isolation and sharing in multi-agent environments
  • Near-duplicate detection — automatically identify memories with nearly identical content
  • Time-based decay — gradually decrease search weight for older memories
  • TTL expiration — automatically delete memories past their configured lifespan
  • Auto-tagging — automatically assign tags based on content analysis