feat: graph-enhanced retrieval with PPR and community detection by 2233admin · Pull Request #395 · NevaMind-AI/memU

2233admin · 2026-03-28T07:05:56Z

## Summary

Adds graph-enhanced retrieval that layers a knowledge graph on top of existing vector search for more contextual memory recall. When enabled, the retrieve pipeline fuses vector similarity scores with graph-based Personalized PageRank (PPR) scores, surfacing memories that are both semantically relevant and structurally connected.

## Core Changes

### Phase 1: GraphStore Module
- Domain models: GraphNode, GraphEdge, GraphCommunity (in database/models.py)
- ORM models: SQLAlchemy models registered in schema.py via get_sqlalchemy_models()
- Alembic migration: 001_add_graph_tables.py creates gm_nodes, gm_edges, gm_communities tables (idempotent with IF NOT EXISTS)
- GraphStore repository (repositories/graph_store.py): Full CRUD, load_graph() for in-memory NetworkX representation, PPR/LPA algorithms, dual-path recall (precise entity match + community expansion)

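For readers unfamiliar with PPR, the ranking idea in Phase 1 can be illustrated with a small power iteration. This is a minimal pure-Python sketch, not the code in graph_store.py: the function name `personalized_pagerank`, the damping factor of 0.85, the fixed iteration count, and the choice to treat edges as undirected are all assumptions made for illustration.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PPR over an undirected graph given as (u, v) pairs.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)

    # Teleport distribution: uniform over the seed nodes found in the graph.
    present = [s for s in seeds if s in neighbors]
    if not present:
        return {}
    teleport = {n: 0.0 for n in nodes}
    for s in present:
        teleport[s] += 1.0 / len(present)

    scores = dict(teleport)
    for _ in range(iterations):
        # Each step restarts at the seeds with probability (1 - damping)...
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        # ...and otherwise spreads each node's score evenly to its neighbors.
        for n in nodes:
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        scores = nxt
    return scores


# Example: a path graph a - b - c - d, seeded at "a".
example_scores = personalized_pagerank(
    [("a", "b"), ("b", "c"), ("c", "d")], seeds=["a"]
)
ranked = sorted(example_scores, key=example_scores.get, reverse=True)
```

Because the random walk keeps restarting at the seed nodes, nodes close to the seeds accumulate more score than distant ones, which is what lets structurally connected memories rank higher.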
### Phase 2: Retrieve Pipeline Integration
- RetrieveGraphConfig in settings.py: enabled (default: False), weight (default: 0.3), max_nodes (default: 5)
- recall_graph WorkflowStep: Injected before build_context in the retrieve pipeline; it finds seed nodes from the query, expands via communities, and ranks by PPR
- Score fusion in _rag_build_context: final_score = α * vector_score + β * graph_score where α + β = 1.0 (β is the configured weight)
- Graph nodes included in retrieve response for transparency

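The fusion formula above can be sketched as follows. This is an illustrative sketch, not the body of _rag_build_context: the function name `fuse_scores`, the dict-based inputs, and the decision to treat a missing score as 0.0 are assumptions, and it presumes both scores are already on a comparable scale.

```python
def fuse_scores(vector_scores, graph_scores, weight=0.3):
    """Blend per-item scores as (1 - weight) * vector + weight * graph.

    Hypothetical helper; `weight` plays the role of beta, so alpha = 1 - weight.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    alpha, beta = 1.0 - weight, weight

    fused = {}
    for item_id in set(vector_scores) | set(graph_scores):
        v = vector_scores.get(item_id, 0.0)  # assumption: missing score counts as 0
        g = graph_scores.get(item_id, 0.0)
        fused[item_id] = alpha * v + beta * g

    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Example: "m2" is weaker on vector similarity but strongly connected in the graph.
example_ranking = fuse_scores({"m1": 0.9, "m2": 0.5}, {"m2": 1.0}, weight=0.3)
```

With weight set to 0.0 the ranking reduces to plain vector order, which matches the intent that the graph contribution is purely additive to existing behavior.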
### Phase 3: Tests
- 30 unit tests covering: PPR (8), Global PageRank (3), LPA (4), merge results (4), score fusion (3), config (4), domain models (3), ORM models (1)

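Of the algorithms these tests exercise, LPA (label propagation for community detection) is the least self-explanatory, so a small sketch may help. It is not the PR's implementation: the function name `label_propagation`, the asynchronous update order, the round limit, and the smallest-label tie-break are assumptions chosen to keep the example deterministic.

```python
import random
from collections import Counter, defaultdict


def label_propagation(edges, max_rounds=20, seed=0):
    """Assign community labels by repeatedly adopting the most common neighbor label.

    Hypothetical helper for illustration; node ids must be mutually comparable.
    """
    rng = random.Random(seed)
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Every node starts in its own community.
    labels = {n: n for n in neighbors}
    order = sorted(neighbors)

    for _ in range(max_rounds):
        rng.shuffle(order)
        changed = False
        for n in order:
            counts = Counter(labels[m] for m in neighbors[n])
            top = max(counts.values())
            # Tie-break by smallest label so the result is reproducible.
            best = min(label for label, c in counts.items() if c == top)
            if labels[n] != best:
                labels[n] = best
                changed = True
        if not changed:
            break
    return labels


# Example: two disconnected triangles should end up as two communities.
example_labels = label_propagation(
    [("a", "b"), ("b", "c"), ("a", "c"), ("x", "y"), ("y", "z"), ("x", "z")]
)
```

Labels can only travel along edges, so disconnected groups never share a community, and densely connected groups tend to settle on a single label.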
## Configuration

```python
from memu import MemU

m = MemU.from_config(
    retrieve_graph={
        "enabled": True,
        "weight": 0.3,  # graph contribution to final score
        "max_nodes": 5  # max graph nodes per recall
    }
)
```

Default: disabled, so there is zero impact on existing users. No graph tables are queried unless enabled=True.

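To make the opt-in behavior concrete, here is a hedged sketch of how such a config could gate the graph lookup. `GraphConfigSketch` and `maybe_recall_graph` are hypothetical names, not the RetrieveGraphConfig class or the recall_graph step from this PR; only the three field names and their defaults come from the description above, and the validation rules are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphConfigSketch:
    """Hypothetical stand-in that mirrors the documented defaults."""

    enabled: bool = False
    weight: float = 0.3
    max_nodes: int = 5

    def __post_init__(self):
        # Assumed validation; the PR does not state its exact rules.
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must be in [0, 1]")
        if self.max_nodes < 1:
            raise ValueError("max_nodes must be at least 1")


def maybe_recall_graph(config, recall_fn, query):
    """Skip the graph lookup entirely when the feature is disabled."""
    if not config.enabled:
        return []  # recall_fn is never called, so no graph tables are touched
    return recall_fn(query)[: config.max_nodes]
```

The early return is the important part: with the default config, the graph recall callable is never invoked.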
## Files Changed

| File | Change |
|------|--------|
| database/models.py | +3 domain dataclasses |
| database/postgres/models.py | +3 SQLAlchemy ORM models |
| database/postgres/schema.py | Register graph models |
| database/postgres/postgres.py | Wire GraphStore into Database |
| database/postgres/repositories/graph_store.py | New: 800 LOC repository |
| database/postgres/migrations/versions/001_add_graph_tables.py | New: Alembic migration |
| app/settings.py | RetrieveGraphConfig dataclass |
| app/retrieve.py | recall_graph step + score fusion |
| README.md | Graph-enhanced retrieval section |
| tests/test_graph_store.py | New: 30 tests |

## Known Limitations

These are pre-existing design constraints, not introduced by this PR:

1. ddl_mode="validate" still runs Alembic upgrade(); the migration is safe (it uses IF NOT EXISTS), but this behavior predates this PR
2. Migration hard-codes user_id as the scope column; projects using dynamic scope models may need to adjust

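Why IF NOT EXISTS makes a repeated upgrade harmless can be shown with a small stand-in. The sketch below uses the standard-library sqlite3 module instead of PostgreSQL and Alembic, so it is not the migration from this PR; the table names and the user_id scope column come from the description, while every other column is a hypothetical placeholder.

```python
import sqlite3

# Hypothetical column layouts; only the table names and user_id are from the PR.
GRAPH_DDL = [
    "CREATE TABLE IF NOT EXISTS gm_nodes ("
    "id TEXT PRIMARY KEY, user_id TEXT, name TEXT)",
    "CREATE TABLE IF NOT EXISTS gm_edges ("
    "id TEXT PRIMARY KEY, user_id TEXT, source_id TEXT, target_id TEXT)",
    "CREATE TABLE IF NOT EXISTS gm_communities ("
    "id TEXT PRIMARY KEY, user_id TEXT, label TEXT)",
]


def apply_graph_ddl(conn):
    """Create the graph tables; safe to call any number of times."""
    for statement in GRAPH_DDL:
        conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")
apply_graph_ddl(conn)
apply_graph_ddl(conn)  # second run is a no-op rather than an error

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Without the IF NOT EXISTS clause, the second call would fail with a "table already exists" error, which is the failure mode the idempotent migration avoids.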
## Breaking Changes

None. Graph retrieval is fully opt-in via configuration. Existing retrieve behavior is unchanged when graph.enabled=False (the default).

- GraphNode/GraphEdge/GraphCommunity domain models + SQLModel ORM
- GraphStore repository with CRUD + dual-path graph recall + PPR/LPA
- Alembic migration for gm_* tables with scope column support
- Wired into PostgresStore alongside existing repos
- 77 existing tests still passing

- RetrieveGraphConfig: enabled, weight (β), max_nodes
- recall_graph WorkflowStep in RAG workflow
- Score fusion in _rag_build_context: vector*α + graph*β
- graph_nodes[] in retrieve response
- 77 tests pass, E2E verified with live PG data

Migration file was lost from working tree (present in e844af4 but not in subsequent working directory state). DB was already at 002_relation_category revision with the schema applied; alembic failed on init because it could not locate the revision file. Restored via: git checkout e844af4 -- <migration_path>

2233admin added 9 commits March 28, 2026 15:04

test: add 30 unit tests for graph store (Phase 3)

ad1c9cb

Tests cover: PPR algorithm (8), global PageRank (3), LPA community detection (4), merge results (4), score fusion (3), config (4), domain models (3), ORM registration (1). All pure-Python, no DB needed.

feat: add graph_store attribute to database interfaces and implementations

e23481b

feat: add scope filtering to graph store edge queries and community creation

68e8f49

feat: add admission gate for memorize workflow with configurable quality filtering

95046d6

feat: add AdmissionGate class and tests (admission gate feature)

d074885

- admission.py: AdmissionGate with noise pattern detection, min_length check, configurable threshold
- test_admission.py: 14 tests covering disabled pass-through, min_length, noise blocking, custom patterns

fix(lint): resolve ruff errors in graph_store (C416/S311/E741/F841)

e0a592e
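
The admission gate from commits 95046d6 and d074885 is only described in one line, so the following is a rough sketch of the checks it names (disabled pass-through, min_length, noise patterns, custom patterns). `AdmissionGateSketch`, its field defaults, and the sample noise regex are hypothetical, and the "configurable threshold" is left out because the commit message does not say what it measures.

```python
import re
from dataclasses import dataclass, field


def _default_noise_patterns():
    # Hypothetical default: short acknowledgements that carry no memory value.
    return [r"^(ok|thanks|lol|hi)[.!]*$"]


@dataclass
class AdmissionGateSketch:
    """Hypothetical stand-in for the AdmissionGate in admission.py."""

    enabled: bool = True
    min_length: int = 10
    noise_patterns: list = field(default_factory=_default_noise_patterns)

    def admit(self, text):
        """Return True if the text should proceed to the memorize workflow."""
        if not self.enabled:
            return True  # disabled gate passes everything through
        stripped = text.strip()
        if len(stripped) < self.min_length:
            return False
        # Reject text matching any configured noise pattern.
        return not any(
            re.search(pattern, stripped, re.IGNORECASE)
            for pattern in self.noise_patterns
        )
```

A caller would construct the gate once with its own patterns, for example `AdmissionGateSketch(noise_patterns=[r"^test"])`, and call `admit()` on each candidate memory before storing it.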

2233admin wants to merge 9 commits into NevaMind-AI:main from 2233admin:feat/graph-enhanced-retrieval
