
MemU

A Future-Oriented Agentic Memory System



MemU is an agentic memory framework for LLM and AI agent backends. It ingests multimodal inputs (conversations, documents, images), extracts structured memory from them, and organizes the results into a hierarchical file system that supports both embedding-based (RAG) and non-embedding (LLM) retrieval.


MemU is collaborating with four open-source projects to launch the 2026 New Year Challenge. 🎉 Between January 8–18, contributors can submit PRs to memU and earn cash rewards, community recognition, and platform credits. 🎁 Learn more & get involved

✨ Core Features

| Feature | Description |
| --- | --- |
| 🗂️ Hierarchical File System | Three-layer architecture: Resource → Item → Category with full traceability |
| 🔍 Dual Retrieval Methods | RAG (embedding-based) for speed, LLM (non-embedding) for deep semantic understanding |
| 🎨 Multimodal Support | Process conversations, documents, images, audio, and video |
| 🔄 Self-Evolving Memory | Memory structure adapts and improves based on usage patterns |

πŸ—‚οΈ Hierarchical File System

MemU organizes memory using a three-layer architecture inspired by hierarchical storage systems:


| Layer | Description | Examples |
| --- | --- | --- |
| Resource | Raw multimodal data warehouse | JSON conversations, text documents, images, videos |
| Item | Discrete extracted memory units | Individual preferences, skills, opinions, habits |
| Category | Aggregated textual memory with summaries | preferences.md, work_life.md, relationships.md |

Key Benefits:

- Full Traceability: Track from raw data → items → categories and back
- Progressive Summarization: Each layer provides increasingly abstracted views
- Flexible Organization: Categories evolve based on content patterns
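
As an illustration, a memory store organized this way might look like the following layout (a hypothetical sketch for intuition only; the actual on-disk structure is determined by the configured storage backend):

```
memory/
├── resources/            # Layer 1: raw multimodal data (JSON chats, docs, images, ...)
├── items/                # Layer 2: discrete extracted memory units
└── categories/           # Layer 3: aggregated summaries
    ├── preferences.md
    ├── work_life.md
    └── relationships.md
```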

🎨 Multimodal Support

MemU processes diverse content types into unified memory:

| Modality | Input | Processing |
| --- | --- | --- |
| conversation | JSON chat logs | Extract preferences, opinions, habits, relationships |
| document | Text files (.txt, .md) | Extract knowledge, skills, facts |
| image | PNG, JPG, etc. | Vision model extracts visual concepts and descriptions |
| video | Video files | Frame extraction + vision analysis |
| audio | Audio files | Transcription + text processing |

All modalities are unified into the same three-layer hierarchy, enabling cross-modal retrieval.
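
Because every modality flows through the same API, an image is memorized with the same call used for conversations (a sketch using the memorize() signature from the Core APIs section below; the file path is illustrative):

```python
# Memorize an image: a vision model extracts visual concepts, and the
# results land in the same Resource -> Item -> Category hierarchy
result = await service.memorize(
    resource_url="path/to/diagram.png",  # illustrative path
    modality="image",
    user={"user_id": "123"},
)
```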


🚀 Quick Start

Option 1: Cloud Version

Try MemU instantly without any setup:

👉 memu.so - Hosted cloud service with full API access

For enterprise deployment and custom solutions, contact [email protected]

Cloud API (v3)

Base URL: `https://api.memu.so`
Auth: `Authorization: Bearer YOUR_API_KEY`

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/v3/memory/memorize` | Register a memorization task |
| GET | `/api/v3/memory/memorize/status/{task_id}` | Get task status |
| POST | `/api/v3/memory/categories` | List memory categories |
| POST | `/api/v3/memory/retrieve` | Retrieve memories (semantic search) |
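
A minimal sketch of calling these endpoints from Python. The endpoints and auth header come from the table above; the request payload shape and the task_id response field are assumptions, so consult the full API documentation for the exact schemas:

```python
import os
import requests

BASE_URL = "https://api.memu.so"
HEADERS = {"Authorization": f"Bearer {os.environ['MEMU_API_KEY']}"}

# Register a memorization task (payload fields are hypothetical)
resp = requests.post(
    f"{BASE_URL}/api/v3/memory/memorize",
    headers=HEADERS,
    json={"resource_url": "path/to/file.json", "modality": "conversation"},
)
resp.raise_for_status()
task_id = resp.json()["task_id"]  # response field name is an assumption

# Poll the task status endpoint until processing completes
status = requests.get(
    f"{BASE_URL}/api/v3/memory/memorize/status/{task_id}",
    headers=HEADERS,
).json()
print(status)
```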

📚 Full API Documentation


Option 2: Self-Hosted

Installation

git clone https://github.com/NevaMind-AI/memU.git
cd memU
pip install -e .

Basic Example

Requirements: Python 3.13+ and an OpenAI API key

Test with In-Memory Storage (no database required):

export OPENAI_API_KEY=your_api_key
cd tests
python test_inmemory.py

Test with PostgreSQL Storage (requires pgvector):

# Start PostgreSQL with pgvector
docker run -d \
  --name memu-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=memu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Run the test
export OPENAI_API_KEY=your_api_key
cd tests
python test_postgres.py

Both examples demonstrate the complete workflow:

  1. Memorize: Process a conversation file and extract structured memory
  2. Retrieve (RAG): Fast embedding-based search
  3. Retrieve (LLM): Deep semantic understanding search

See tests/test_inmemory.py and tests/test_postgres.py for the full source code.
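
For orientation, the workflow condenses to roughly the following (a hypothetical sketch: the MemUService constructor arguments are omitted and the data path is illustrative; the test files linked above contain the real code):

```python
import asyncio
from memu import MemUService

async def main():
    service = MemUService()  # in-memory storage; configuration omitted here

    # 1. Memorize: extract structured memory from a conversation file
    await service.memorize(
        resource_url="tests/data/conversation.json",  # illustrative path
        modality="conversation",
        user={"user_id": "123"},
    )

    # 2-3. Retrieve: both the RAG and LLM strategies go through retrieve()
    result = await service.retrieve(
        queries=[{"role": "user", "content": {"text": "What are their preferences?"}}],
        where={"user_id": "123"},
    )
    print(result["items"])

asyncio.run(main())
```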


Custom LLM and Embedding Providers

MemU supports custom LLM and embedding providers beyond OpenAI. Configure them via llm_profiles:

from memu import MemUService

service = MemUService(
    llm_profiles={
        # Default profile for LLM operations
        "default": {
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "your_api_key",
            "chat_model": "qwen3-max",
            "client_backend": "sdk"  # "sdk" or "http"
        },
        # Separate profile for embeddings
        "embedding": {
            "base_url": "https://api.voyageai.com/v1",
            "api_key": "your_voyage_api_key",
            "embed_model": "voyage-3.5-lite"
        }
    },
    # ... other configuration
)
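
In this arrangement, the default profile serves chat-completion calls while the embedding profile serves vector embeddings, so chat and embedding providers can be mixed independently.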

📖 Core APIs

memorize() - Extract and Store Memory

Processes input resources and extracts structured memory:


result = await service.memorize(
    resource_url="path/to/file.json",  # File path or URL
    modality="conversation",            # conversation | document | image | video | audio
    user={"user_id": "123"}             # Optional: scope to a user
)

# Returns:
{
    "resource": {...},      # Stored resource metadata
    "items": [...],         # Extracted memory items
    "categories": [...]     # Updated category summaries
}

retrieve() - Query Memory

Retrieves relevant memory based on queries. MemU supports two retrieval strategies:


RAG-based Retrieval (method="rag")

Fast embedding vector search using cosine similarity:

- ✅ Fast: Pure vector computation
- ✅ Scalable: Efficient for large memory stores
- ✅ Returns scores: Each result includes a similarity score

LLM-based Retrieval (method="llm")

Deep semantic understanding through direct LLM reasoning:

- ✅ Deep understanding: LLM comprehends context and nuance
- ✅ Query rewriting: Automatically refines the query at each tier
- ✅ Adaptive: Stops early when sufficient information is found

Comparison

| Aspect | RAG | LLM |
| --- | --- | --- |
| Speed | ⚡ Fast | 🐢 Slower |
| Cost | 💰 Low | 💰💰 Higher |
| Semantic depth | Medium | Deep |
| Tier 2 scope | All items | Only items in relevant categories |
| Output | With similarity scores | Ranked by LLM reasoning |

Both methods support:

- Context-aware rewriting: Resolves pronouns using conversation history
- Progressive search: Categories → Items → Resources
- Sufficiency checking: Stops when enough information is retrieved
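
Since the strategy is named per call (method="rag" or method="llm", as in the headings above), a caller can trade speed for depth per request. A short sketch, reusing the retrieve() call shown under Usage below:

```python
# Fast vector search with similarity scores
rag_result = await service.retrieve(
    queries=[{"role": "user", "content": {"text": "What are their preferences?"}}],
    method="rag",
)

# Deeper semantic search with query rewriting and early stopping
llm_result = await service.retrieve(
    queries=[{"role": "user", "content": {"text": "What are their preferences?"}}],
    method="llm",
)
```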

Usage

result = await service.retrieve(
    queries=[
        {"role": "user", "content": {"text": "What are their preferences?"}},
        {"role": "user", "content": {"text": "Tell me about work habits"}}
    ],
    where={"user_id": "123"}  # Optional: scope filter
)

# Returns:
{
    "categories": [...],     # Relevant categories (with scores for RAG)
    "items": [...],          # Relevant memory items
    "resources": [...],      # Related raw resources
    "next_step_query": "..." # Rewritten query for follow-up (if applicable)
}

Scope Filtering: Use where to filter by user model fields:

  - where={"user_id": "123"} - exact match
  - where={"agent_id__in": ["1", "2"]} - match any in list
  - Omit where to retrieve across all scopes
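
For example, combining the filter syntax above with the retrieve() call:

```python
# Retrieve memories scoped to a set of agents (list-membership filter)
result = await service.retrieve(
    queries=[{"role": "user", "content": {"text": "Tell me about work habits"}}],
    where={"agent_id__in": ["1", "2"]},
)
```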

📚 For complete API documentation, see SERVICE_API.md, which covers all methods, CRUD operations, pipeline configuration, and configuration types.


💡 Use Cases

Example 1: Conversation Memory

Extract and organize memory from multi-turn conversations:

export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py

What it does:

- Processes multiple conversation JSON files
- Extracts memory items (preferences, habits, opinions, relationships)
- Generates category markdown files (preferences.md, work_life.md, etc.)

Best for: Personal AI assistants, customer support bots, social chatbots


Example 2: Skill Extraction from Logs

Extract skills and lessons learned from agent execution logs:

export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py

What it does:

- Processes agent logs sequentially
- Extracts actions, outcomes, and lessons learned
- Demonstrates incremental learning: memory evolves with each file
- Generates evolving skill guides (log_1.md → log_2.md → skill.md)

Best for: DevOps teams, agent self-improvement, knowledge management


Example 3: Multimodal Memory

Process diverse content types into unified memory:

export OPENAI_API_KEY=your_api_key
python examples/example_3_multimodal_memory.py

What it does:

- Processes documents and images together
- Extracts memory from different content types
- Unifies them into cross-modal categories (technical_documentation, visual_diagrams, etc.)

Best for: Documentation systems, learning platforms, research tools


📊 Performance

MemU achieves 92.09% average accuracy on the LoCoMo benchmark across all reasoning tasks.


View detailed experimental data: memU-experiment


🧩 Ecosystem

| Repository | Description | Use Case |
| --- | --- | --- |
| memU | Core algorithm engine | Embed AI memory into your product |
| memU-server | Backend service with CRUD, user system, RBAC | Self-host a memory backend |
| memU-ui | Visual dashboard | Ready-to-use memory console |


🤝 Partners

Ten OpenAgents Milvus xRoute Jazz Buddie Bytebase LazyLLM


📄 License

Apache License 2.0


🌍 Community


⭐ Star us on GitHub to get notified about new releases!