Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.memmachine.ai/llms.txt

Use this file to discover all available pages before exploring further.

For the full source code and advanced implementation details, see the official LlamaIndex Integration in our repository.

Overview

Integrating MemMachine with LlamaIndex provides a persistent memory layer for chat engines. This allows agents to:
  • Recall User Profiles: Surface user preferences and facts directly into the prompt context.
  • Maintain Context Across Sessions: Stored episodic and semantic memories persist beyond a single execution.
  • Intelligent Injection: MemMachine automatically injects relevant context as a system message during inference.

Configuration

You can configure the LlamaIndex memory component using environment variables or direct constructor parameters.
ParameterEnvironment VariableDefaultDescription
base_urlMEMORY_BACKEND_URLhttp://localhost:8080MemMachine server URL
org_idLLAMAINDEX_ORG_IDllamaindex_orgOrganization ID
project_idLLAMAINDEX_PROJECT_IDllamaindex_projectProject ID
user_idLLAMAINDEX_USER_IDNoneUser identifier
agent_idLLAMAINDEX_AGENT_IDNoneAgent identifier
session_idLLAMAINDEX_SESSION_IDNoneSession identifier

1

Install Dependencies

Install the core LlamaIndex framework and the updated MemMachine client:
pip install llama-index memmachine-client
2

Initialize MemMachine Memory

Import the MemMachineMemory class and configure it with your project identifiers.
from mem_machine_memory import MemMachineMemory

memory = MemMachineMemory(
    base_url="http://localhost:8080",
    org_id="my_org",
    project_id="my_project",
    user_id="user_123",
    session_id="session_456"
)
3

Build the Chat Engine

Equip your LlamaIndex SimpleChatEngine with the persistent memory instance.
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.openai import OpenAI

# Ensure your API key is configured
llm = OpenAI(api_key="your-openai-api-key")

agent = SimpleChatEngine.from_defaults(
    llm=llm, 
    memory=memory
)

# First interaction stores facts
print(agent.chat("I am Alice, I like Python programming."))

# Subsequent interactions recall them
print(agent.chat("What do you know about me?"))
Pro Tip: Tune the search_msg_limit parameter to balance the depth of recall against context window usage and latency.

Requirements

  • MemMachine Server: Must be running (default: http://localhost:8080).
  • Python: 3.12 or higher.
  • LLM: An OpenAI-compatible LLM provider.