memos online api eval scripts and readme (#403)

Nyakult · 2Rant · fridayL · web-flow · commit 5ff29d117ad8 · 2025-10-28T18:56:40.000+08:00
* feat: check nodes existence

* feat: use different template for different language input

* feat: use different template for different language input

* fix: eval script

* feat: memos-api eval scripts

* feat: mem reader

* feat: implement prefeval memos-api evaluation scripts

* refactor:format code

* feat: add PersonaMem eval scripts

* docs(evaluation): update PersonaMem eval readme

* feat:memos-api ingest batch message

* feat: refactor search

* feat: refactor search

* update: add api for memory

* feat: add memory api return memory and memory type

* refactor(server): restructure the server router module to optimize memory management

* format: ruff format code

* feat(server): increase the LLM max token count

* test

* fix: user query embedding for search

* count memory_size by user

* fix(server): fix the list unpacking issue in the memory reading logic

* feat(nebular): optimize graph database query performance

* refactor(memory):
- Removed the call to the `_refresh_memory_size` method
- Kept the original logic so it can be restored or refactored later

* feat: remove user idx_memory_user_name

* feat(graph): optimize Nebula graph database query performance

* feat: rollback remove_oldest_memory

* feat:nebula gql add index

* feat: align code

* feat: update memos_api

* feat: update memos_api

* feat: update default options

* feat:memory client

* feat:refactor lme

* feat: memu & supermemory client

* feat: locomo memu

* feat: locomo supermemory

* New 'add' and 'process' modes.

* feat: lme supermemory & memu

* feat: default args

* api and local

* api and local

* memobase fix

* memos fix

* default args

* fix memos-api search data

* prefeval pipeline

* fix lme memos-api

* personamem pipeline

* personamem pipeline

* lme scripts

* align dev

* format code

* refactor: remove old files

* format code

* pm and prefeval pipeline

* format code

* format code

* pm and prefeval pipeline

* pm and prefeval pipeline

* pm and prefeval pipeline

* format code

* format code

* pref pipeline

* add search response mode

* add search response mode

* update readme and example

* update mem0 api

* pm mem0

* fix MEMOBASE api

* update pm and prefeval pipeline for frames

* update pm and prefeval readme

* format code

* fix memobase api

* fix memobase api

* format code

* format code

* fix format

* fix format

* fix format

* mem0 api

* memos batch add

* add memos-api-online

* add memos-api-online update readme

* rollback manager

* memos online api pref mem

---------

Co-authored-by: 2Rant <junlin1105@sjtu.edu.cn>
Co-authored-by: fridayL <lcy081099@gmail.com>
Co-authored-by: CaralHsi <caralhsi@gmail.com>
diff --git a/evaluation/README.md b/evaluation/README.md
@@ -22,17 +22,32 @@ This repository provides tools and scripts for evaluating the LoCoMo dataset usi
 2. Copy the `configs-example/` directory to a new directory named `configs/`, and modify the configuration files inside it as needed. This directory contains model and API-specific settings.
 
 ## Setup MemOS
+### local server
 ```bash
-#start server
+# modify {project_dir}/.env file and start server
 uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8001 --workers 8
 
-# modify .env file
+# configure {project_dir}/evaluation/.env file
 MEMOS_URL="http://127.0.0.1:8001"
 ```
+### online service
+```bash
+# get your api key at https://memos-dashboard.openmem.net/cn/quickstart/
+# configure {project_dir}/evaluation/.env file
+MEMOS_KEY="Token mpg-xxxxx"
+MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"
+
+```
+
+## Supported frameworks
+We support `memos-api` and `memos-api-online` in our scripts.
+And give unofficial implementations for the following memory frameworks:`zep`, `mem0`, `memobase`, `supermemory`, `memu`.
+
+
 ## Evaluation Scripts
 
 ### LoCoMo Evaluation
-⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — `memos`, `mem0`, or `zep` — run the following [script](./scripts/run_locomo_eval.sh):
+⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — run the following [script](./scripts/run_locomo_eval.sh):
 
 ```bash
 # Edit the configuration in ./scripts/run_locomo_eval.sh
@@ -53,7 +68,7 @@ First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/x
 ```
 
 ### PrefEval Evaluation
-To evaluate the **Prefeval** dataset using one of the supported memory frameworks — `memos`, `mem0`, or `zep` — run the following [script](./scripts/run_prefeval_eval.sh):
+To evaluate the **Prefeval** dataset using one of the supported memory frameworks — run the following [script](./scripts/run_prefeval_eval.sh):
 
 ```bash
 # Edit the configuration in ./scripts/run_prefeval_eval.sh
diff --git a/evaluation/scripts/locomo/locomo_eval.py b/evaluation/scripts/locomo/locomo_eval.py
@@ -363,7 +363,15 @@ async def limited_task(task):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "openai", "memos-api", "memobase"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/locomo/locomo_ingestion.py b/evaluation/scripts/locomo/locomo_ingestion.py
@@ -44,26 +44,33 @@ def ingest_session(client, session, frame, version, metadata):
             speaker_a_messages.append({"role": "assistant", "content": data})
             speaker_b_messages.append({"role": "user", "content": data})
 
-    if frame == "memos-api":
+    if "memos-api" in frame:
         for m in speaker_a_messages:
             m["chat_time"] = iso_date
         for m in speaker_b_messages:
             m["chat_time"] = iso_date
-        client.add(speaker_a_messages, speaker_a_user_id, f"{conv_id}_{metadata['session_key']}")
-        client.add(speaker_b_messages, speaker_b_user_id, f"{conv_id}_{metadata['session_key']}")
+        client.add(
+            speaker_a_messages,
+            speaker_a_user_id,
+            f"{conv_id}_{metadata['session_key']}",
+            batch_size=2,
+        )
+        client.add(
+            speaker_b_messages,
+            speaker_b_user_id,
+            f"{conv_id}_{metadata['session_key']}",
+            batch_size=2,
+        )
     elif "mem0" in frame:
-        for i in range(0, len(speaker_a_messages), 2):
-            batch_messages_a = speaker_a_messages[i : i + 2]
-            batch_messages_b = speaker_b_messages[i : i + 2]
-            client.add(batch_messages_a, speaker_a_user_id, timestamp)
-            client.add(batch_messages_b, speaker_b_user_id, timestamp)
+        client.add(speaker_a_messages, speaker_a_user_id, timestamp, batch_size=2)
+        client.add(speaker_b_messages, speaker_b_user_id, timestamp, batch_size=2)
     elif frame == "memobase":
         for m in speaker_a_messages:
             m["created_at"] = iso_date
         for m in speaker_b_messages:
             m["created_at"] = iso_date
-        client.add(speaker_a_messages, speaker_a_user_id)
-        client.add(speaker_b_messages, speaker_b_user_id)
+        client.add(speaker_a_messages, speaker_a_user_id, batch_size=2)
+        client.add(speaker_b_messages, speaker_b_user_id, batch_size=2)
     elif frame == "memu":
         client.add(speaker_a_messages, speaker_a_user_id, iso_date)
         client.add(speaker_b_messages, speaker_b_user_id, iso_date)
@@ -103,6 +110,10 @@ def process_user(conv_idx, frame, locomo_df, version):
         from utils.client import MemosApiClient
 
         client = MemosApiClient()
+    elif frame == "memos-api-online":
+        from utils.client import MemosApiOnlineClient
+
+        client = MemosApiOnlineClient()
     elif frame == "memobase":
         from utils.client import MemobaseClient
 
@@ -187,7 +198,15 @@ def main(frame, version="default", num_workers=4):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
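Note on `batch_size=2`: the hunks above drop the hand-written two-message loop in the `mem0` branch and instead pass `batch_size=2` to `client.add` for the `memos-api`, `mem0`, and `memobase` frames. The client-side implementation is not visible in this part of the diff, so the following is only a sketch of what such batching might look like. `iter_batches`, `add_in_batches`, and the `send` callback are hypothetical names, not part of `utils.client`.

```python
from typing import Callable, Iterator


def iter_batches(messages: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `batch_size` messages."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(messages), batch_size):
        yield messages[start : start + batch_size]


def add_in_batches(
    send: Callable[[list[dict]], None],
    messages: list[dict],
    batch_size: int = 2,
) -> int:
    """Call `send` once per batch (hypothetical stand-in for one API request).

    Returns the number of calls made.
    """
    calls = 0
    for batch in iter_batches(messages, batch_size):
        send(batch)
        calls += 1
    return calls
```

With five messages and `batch_size=2`, this sketch issues three calls of sizes 2, 2, and 1, which matches the slicing that the removed `for i in range(0, len(...), 2)` loop performed by hand.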
diff --git a/evaluation/scripts/locomo/locomo_metric.py b/evaluation/scripts/locomo/locomo_metric.py
@@ -9,7 +9,15 @@
 parser.add_argument(
     "--lib",
     type=str,
-    choices=["mem0", "mem0_graph", "openai", "memos-api", "memobase"],
+    choices=[
+        "mem0",
+        "mem0_graph",
+        "memos-api",
+        "memos-api-online",
+        "memobase",
+        "memu",
+        "supermemory",
+    ],
     default="memos-api",
 )
 parser.add_argument(
diff --git a/evaluation/scripts/locomo/locomo_responses.py b/evaluation/scripts/locomo/locomo_responses.py
@@ -134,7 +134,15 @@ async def main(frame, version="default"):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "openai", "memos-api", "memobase"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/locomo/locomo_search.py b/evaluation/scripts/locomo/locomo_search.py
@@ -198,7 +198,7 @@ def search_query(client, query, metadata, frame, version, top_k=20):
         context, duration_ms = mem0_graph_search(
             client, query, speaker_a_user_id, speaker_b_user_id, top_k, speaker_a, speaker_b
         )
-    elif frame == "memos-api":
+    elif "memos-api" in frame:
         context, duration_ms = memos_api_search(
             client, query, speaker_a_user_id, speaker_b_user_id, top_k, speaker_a, speaker_b
         )
@@ -257,6 +257,10 @@ def process_user(conv_idx, locomo_df, frame, version, top_k=20, num_workers=1):
         from utils.client import MemosApiClient
 
         client = MemosApiClient()
+    elif frame == "memos-api-online":
+        from utils.client import MemosApiOnlineClient
+
+        client = MemosApiOnlineClient()
     elif frame == "memobase":
         from utils.client import MemobaseClient
 
@@ -336,7 +340,15 @@ def main(frame, version="default", num_workers=1, top_k=20):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
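Note on frame matching: the search and ingestion branches now test `"memos-api" in frame`, so `memos-api` and `memos-api-online` share one code path, while client construction still compares the frame with `==` to pick `MemosApiClient` or `MemosApiOnlineClient`. The sketch below illustrates that split under the assumption that the local client is selected by `frame == "memos-api"` (that condition is outside the hunk context shown here). `uses_memos_api_path` and `client_class_name` are hypothetical helpers written for illustration only.

```python
def uses_memos_api_path(frame: str) -> bool:
    """Substring test, as in the diff: true for both memos-api variants."""
    return "memos-api" in frame


def client_class_name(frame: str) -> str:
    """Exact-match selection of the client class name for a memos-api frame."""
    if frame == "memos-api":
        return "MemosApiClient"
    if frame == "memos-api-online":
        return "MemosApiOnlineClient"
    raise ValueError(f"not a memos-api frame: {frame!r}")


for frame in ("memos-api", "memos-api-online", "mem0_graph"):
    # Only the two memos-api frames take the shared search/ingestion branch.
    print(frame, uses_memos_api_path(frame))
```

The substring test keeps the shared logic in one branch, but it would also match any future frame whose name contains `memos-api`, so the exact-match client selection remains the place where an unknown variant is rejected.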
diff --git a/evaluation/scripts/longmemeval/lme_eval.py b/evaluation/scripts/longmemeval/lme_eval.py
@@ -344,7 +344,15 @@ async def main(frame, version, nlp_options, num_runs=3, num_workers=5):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
@@ -355,7 +363,7 @@ async def main(frame, version, nlp_options, num_runs=3, num_workers=5):
         type=str,
         nargs="+",
         default=["lexical"],
-        choices=["lexical", "semantic"],
+        choices=["lexical"],
         help="NLP options to use for evaluation.",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/longmemeval/lme_ingestion.py b/evaluation/scripts/longmemeval/lme_ingestion.py
@@ -18,7 +18,7 @@ def ingest_session(session, date, user_id, session_id, frame, client):
     if "mem0" in frame:
         for _idx, msg in enumerate(session):
             messages.append({"role": msg["role"], "content": msg["content"][:8000]})
-            client.add(messages, user_id, int(date.timestamp()))
+        client.add(messages, user_id, int(date.timestamp()), batch_size=2)
     elif frame == "memobase":
         for _idx, msg in enumerate(session):
             messages.append(
@@ -28,8 +28,8 @@ def ingest_session(session, date, user_id, session_id, frame, client):
                     "created_at": date.isoformat(),
                 }
             )
-        client.add(messages, user_id)
-    elif frame == "memos-api":
+        client.add(messages, user_id, batch_size=2)
+    elif "memos-api" in frame:
         for msg in session:
             messages.append(
                 {
@@ -39,7 +39,7 @@ def ingest_session(session, date, user_id, session_id, frame, client):
                 }
             )
         if messages:
-            client.add(messages=messages, user_id=user_id, conv_id=session_id)
+            client.add(messages=messages, user_id=user_id, conv_id=session_id, batch_size=2)
     elif frame == "memu":
         for _idx, msg in enumerate(session):
             messages.append({"role": msg["role"], "content": msg["content"][:8000]})
@@ -80,6 +80,10 @@ def ingest_conv(lme_df, version, conv_idx, frame, success_records, f):
         from utils.client import MemosApiClient
 
         client = MemosApiClient()
+    elif frame == "memos-api-online":
+        from utils.client import MemosApiOnlineClient
+
+        client = MemosApiOnlineClient()
     elif frame == "memobase":
         from utils.client import MemobaseClient
 
@@ -167,7 +171,15 @@ def main(frame, version, num_workers=2):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/longmemeval/lme_metric.py b/evaluation/scripts/longmemeval/lme_metric.py
@@ -258,7 +258,15 @@ def calculate_scores(data, grade_path, output_path):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/longmemeval/lme_responses.py b/evaluation/scripts/longmemeval/lme_responses.py
@@ -132,7 +132,15 @@ def main(frame, version, num_workers=4):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/longmemeval/lme_search.py b/evaluation/scripts/longmemeval/lme_search.py
@@ -123,6 +123,11 @@ def process_user(lme_df, conv_idx, frame, version, top_k=20):
 
         client = MemosApiClient()
         context, duration_ms = memos_search(client, question, user_id, top_k)
+    elif frame == "memos-api-online":
+        from utils.client import MemosApiOnlineClient
+
+        client = MemosApiOnlineClient()
+        context, duration_ms = memos_search(client, question, user_id, top_k)
     elif frame == "memu":
         from utils.client import MemuClient
 
@@ -218,7 +223,15 @@ def main(frame, version, top_k=20, num_workers=2):
     parser.add_argument(
         "--lib",
         type=str,
-        choices=["mem0", "mem0_graph", "memos-api", "memobase", "memu", "supermemory"],
+        choices=[
+            "mem0",
+            "mem0_graph",
+            "memos-api",
+            "memos-api-online",
+            "memobase",
+            "memu",
+            "supermemory",
+        ],
         default="memos-api",
     )
     parser.add_argument(
diff --git a/evaluation/scripts/utils/client.py b/evaluation/scripts/utils/client.py