fix(dspy): Example.toDict() fails to serialize dspy.History objects #9047

Copilot · 2025-11-12T04:01:38Z

Plan: Fix Example.toDict() serialization for dspy.History objects

Problem

Example.toDict() fails to serialize examples containing dspy.History objects because the convert_to_serializable function doesn't handle Pydantic models properly. This causes JSON serialization failures in Evaluate.save_as_json().

Minimal Changes Plan

Changes Made

1. Updated `Example.toDict()` in `dspy/primitives/example.py`

Simplified the pydantic import since it's a core dependency:

from pydantic import BaseModel

# In convert_to_serializable:
elif isinstance(value, BaseModel):
    # Handle Pydantic models (e.g., dspy.History)
    return value.model_dump()

Previous approach: Used try/except with BaseModel = None fallback
Current approach: Direct import since pydantic>=2.0 is a required dependency in pyproject.toml

This is cleaner and more straightforward since pydantic is always available.

2. Updated `merge_dicts()` in `dspy/evaluate/evaluate.py`

Added automatic serialization when objects have toDict() method:

# Convert to dict if objects have toDict method (e.g., Example objects)
if hasattr(d1, "toDict"):
    d1 = d1.toDict()
if hasattr(d2, "toDict"):
    d2 = d2.toDict()

This keeps the serialization logic centralized in one place and makes _prepare_results_output() cleaner.

3. Updated `_prepare_results_output()` in `dspy/evaluate/evaluate.py`

Simplified to always use toDict() without conditional checks:

else example.toDict() | {"prediction": prediction, metric_name: score}

Since Example always has toDict() method, the conditional check was unnecessary.

4. Added comprehensive tests

test_example_to_dict_with_history in tests/primitives/test_example.py - Tests basic History serialization
test_evaluate_save_as_json_with_history in tests/evaluate/test_evaluate.py - Tests end-to-end JSON export
test_evaluate_save_as_csv_with_history in tests/evaluate/test_evaluate.py - Tests CSV export with History

Testing Results

✅ All 18 tests in tests/primitives/test_example.py pass
✅ All 34 tests in tests/evaluate/test_evaluate.py pass
✅ All linting checks pass (ruff)
✅ Comprehensive manual verification passed
✅ No security vulnerabilities found (CodeQL)

Security Summary

CodeQL analysis completed with 0 alerts for Python code. No security vulnerabilities were found in the changes.

Reviewer Feedback Addressed

Consolidated History-related tests into existing test_evaluate.py file for better code organization
Moved serialization logic to merge_dicts() function for cleaner, more maintainable code
Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit and Pythonic type checking
Simplified _prepare_results_output() to always use toDict() without conditional fallback
Removed unnecessary try/except for pydantic import since it's a core dependency of dspy

Original prompt

This section details on the original issue you should resolve

<issue_title>[Bug] Example.toDict fails to serialize examples with dspy.History</issue_title>
<issue_description>### What happened?

I tried to Json serialize an Example dictionary, but it failed. The Example.toDict does not convert the History to anything.
I tried this in the save_as_json method in Evaluate.

Steps to reproduce

use the save_as_json in the Evaluate function with an Example that contains a dspy.History object.

DSPy version

3.0.4</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes [Bug] Example.toDict fails to serialize examples with dspy.History #9045

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: TomeHirata <[email protected]>

crawftv · 2025-11-12T05:36:34Z

The LLM beat me to it.

TomeHirata · 2025-11-13T04:06:23Z

dspy/evaluate/evaluate.py

        return [
            (
-                merge_dicts(example, prediction) | {metric_name: score}
+                merge_dicts(


@copilot do we still need this change after adding changes to Example?

Yes, we need both changes. While Example.toDict() now serializes History correctly, dict(example) and example.items() still return raw History objects. The original code used merge_dicts(example, prediction) which calls .items() internally, so it would still fail JSON serialization without this change.

Without the evaluate.py change: merge_dicts(example, prediction) → fails with "Object of type History is not JSON serializable"

With both changes: merge_dicts(example.toDict(), prediction) → succeeds because History is already serialized to dict

@copilot I see, can we change merge_dict to call toDict if needed?

Done in ebb502c. Moved the toDict() call into merge_dicts(), which now automatically serializes objects with toDict() method. This keeps the serialization logic centralized and makes _prepare_results_output() cleaner - it now passes Example objects directly to merge_dicts().

TomeHirata · 2025-11-13T04:06:39Z

tests/evaluate/test_evaluate_with_history.py

+from dspy.utils.dummies import DummyLM
+
+
+def test_evaluate_save_as_json_with_history():


@copilot Can we add tests to existing test_evaluate.py?

Done in c67ba33. Moved both History tests into test_evaluate.py and removed the separate test_evaluate_with_history.py file. All 34 tests pass.

Co-authored-by: TomeHirata <[email protected]>

TomeHirata · 2025-11-14T06:56:41Z

dspy/primitives/example.py

        def convert_to_serializable(value):
            if hasattr(value, "toDict"):
                return value.toDict()
+            elif hasattr(value, "model_dump"):


@copilot can we use isinstance?

Done in 2a82f8d. Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit type checking. Also added safe import handling with BaseModel = None fallback if pydantic is not available.

TomeHirata · 2025-11-14T06:58:13Z

dspy/evaluate/evaluate.py

                merge_dicts(example, prediction) | {metric_name: score}
                if prediction_is_dictlike(prediction)
-                else dict(example) | {"prediction": prediction, metric_name: score}
+                else (example.toDict() if hasattr(example, "toDict") else dict(example)) | {"prediction": prediction, metric_name: score}


@copilot is it possible to call toDict when dict() is called?

Done in 1d56c29. Simplified the code to always use example.toDict() instead of the conditional (example.toDict() if hasattr(example, "toDict") else dict(example)). Since Example always has the toDict() method, the conditional was unnecessary. This ensures consistent serialization behavior and makes the code cleaner.

Co-authored-by: TomeHirata <[email protected]>

TomeHirata · 2025-11-17T07:21:25Z

dspy/primitives/example.py

@@ -1,3 +1,9 @@
+try:


@copilot pydantic is a core dependency of dspy

Done in 9ed984c. Removed the try/except wrapper and directly import BaseModel since pydantic>=2.0 is a required dependency. Also removed the BaseModel is not None check in the isinstance condition.

…ependency Co-authored-by: TomeHirata <[email protected]>

TomeHirata

LGTM

Initial plan

597a7b4

Copilot AI assigned Copilot and TomeHirata Nov 12, 2025

Copilot started work on behalf of TomeHirata November 12, 2025 04:01 View session

Fix Example.toDict() to serialize dspy.History objects properly

e79dc12

Co-authored-by: TomeHirata <[email protected]>

Copilot AI changed the title ~~[WIP] Fix Example.toDict serialization issue with dspy.History~~ fix(dspy): Example.toDict() fails to serialize dspy.History objects Nov 12, 2025

Copilot AI requested a review from TomeHirata November 12, 2025 04:20

Copilot finished work on behalf of TomeHirata November 12, 2025 04:20

TomeHirata reviewed Nov 13, 2025

View reviewed changes

Copilot started work on behalf of TomeHirata November 13, 2025 04:06 View session

Copilot AI requested a review from TomeHirata November 13, 2025 04:11

Copilot finished work on behalf of TomeHirata November 13, 2025 04:11

Copilot started work on behalf of TomeHirata November 13, 2025 04:11 View session

Consolidate History tests into existing test_evaluate.py

c67ba33

Co-authored-by: TomeHirata <[email protected]>

Copilot finished work on behalf of TomeHirata November 13, 2025 04:17

TomeHirata marked this pull request as ready for review November 14, 2025 06:38

Copilot started work on behalf of TomeHirata November 14, 2025 06:38 View session

Refactor: move toDict() call to merge_dicts for cleaner code

ebb502c

Co-authored-by: TomeHirata <[email protected]>

Copilot finished work on behalf of TomeHirata November 14, 2025 06:46

TomeHirata reviewed Nov 14, 2025

View reviewed changes

Copilot started work on behalf of TomeHirata November 14, 2025 06:57 View session

TomeHirata reviewed Nov 14, 2025

View reviewed changes

Use isinstance(BaseModel) instead of hasattr for Pydantic check

2a82f8d

Co-authored-by: TomeHirata <[email protected]>

Copilot finished work on behalf of TomeHirata November 14, 2025 07:03

Copilot AI requested a review from TomeHirata November 14, 2025 07:03

Copilot started work on behalf of TomeHirata November 14, 2025 07:03 View session

Simplify _prepare_results_output to always use toDict()

1d56c29

Co-authored-by: TomeHirata <[email protected]>

Copilot finished work on behalf of TomeHirata November 14, 2025 07:10

TomeHirata reviewed Nov 17, 2025

View reviewed changes

Copilot started work on behalf of TomeHirata November 17, 2025 07:21 View session

Remove unnecessary try/except for pydantic import since it's a core d…

9ed984c

…ependency Co-authored-by: TomeHirata <[email protected]>

Copilot AI requested a review from TomeHirata November 17, 2025 07:27

Copilot finished work on behalf of TomeHirata November 17, 2025 07:27

TomeHirata approved these changes Nov 17, 2025

View reviewed changes

TomeHirata merged commit a5671ef into main Nov 17, 2025
24 checks passed

		from dspy.utils.dummies import DummyLM


		def test_evaluate_save_as_json_with_history():

fix(dspy): Example.toDict() fails to serialize dspy.History objects #9047

fix(dspy): Example.toDict() fails to serialize dspy.History objects #9047

Conversation

Copilot AI commented Nov 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Plan: Fix Example.toDict() serialization for dspy.History objects

Problem

Minimal Changes Plan

Changes Made

1. Updated Example.toDict() in dspy/primitives/example.py

2. Updated merge_dicts() in dspy/evaluate/evaluate.py

3. Updated _prepare_results_output() in dspy/evaluate/evaluate.py

4. Added comprehensive tests

Testing Results

Security Summary

Reviewer Feedback Addressed

Steps to reproduce

DSPy version

Comments on the Issue (you are @copilot in this section)

Uh oh!

crawftv commented Nov 12, 2025

Uh oh!

TomeHirata Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata Nov 17, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 17, 2025

Choose a reason for hiding this comment

Uh oh!

TomeHirata left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Nov 12, 2025 •

edited

Loading

1. Updated `Example.toDict()` in `dspy/primitives/example.py`

2. Updated `merge_dicts()` in `dspy/evaluate/evaluate.py`

3. Updated `_prepare_results_output()` in `dspy/evaluate/evaluate.py`