Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Nov 12, 2025

Plan: Fix Example.toDict() serialization for dspy.History objects

Problem

Example.toDict() fails to serialize examples containing dspy.History objects because the convert_to_serializable function doesn't handle Pydantic models properly. This causes JSON serialization failures in Evaluate.save_as_json().

Minimal Changes Plan

  • Explore repository and understand the issue
  • Understand current toDict() implementation in Example class
  • Identify that History is a Pydantic model with model_dump() method
  • Run existing tests to establish baseline
  • Update Example.toDict() to handle Pydantic models
  • Update Evaluate._prepare_results_output() to use toDict()
  • Create test for Example with History serialization
  • Create comprehensive tests for save_as_json/csv with History
  • Run tests to verify fix
  • Verify JSON serialization works end-to-end
  • Fix linting issues
  • Run security scan with codeql_checker - NO VULNERABILITIES FOUND
  • Comprehensive manual verification
  • Address reviewer feedback - consolidated tests into existing file
  • Refactored to move serialization logic into merge_dicts
  • Use isinstance(BaseModel) instead of hasattr for type checking
  • Simplified to always use toDict() without fallback to dict()
  • Remove unnecessary try/except since pydantic is a core dependency

Changes Made

1. Updated Example.toDict() in dspy/primitives/example.py

Simplified the pydantic import since it's a core dependency:

from pydantic import BaseModel

# In convert_to_serializable:
elif isinstance(value, BaseModel):
    # Handle Pydantic models (e.g., dspy.History)
    return value.model_dump()

Previous approach: Used try/except with BaseModel = None fallback
Current approach: Direct import since pydantic>=2.0 is a required dependency in pyproject.toml

This is cleaner and more straightforward since pydantic is always available.

2. Updated merge_dicts() in dspy/evaluate/evaluate.py

Added automatic serialization when objects have toDict() method:

# Convert to dict if objects have toDict method (e.g., Example objects)
if hasattr(d1, "toDict"):
    d1 = d1.toDict()
if hasattr(d2, "toDict"):
    d2 = d2.toDict()

This keeps the serialization logic centralized in one place and makes _prepare_results_output() cleaner.

3. Updated _prepare_results_output() in dspy/evaluate/evaluate.py

Simplified to always use toDict() without conditional checks:

else example.toDict() | {"prediction": prediction, metric_name: score}

Since Example always has toDict() method, the conditional check was unnecessary.

4. Added comprehensive tests

  • test_example_to_dict_with_history in tests/primitives/test_example.py - Tests basic History serialization
  • test_evaluate_save_as_json_with_history in tests/evaluate/test_evaluate.py - Tests end-to-end JSON export
  • test_evaluate_save_as_csv_with_history in tests/evaluate/test_evaluate.py - Tests CSV export with History

Testing Results

✅ All 18 tests in tests/primitives/test_example.py pass
✅ All 34 tests in tests/evaluate/test_evaluate.py pass
✅ All linting checks pass (ruff)
✅ Comprehensive manual verification passed
✅ No security vulnerabilities found (CodeQL)

Security Summary

CodeQL analysis completed with 0 alerts for Python code. No security vulnerabilities were found in the changes.

Reviewer Feedback Addressed

  1. Consolidated History-related tests into existing test_evaluate.py file for better code organization
  2. Moved serialization logic to merge_dicts() function for cleaner, more maintainable code
  3. Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit and Pythonic type checking
  4. Simplified _prepare_results_output() to always use toDict() without conditional fallback
  5. Removed unnecessary try/except for pydantic import since it's a core dependency of dspy
Original prompt

This section details on the original issue you should resolve

<issue_title>[Bug] Example.toDict fails to serialize examples with dspy.History</issue_title>
<issue_description>### What happened?

I tried to Json serialize an Example dictionary, but it failed. The Example.toDict does not convert the History to anything.
I tried this in the save_as_json method in Evaluate.

Steps to reproduce

use the save_as_json in the Evaluate function with an Example that contains a dspy.History object.

DSPy version

3.0.4</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Fix Example.toDict serialization issue with dspy.History fix(dspy): Example.toDict() fails to serialize dspy.History objects Nov 12, 2025
Copilot AI requested a review from TomeHirata November 12, 2025 04:20
Copilot finished work on behalf of TomeHirata November 12, 2025 04:20
@crawftv
Copy link
Contributor

crawftv commented Nov 12, 2025

The LLM beat me to it.

return [
(
merge_dicts(example, prediction) | {metric_name: score}
merge_dicts(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot do we still need this change after adding changes to Example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need both changes. While Example.toDict() now serializes History correctly, dict(example) and example.items() still return raw History objects. The original code used merge_dicts(example, prediction) which calls .items() internally, so it would still fail JSON serialization without this change.

Without the evaluate.py change: merge_dicts(example, prediction) → fails with "Object of type History is not JSON serializable"

With both changes: merge_dicts(example.toDict(), prediction) → succeeds because History is already serialized to dict

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot I see, can we change merge_dict to call toDict if needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ebb502c. Moved the toDict() call into merge_dicts(), which now automatically serializes objects with toDict() method. This keeps the serialization logic centralized and makes _prepare_results_output() cleaner - it now passes Example objects directly to merge_dicts().

from dspy.utils.dummies import DummyLM


def test_evaluate_save_as_json_with_history():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Can we add tests to existing test_evaluate.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in c67ba33. Moved both History tests into test_evaluate.py and removed the separate test_evaluate_with_history.py file. All 34 tests pass.

Copilot finished work on behalf of TomeHirata November 13, 2025 04:17
@TomeHirata TomeHirata marked this pull request as ready for review November 14, 2025 06:38
Copilot finished work on behalf of TomeHirata November 14, 2025 06:46
def convert_to_serializable(value):
if hasattr(value, "toDict"):
return value.toDict()
elif hasattr(value, "model_dump"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot can we use isinstance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 2a82f8d. Changed from hasattr(value, "model_dump") to isinstance(value, BaseModel) for more explicit type checking. Also added safe import handling with BaseModel = None fallback if pydantic is not available.

merge_dicts(example, prediction) | {metric_name: score}
if prediction_is_dictlike(prediction)
else dict(example) | {"prediction": prediction, metric_name: score}
else (example.toDict() if hasattr(example, "toDict") else dict(example)) | {"prediction": prediction, metric_name: score}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot is it possible to call toDict when dict() is called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1d56c29. Simplified the code to always use example.toDict() instead of the conditional (example.toDict() if hasattr(example, "toDict") else dict(example)). Since Example always has the toDict() method, the conditional was unnecessary. This ensures consistent serialization behavior and makes the code cleaner.

Copilot finished work on behalf of TomeHirata November 14, 2025 07:03
Copilot AI requested a review from TomeHirata November 14, 2025 07:03
Copilot finished work on behalf of TomeHirata November 14, 2025 07:10
@@ -1,3 +1,9 @@
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot pydantic is a core dependency of dspy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 9ed984c. Removed the try/except wrapper and directly import BaseModel since pydantic>=2.0 is a required dependency. Also removed the BaseModel is not None check in the isinstance condition.

Copilot AI requested a review from TomeHirata November 17, 2025 07:27
Copilot finished work on behalf of TomeHirata November 17, 2025 07:27
Copy link
Collaborator

@TomeHirata TomeHirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TomeHirata TomeHirata merged commit a5671ef into main Nov 17, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Example.toDict fails to serialize examples with dspy.History

3 participants