
Commit a22574d

add observability doc (#1878)
Signed-off-by: B-Step62 <[email protected]>
1 parent 1b4bd64 commit a22574d

File tree

4 files changed: +188 −0 lines changed


docs/docs/tutorials/index.md

Lines changed: 1 addition & 0 deletions
@@ -6,5 +6,6 @@
  * [Deployment](/tutorials/deployment/)
+ * [Debugging & Observability](/tutorials/observability/)
  We are working on upgrading more tutorials and other examples to [DSPy 2.5](https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb) from earlier DSPy versions.
Lines changed: 186 additions & 0 deletions
@@ -0,0 +1,186 @@
# Tutorial: Debugging and Observability in DSPy

This guide demonstrates how to debug problems and improve observability in DSPy. Modern AI programs often involve multiple components, such as language models, retrievers, and tools. DSPy allows you to build and optimize such complex AI systems in a clean and modular way.

However, as systems grow more sophisticated, the ability to **understand what your system is doing** becomes critical. Without transparency, the prediction process can easily become a black box, making failures and quality issues difficult to diagnose and production maintenance challenging.

By the end of this tutorial, you'll understand how to debug an issue and improve observability using [MLflow Tracing](#tracing). You'll also explore how to build a custom logging solution using callbacks.
## Define a Program
We'll start by creating a simple ReAct agent that uses ColBERTv2's Wikipedia dataset as a retrieval source. You can replace this with a more sophisticated program.
```python
import dspy
from dspy.datasets import HotPotQA

lm = dspy.LM('openai/gpt-4o-mini')
colbert = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.configure(lm=lm, rm=colbert)

agent = dspy.ReAct("question -> answer", tools=[dspy.Retrieve(k=1)])
```

Now, let's ask the agent a simple question:

```python
prediction = agent(question="Which baseball team does Shohei Ohtani play for?")
print(prediction.answer)
```

```
Shohei Ohtani plays for the Los Angeles Angels.
```

Oh, this is incorrect. He no longer plays for the Angels; he moved to the Dodgers and won the World Series in 2024! Let's debug the program and explore potential fixes.

## Using ``inspect_history``

DSPy provides the `inspect_history()` utility, which prints out all LLM invocations made so far:

```python
# Print out 5 LLM calls
dspy.inspect_history(n=5)
```

```
[2024-12-01T10:23:29.144257]

System message:

Your input fields are:
1. `question` (str)

...

Response:

[[ ## Thought_5 ## ]]
The search results continue to be unhelpful and do not provide the current team for Shohei Ohtani in Major League Baseball. I need to conclude that he plays for the Los Angeles Angels based on prior knowledge, as the searches have not yielded updated information.

[[ ## Action_5 ## ]]
Finish[Los Angeles Angels]

[[ ## completed ## ]]
```

The log reveals that the agent could not retrieve helpful information from the search tool. However, what exactly did the retriever return? While useful, `inspect_history` has some limitations:

* In real-world systems, other components like retrievers, tools, and custom modules play significant roles, but `inspect_history` only logs LLM calls.
* DSPy programs often make multiple LLM calls within a single prediction. The monolithic log history makes it hard to organize logs, especially when handling multiple questions.
* Metadata such as parameters, latency, and the relationship between modules is not captured.

**Tracing** addresses these limitations and provides a more comprehensive solution.
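
To make the last limitation concrete, here is a tiny, framework-agnostic sketch of the bookkeeping a tracer automates: recording each call's name, inputs, and latency. All names here (`MiniTracer`, `traced`) are hypothetical and for illustration only; a real tracer also links calls into a parent/child hierarchy.

```python
import time
import uuid

# Hypothetical hand-rolled tracer, illustrating the per-call metadata
# (name, inputs, latency) that a real tracing backend records for you.
class MiniTracer:
    def __init__(self):
        self.spans = []

    def traced(self, name, fn):
        def wrapper(*args, **kwargs):
            span = {"id": str(uuid.uuid4()), "name": name, "inputs": args}
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                span["latency_s"] = time.perf_counter() - start
                self.spans.append(span)
        return wrapper

tracer = MiniTracer()
retrieve = tracer.traced("retriever", lambda q: [f"passage about {q}"])
retrieve("Shohei Ohtani")
print(tracer.spans[0]["name"])  # retriever
```

Doing this by hand for every component quickly becomes unmanageable, which is exactly what automatic tracing solves.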

## Tracing

[MLflow](https://mlflow.org/docs/latest/llms/tracing/index.html) is an end-to-end machine learning platform that integrates seamlessly with DSPy to support best practices in LLMOps. Using MLflow's automatic tracing capability with DSPy is straightforward; **no sign-up or API key is required**. You just need to install MLflow and call `mlflow.dspy.autolog()` in your notebook or script.

```bash
pip install -U "mlflow>=2.18.0"
```

```python
import mlflow

mlflow.dspy.autolog()

# This is optional. Create an MLflow Experiment to store and organize your traces.
mlflow.set_experiment("DSPy")
```

Now you're all set! Let's run your agent again:

```python
agent(question="Which baseball team does Shohei Ohtani play for?")
```

MLflow automatically generates a **trace** for the prediction and records it in the experiment. To explore traces visually, launch the MLflow UI with the following command and access it in your browser:

```bash
mlflow ui --port 5000
```

![DSPy MLflow Tracing](./dspy-tracing.gif)

From the retriever step output, you can observe that it returned outdated information, indicating that Shohei Ohtani was still playing in the Japanese league, and that the final answer was based on the LLM's prior knowledge! We should update the dataset or add additional tools to ensure access to the latest information.

!!! info "Learn more about MLflow"

    MLflow is an end-to-end LLMOps platform that offers extensive features like experiment tracking, evaluation, and deployment. To learn more about DSPy and MLflow integration, visit [this tutorial](../deployment/#deploying-with-mlflow).

For example, we can add a web search capability to the agent, using the [Tavily](https://tavily.com/) web search API.

```python
from dspy.predict.react import Tool
from tavily import TavilyClient

search_client = TavilyClient(api_key="<YOUR_TAVILY_API_KEY>")

def web_search(query: str) -> list[str]:
    """Run a web search and return the content from the top 5 search results"""
    response = search_client.search(query)
    return [r["content"] for r in response["results"]]

agent = dspy.ReAct("question -> answer", tools=[Tool(web_search)])

prediction = agent(question="Which baseball team does Shohei Ohtani play for?")
print(prediction.answer)
```

```
Los Angeles Dodgers
```

## Building a Custom Logging Solution

Sometimes, you may want to implement a custom logging solution. For instance, you might need to log specific events triggered by a particular module. DSPy's **callback** mechanism supports such use cases. The ``BaseCallback`` class provides several handlers for customizing logging behavior:

|Handlers|Description|
|:--|:--|
|`on_module_start` / `on_module_end` | Triggered when a `dspy.Module` subclass is invoked. |
|`on_lm_start` / `on_lm_end` | Triggered when a `dspy.LM` subclass is invoked. |
|`on_adapter_format_start` / `on_adapter_format_end`| Triggered when a `dspy.Adapter` subclass formats the input prompt. |
|`on_adapter_parse_start` / `on_adapter_parse_end`| Triggered when a `dspy.Adapter` subclass postprocesses the output text from an LM. |

Here's an example of a custom callback that logs the intermediate steps of a ReAct agent:
149+
150+
```python
151+
import dspy
152+
from dspy.utils.callback import BaseCallback
153+
154+
# 1. Define a custom callback class that extends BaseCallback class
155+
class AgentLoggingCallback(BaseCallback):
156+
157+
# 2. Implement on_module_end handler to run a custom logging code.
158+
def on_module_end(self, call_id, outputs, exception):
159+
step = "Reasoning" if self._is_reasoning_output(outputs) else "Acting"
160+
print(f"== {step} Step ===")
161+
for k, v in outputs.items():
162+
print(f" {k}: {v}")
163+
print("\n")
164+
165+
def _is_reasoning_output(self, outputs):
166+
return any(k.startswith("Thought") for k in outputs.keys())
167+
168+
# 3. Set the callback to DSPy setting so it will be applied to program execution
169+
dspy.configure(callbacks=[AgentLoggingCallback()])
170+
```

```
=== Reasoning Step ===
Thought_1: I need to find the current team that Shohei Ohtani plays for in Major League Baseball.
Action_1: Search[Shohei Ohtani current team 2023]

=== Acting Step ===
passages: ["Shohei Otani ..."]

...
```
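
The same start/end handler pairs can capture metadata, too, such as per-module latency. Below is a minimal, DSPy-free sketch; the class name and bookkeeping are hypothetical, and the handler signatures are assumed to mirror the `BaseCallback` ones above (a real implementation would subclass `BaseCallback` instead of standing alone):

```python
import time

# Illustrative only: pairs start/end events by call_id to record latency.
class TimingCallback:
    def __init__(self):
        self._starts = {}
        self.latencies = {}

    def on_module_start(self, call_id, instance, inputs):
        self._starts[call_id] = time.perf_counter()

    def on_module_end(self, call_id, outputs, exception):
        if call_id in self._starts:
            self.latencies[call_id] = time.perf_counter() - self._starts.pop(call_id)

cb = TimingCallback()
cb.on_module_start("call-1", None, {"question": "..."})
cb.on_module_end("call-1", {"answer": "..."}, None)
print(f"call-1 took {cb.latencies['call-1']:.6f}s")
```

Keying on `call_id` is what lets this work even when modules are invoked concurrently or nested within one another.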

!!! info "Handling Inputs and Outputs in Callbacks"

    Be cautious when working with input or output data in callbacks. Mutating them in place can modify the original data passed to the program, potentially leading to unexpected behavior. To avoid this, it's strongly recommended to create a copy of the data before performing any operations that may alter it.
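
To illustrate this caution in plain Python (no DSPy required; the class and field names here are hypothetical): copying `outputs` before normalizing it for logging keeps the data the program sees intact.

```python
import copy

# Illustrative only: copy outputs before mutating them for logging purposes.
class SafeLoggingCallback:
    def on_module_end(self, call_id, outputs, exception):
        logged = copy.deepcopy(outputs)                      # work on a copy...
        logged["answer"] = logged["answer"].strip().lower()  # ...so this edit stays local
        return logged

outputs = {"answer": "  Los Angeles Dodgers  "}
logged = SafeLoggingCallback().on_module_end("call-1", outputs, None)
print(logged["answer"])   # normalized copy: "los angeles dodgers"
print(outputs["answer"])  # original data is untouched
```

Had the handler mutated `outputs` directly, every downstream consumer of the prediction would silently see the normalized string instead of the model's actual output.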

docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ nav:
      - Math Reasoning: tutorials/math/index.ipynb
      - Entity Extraction: tutorials/entity_extraction/index.ipynb
      - Deployment: tutorials/deployment/index.md
+     - Debugging & Observability: tutorials/observability/index.md
    - Community:
      - Community Resources: community/community-resources.md
      - Use Cases: community/use-cases.md
