Commit bcfabe8

claudiosv, starpit, mandel, jgchn, and vazirim authored
PDL Optimizer (#941)
* feat: remove tauri cli support for running python interpreter This also updates the tauri cli to rename runr -> run (run previously invoked the python interpreter). And adds stub support for --data, --data-file, --trace command line options (not implemented yet). Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: bump rust dependencies to resolve alert Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * test: tauri github actions test should apt install the deb This also removes the beeai compiler test in that workflow, as we now cover that in the core rust interpreter tests. Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: support for --data and --data-file in rust interpreter Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Add optimizer module Signed-off-by: Claudio Spiess <[email protected]> * Remove BAM Signed-off-by: Claudio Spiess <[email protected]> * Initial prompt library Signed-off-by: Claudio Spiess <[email protected]> * Clean up & bring in changes Signed-off-by: Claudio Spiess <[email protected]> * AST dump fixes Signed-off-by: Claudio Spiess <[email protected]> * Documentation and fixes Signed-off-by: Claudio Spiess <[email protected]> * Finish prompt lib docs Signed-off-by: Claudio Spiess <[email protected]> * Address feedback Signed-off-by: Claudio Spiess <[email protected]> * Lint Signed-off-by: Claudio Spiess <[email protected]> * fix: begin phasing in Metadata (common defs, etc. attrs) into rust AST Rather than duplicating this "metadata" logic -- the stuff common to all non-literal blocks. This just starts the migration. The main trick is to use the serde `flatten` capability, so that we can maintain a separate Metadata struct in Rust, but have it flattened into the enclosing object for serde. 
Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: update granite-io dependency (#896) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * refactor: move def attr into Metadata (rust interpreter) Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: introduce Expr typing and apply it to IfBlock.condition Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: update rust Call AST to use Expr for condition attr Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: bump to rust 2024 edition Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: continue to flesh out block metadata structure in rust Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * refactor: add metadata attr to remaining rust block asts This also adds initial scaffolding for timing, and adds a ModelBlock and ArrayBlockBuilder. Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: update rust Repeat AST to use Expr for `for` attr (#904) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * refactor: introduce Advanced enum to rust AST Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * refactor: refactor rust ast to place metadata in common struct And start populating the timing info (incomplete). 
Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: improve deserialization of python-generated model block traces Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: in rust ast, allow ModelBlock model to be an expr Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: initial pdl__id and --trace support for rust interpreter Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: update rust interpreter to create Data blocks for expr eval, and model_input trace field Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: populate trace context field in rust interpreter Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Update stop sequences in parameters (#861) * Stop sequence parameter update to align with ollama Signed-off-by: Jing Chen <[email protected]> * Update results Signed-off-by: Jing Chen <[email protected]> --------- Signed-off-by: Jing Chen <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * refactor: extract platform-generic logic from run_ollama_model() handler Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: rust interpreter was not handling pdl__context for re-runs of traces This only covers the run_model code path, but it's a start. 
Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: improve support for importing stdlib in python code blocks Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * skeleton-of-thought example (#919) Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Bump litellm and openai versions (#920) Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: improve support for rust interpreter python imports from venv Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: bump tauri and npm dependencies And leverage the new prevent overflow feature of tauri. Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: bump ui to 0.6.1 (#921) Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: rust ast support for gsm8k, including jsonl parser The program may not run 100% yet, but it parses now. Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: bump rust dependencies Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: improve support for tool calling in ollama-rs We need to send a system prompt to some models. Plus add some stubs for pydantic for python tool execution. This also includes nascent support for openai-rs (not fully funcitonal yet). Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Fully qualify import (#930) Signed-off-by: Ed Snible <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: some regex parser support for rust interpreter Split mode TODO! 
Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Change to sys.path for python code block (#931) Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * granite-io hallucination demo example and notebook (#932) Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Lint Signed-off-by: Claudio Spiess <[email protected]> * Refactor Signed-off-by: Claudio Spiess <[email protected]> * Tests Signed-off-by: Claudio Spiess <[email protected]> * Formatting Signed-off-by: Claudio Spiess <[email protected]> * Lint Signed-off-by: Claudio Spiess <[email protected]> * Skip tests for now Signed-off-by: Claudio Spiess <[email protected]> * Update schema Signed-off-by: Claudio Spiess <[email protected]> * Add contrib. prompt library (#927) * Initial prompt library Signed-off-by: Claudio Spiess <[email protected]> * Documentation and fixes Signed-off-by: Claudio Spiess <[email protected]> * Finish prompt lib docs Signed-off-by: Claudio Spiess <[email protected]> * Address feedback Signed-off-by: Claudio Spiess <[email protected]> --------- Signed-off-by: Claudio Spiess <[email protected]> * Fixed the bug where pdl.__version__ was not set (#882) It wasn't set previously because importlib searches for the distribution name and not the module's top-level name. 
Fallbacks are in place to revert to searching for 'pdl', and if that fails, it fallsback to the hardcoded version in _version.py Signed-off-by: Abi Ullattil <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * chore: update pre-commit tools (#937) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Use granite-io async interface (#936) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Lint Signed-off-by: Claudio Spiess <[email protected]> * chore: bump ui dependences Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * fix: skip failing execution tests (#938) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * independent implementation (#934) * independent implementation Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * feat: add a `parse_dict` function to `pdl_parser` (#943) Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> * Address feedback 1 Signed-off-by: Claudio Spiess <[email protected]> * Address feedback 2 Signed-off-by: Claudio Spiess <[email protected]> * Lint & fix mypy module warning Signed-off-by: Claudio Spiess <[email protected]> * Skip tests again Signed-off-by: Claudio Spiess <[email protected]> * Try to resolve wikipedia package in ci Signed-off-by: Claudio Spiess <[email protected]> * Fix rebase Signed-off-by: Claudio Spiess <[email protected]> * Fix pyproject Signed-off-by: Claudio Spiess <[email protected]> * Move multiprocess to optional Signed-off-by: Claudio Spiess <[email protected]> * Add ipython to pdl-live Signed-off-by: Claudio Spiess <[email protected]> * Fix trace metadata Signed-off-by: Claudio Spiess <[email protected]> * Add documentation Signed-off-by: Claudio Spiess <[email protected]> * Add optimizer test PDL 
Signed-off-by: Claudio Spiess <[email protected]> * Add config and expand docs Signed-off-by: Claudio Spiess <[email protected]> --------- Signed-off-by: Nick Mitchell <[email protected]> Signed-off-by: Claudio Spiess <[email protected]> Signed-off-by: Louis Mandel <[email protected]> Signed-off-by: Jing Chen <[email protected]> Signed-off-by: Mandana Vaziri <[email protected]> Signed-off-by: Ed Snible <[email protected]> Signed-off-by: Abi Ullattil <[email protected]> Co-authored-by: Nick Mitchell <[email protected]> Co-authored-by: Louis Mandel <[email protected]> Co-authored-by: Jing Chen <[email protected]> Co-authored-by: Mandana Vaziri <[email protected]> Co-authored-by: Nick Mitchell <[email protected]> Co-authored-by: Ed Snible <[email protected]> Co-authored-by: Abi Ullattil <[email protected]>
1 parent 01be02e commit bcfabe8

35 files changed: +3672 -49 lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ repos:
5656
rev: 'v1.15.0'
5757
hooks:
5858
- id: mypy
59+
args: [--explicit-package-bases]
5960
verbose: true
6061
additional_dependencies: ['types-PyYAML']
6162
# type check the Python code using pyright

contrib/prompt_library/ReAct.pdl

Lines changed: 24 additions & 23 deletions
@@ -10,26 +10,28 @@ defs:
1010
trajectory: ${ trajectory }
1111
repeat:
1212
text:
13-
- def: type
14-
text: ${ trajectory.keys()|first }
15-
contribute: []
16-
- if: ${ type == 'question'}
17-
then: |
18-
Question: ${ trajectory[type]|trim }
19-
- if: ${ type == 'task'}
20-
then: |
21-
Task: ${ trajectory[type]|trim }
22-
- if: ${ type == 'thought'}
23-
then: |
24-
Tho: ${ trajectory[type]|trim }
25-
- if: ${ type == 'action'}
26-
then: |
27-
Act: ${ trajectory[type]|trim }
28-
- if: ${ type == 'observation'}
29-
then: |
30-
Obs: ${ trajectory[type]|trim }
31-
- if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
32-
then: "${ type }: ${ trajectory[type]|trim }"
13+
- defs:
14+
type:
15+
text: ${ trajectory.keys()|first }
16+
- match: ${ type }
17+
with:
18+
- case: question
19+
then: |
20+
Question: ${ trajectory[type]|trim }
21+
- case: task
22+
then: |
23+
Task: ${ trajectory[type]|trim }
24+
- case: thought
25+
then: |
26+
Tho: ${ trajectory[type]|trim }
27+
- case: action
28+
then: |
29+
Act: ${ trajectory[type]|trim }
30+
- case: observation
31+
then: |
32+
Obs: ${ trajectory[type]|trim }
33+
- if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
34+
then: "${ type }: ${ trajectory[type]|trim }"
3335
- "\n"
3436

3537
react:
@@ -101,13 +103,12 @@ defs:
101103
then:
102104
text:
103105
- "\nObs: "
104-
- if: ${ action.name in tools }
106+
- if: ${ action.name.lower() in tools }
105107
then:
106-
call: ${ tools[action.name] }
108+
call: ${ tools[action.name.lower()] }
107109
args:
108110
arguments: ${ action.arguments }
109111
else: "Invalid action. Valid actions are ${ tool_names[:-1]|join(', ') }, and ${ tool_names[-1] }."
110-
# - "\n"
111112
else:
112113
def: exit
113114
contribute: []

contrib/prompt_library/ReWoo.pdl

Lines changed: 24 additions & 19 deletions
@@ -20,24 +20,29 @@ defs:
2020
text: ${ trajectory.keys()|first }
2121
content:
2222
text: ${ trajectory.values()|first }
23-
- if: ${ type in ['task', 'question'] }
24-
then: |-
25-
Task: ${ content|trim }
26-
- if: ${ type == 'thought'}
27-
then: |-
23+
- match: ${ type }
24+
with:
25+
- case: task
26+
then: |-
27+
Task: ${ content|trim }
28+
- case: question
29+
then: |-
30+
Task: ${ content|trim }
31+
- case: thought
32+
then: |-
2833

29-
Plan: ${ content|trim }
30-
- if: ${ type == 'action'}
31-
then:
32-
text:
33-
- " #E${ i } = ${ content|trim }"
34-
- defs:
35-
i:
36-
data: ${ i+1 }
37-
- if: ${ type == 'observation'}
38-
then: ""
39-
- if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
40-
then: "${ type }: ${ content|trim }\n"
34+
Plan: ${ content|trim }
35+
- case: action
36+
then:
37+
text:
38+
- " #E${ i } = ${ content|trim }"
39+
- defs:
40+
i:
41+
data: ${ i+1 }
42+
- case: observation
43+
then: ""
44+
- if: ${ type not in ['question', 'task', 'thought', 'action', 'observation'] }
45+
then: "${ type }: ${ content|trim }\n"
4146
- "\n"
4247

4348
rewoo:
@@ -120,9 +125,9 @@ defs:
120125
ACTION_RAW = ACTION_RAW.replace(k, v)
121126
result = ACTION_RAW
122127
tool_output:
123-
if: ${ ACTION.name in tools }
128+
if: ${ ACTION.name.lower() in tools }
124129
then:
125-
call: ${ tools[ACTION.name] }
130+
call: ${ tools[ACTION.name.lower()] }
126131
args:
127132
arguments: ${ ACTION.arguments }
128133
else: "Invalid action. Valid actions are ${ tools.keys() }"

docs/autopdl.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
1+
---
2+
hide:
3+
- navigation
4+
- toc
5+
- footer
6+
---
7+
8+
# AutoPDL Tutorial
9+
10+
The following sections show how to use the AutoPDL optimizer to produce optimized PDL programs for specific tasks.
11+
12+
To optimize a PDL program, we need the program, an optimizer configuration, a dataset, and an _evaluator_. An evaluator is a Python subclass of `OptimizerEvaluator` that scores one candidate against a single test example. A candidate is a generated configuration instance, for example a prompt pattern together with a set of few-shot demonstrations. The base class has this structure:
13+
14+
```python title="src/pdl/optimize/optimizer_evaluator.py" linenums="1"
from pathlib import Path
from threading import Thread
from typing import Any

# Program, ScopeType, and OptimizationConfig are defined in the pdl package.


class OptimizerEvaluator(Thread):
    """Evaluates a candidate (configuration, i.e. fewshots, style) against **one** test example."""

    def __init__(
        self,
        pdl_program: Program,
        example: dict,
        candidate: dict,
        index: int,
        timeout: int,
        yield_output: bool,
        config: OptimizationConfig,
        cwd: Path,
        answer_key: str = "answer",
    ) -> None:
        super().__init__()
        self.pdl_program = pdl_program
        ...

    def get_scope(self) -> ScopeType:
        """
        Constructs a PDL scope for the candidate,
        can take self.candidate and self.config into account
        """

    def extract_answer(self, document: str) -> Any:
        """
        Extracts the final answer from the PDL result document,
        i.e. the string the PDL program returns
        """

    def answer_correct(self, document: str, answer: Any, truth: Any) -> bool:
        """
        Checks the extracted answer against the ground-truth value
        in self.example[self.answer_key]
        """
```
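
As a rough illustration of the two answer hooks, here is a minimal sketch for a benchmark with numeric answers. It is written as standalone functions so that it runs without PDL installed. The regular expression, the "take the last number" rule, and the tolerance are assumptions made for this sketch and are not taken from `examples/optimizer/gsm8k_evaluator.py`.

```python
import re
from typing import Any, Optional

# Hypothetical pattern: an optionally signed number with thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_answer(document: str) -> Optional[float]:
    """Return the last number mentioned in the result document, or None."""
    matches = _NUMBER.findall(document)
    if not matches:
        return None
    # Drop thousands separators (and any trailing comma) before converting.
    return float(matches[-1].replace(",", ""))


def answer_correct(document: str, answer: Any, truth: Any) -> bool:
    """Compare the extracted answer with the ground truth numerically."""
    if answer is None:
        return False
    try:
        expected = float(str(truth).replace(",", ""))
        return abs(float(answer) - expected) < 1e-6
    except (TypeError, ValueError):
        # Non-numeric ground truth or answer counts as incorrect.
        return False


result = extract_answer("She sells 16 - 3 - 4 = 9 eggs. The answer is 18.")
print(result, answer_correct("", result, "18"))
```

In a real evaluator these would be methods on the `OptimizerEvaluator` subclass, and `truth` would come from `self.example[self.answer_key]`.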
52+
53+
Let's go through an example for `GSM8K`. Our PDL program uses different prompt patterns from the prompt library, and the variables `prompt_pattern`, `question`, `model`, and `demonstrations` are inserted at runtime by the evaluator.
54+
55+
56+
```yaml title="examples/optimizer/gsm8k.pdl" linenums="1"
--8<-- "./examples/optimizer/gsm8k.pdl"
```
59+
60+
We write a configuration file for the optimizer (see `src/pdl/optimize/config_parser.py` for all fields):
61+
62+
``` { .yaml .copy .annotate title="gsm8k_optimizer_config.yml" linenums="1" }
benchmark: gsm8k # Name our benchmark
budget: null # Set a budget, can be number of iterations, or a duration string e.g. "2h"
budget_growth: double # double validation set size each iteration
# or to_max: reach max_test_set_size by final iteration
initial_test_set_size: 2 # size of test set in first iteration
max_test_set_size: 10 # maximum test set size
num_candidates: 100 # how many candidates to evaluate
num_demonstrations: 5 # how many demonstrations to include per candidate
parallelism: 1 # how many threads to run evaluations across
shuffle_test: false # shuffling of test set
test_set_name: test # name of test set
train_set_name: train # name of train set
validation_set_name: validation # name of validation set
demonstrations_variable_name: demonstrations # variable name to insert demonstrations into
variables: # define discrete options to sample from
  model: # set ${ model } variable
    - watsonx/meta-llama/llama-3-1-8b-instruct
  prompt_pattern: # set ${ prompt_pattern } variable to one of these
    - cot
    - react
    - rewoo
  num_demonstrations: # overrides num demonstrations above
    - 0
    - 3
    - 5
```
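
To get a feel for the size of the search space that the `variables` section defines, the sketch below enumerates every combination of the discrete options and draws up to `num_candidates` of them. This is an assumption about how sampling could work, not the optimizer's actual algorithm. The helper names `enumerate_candidates` and `sample_candidates` are hypothetical, and the sketch ignores which demonstrations are drawn from the train split.

```python
import itertools
import random

# Mirrors the `variables` section of the configuration above.
variables = {
    "model": ["watsonx/meta-llama/llama-3-1-8b-instruct"],
    "prompt_pattern": ["cot", "react", "rewoo"],
    "num_demonstrations": [0, 3, 5],
}


def enumerate_candidates(variables: dict) -> list:
    """Return one dict per combination of the discrete options."""
    names = list(variables)
    option_lists = [variables[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*option_lists)]


def sample_candidates(variables: dict, num_candidates: int, seed: int = 0) -> list:
    """Draw at most num_candidates distinct combinations (hypothetical helper)."""
    space = enumerate_candidates(variables)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rng.shuffle(space)
    return space[:num_candidates]


print(len(enumerate_candidates(variables)))  # 1 model x 3 patterns x 3 shot counts
```

With the values above there are only nine distinct combinations of model, pattern, and shot count, so any additional variety among 100 candidates would have to come from the choice of demonstrations.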
89+
90+
91+
```python title="examples/optimizer/gsm8k_evaluator.py" linenums="1"
--8<-- "./examples/optimizer/gsm8k_evaluator.py"
```
94+
95+
An example script that runs the optimization process is available at `examples/optimizer/optimize.py`.
96+
Usage:
97+
98+
```
python optimize.py optimize -h
usage: optimize.py optimize [-h] --config CONFIG --dataset-path DATASET_PATH [--experiments-path EXPERIMENTS_PATH]
                            [--yield_output | --no-yield_output] [--dry | --no-dry]
                            pdl_file
```
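
The usage string above is consistent with an `argparse` subcommand that uses `BooleanOptionalAction` for the paired `--flag | --no-flag` options. The following is a reconstruction from that usage string only, not the code in `optimize.py`. The default values and the `Path` conversions are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose `optimize` subcommand matches the usage string."""
    parser = argparse.ArgumentParser(prog="optimize.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    opt = subparsers.add_parser("optimize")
    opt.add_argument("--config", type=Path, required=True)
    opt.add_argument("--dataset-path", type=Path, required=True)
    # Default is an assumption; the usage string only shows the option is optional.
    opt.add_argument("--experiments-path", type=Path, default=Path("experiments"))
    # BooleanOptionalAction (Python 3.9+) generates the --no-... variants.
    opt.add_argument("--yield_output", action=argparse.BooleanOptionalAction, default=False)
    opt.add_argument("--dry", action=argparse.BooleanOptionalAction, default=False)
    opt.add_argument("pdl_file", type=Path)
    return parser


args = build_parser().parse_args(
    ["optimize", "--config", "config.yml", "--dataset-path", "datasets/gsm8k", "gsm8k.pdl"]
)
print(args.config, args.dataset_path, args.pdl_file, args.dry)
```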
104+
105+
We also need a dataset to optimize against, with `train`, `test`, and `validation` splits. One way to produce such a dataset is with the Hugging Face Datasets functions `load_dataset` and `save_to_disk`. This example requires the columns `question`, `reasoning`, and `answer`, which can be derived from the original `openai/gsm8k` dataset. Processing scripts are under development and will follow shortly.
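
Until the official processing scripts are available, a preparation step might look roughly like the sketch below. It relies on two properties of `openai/gsm8k`: each `answer` field ends with `#### <final answer>`, and the dataset ships only `train` and `test` splits. Carving the validation split out of `train` is an assumption, as are the function names, the validation size, and the seed. `build_dataset` requires the `datasets` package and network access.

```python
def split_gsm8k_answer(raw_answer: str) -> dict:
    """Split a raw GSM8K answer into its reasoning and its final answer."""
    reasoning, _, final = raw_answer.rpartition("####")
    return {
        "reasoning": reasoning.strip(),
        "answer": final.strip().replace(",", ""),
    }


def build_dataset(output_dir: str, validation_size: int = 1000, seed: int = 0) -> None:
    """Download GSM8K, add a validation split, and save it to disk (hypothetical helper)."""
    # Imported here so the pure helper above works without `datasets` installed.
    from datasets import DatasetDict, load_dataset

    raw = load_dataset("openai/gsm8k", "main")
    # Assumption: hold out part of the train split as the validation split.
    held_out = raw["train"].train_test_split(test_size=validation_size, seed=seed)
    dataset = DatasetDict(
        {
            "train": held_out["train"],
            "validation": held_out["test"],
            "test": raw["test"],
        }
    )
    # Overwrites `answer` and adds `reasoning`; `question` is kept unchanged.
    dataset = dataset.map(lambda row: split_gsm8k_answer(row["answer"]))
    dataset.save_to_disk(output_dir)


print(split_gsm8k_answer("Natalia sold 48/2 = 24 clips in May.\n#### 72"))
```

Calling `build_dataset("datasets/gsm8k")` from `examples/optimizer` would write to the path used in the run command in this tutorial.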
106+
107+
We can run an example like so:
108+
109+
```
cd examples/optimizer
python optimize.py optimize --config config.yml --dataset-path datasets/gsm8k gsm8k.pdl
```
113+
114+
Once the process is complete, a file `optimized_gsm8k.pdl` is written. This file contains the best configuration found during the search and is directly executable by the standard PDL interpreter.

examples/optimizer/__init__.py

Whitespace-only changes.

examples/optimizer/config.yml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
benchmark: "gsm8k"
2+
initial_test_set_size: 1
3+
max_test_set_size: 1
4+
num_candidates: 5
5+
num_demonstrations: 3
6+
parallelism: 1
7+
shuffle_test: false
8+
test_set_name: "test"
9+
train_set_name: "train"
10+
timeout: 120
11+
experiment_prefix: "granite_3_8b_instruct_gsm8k_3_shot_"
12+
variables:
13+
model:
14+
- "watsonx_text/ibm/granite-3-8b-instruct"
15+
prompt_pattern:
16+
- "cot"
17+
num_demonstrations:
18+
- 3

examples/optimizer/fever.pdl

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
description: Demo of ReAct template fever
2+
defs:
3+
cot:
4+
import: ../../contrib/prompt_library/CoT
5+
react:
6+
import: ../../contrib/prompt_library/ReAct
7+
rewoo:
8+
import: ../../contrib/prompt_library/ReWoo
9+
tools:
10+
import: ../../contrib/prompt_library/tools
11+
12+
search_tools:
13+
data:
14+
- name: Search
15+
description: Search Wikipedia for a summary
16+
parameters:
17+
type: object
18+
properties:
19+
topic:
20+
type: string
21+
description: The topic of interest
22+
required:
23+
- topic
24+
25+
task: |-
26+
Task: On June 2017, the following claim was made: ${ claim }
27+
Q: Was this claim true or false?
28+
match: ${ prompt_pattern }
29+
with:
30+
# CoT
31+
- case: cot
32+
then:
33+
text:
34+
call: ${ cot.chain_of_thought }
35+
args:
36+
examples: "${ demonstrations }"
37+
question: "${ task }"
38+
model: "${ model }"
39+
40+
# ReAct
41+
- case: react
42+
then:
43+
text:
44+
call: ${ react.react }
45+
args:
46+
task: ${ task }
47+
model: ${ model }
48+
tool_schema: ${ search_tools }
49+
tools: ${ tools.tools }
50+
trajectories: ${ demonstrations }
51+
52+
# ReWOO
53+
- case: rewoo
54+
then:
55+
text:
56+
call: ${ rewoo.rewoo }
57+
args:
58+
task: ${ task }
59+
model: ${ model }
60+
tool_schema: ${ search_tools }
61+
tools: ${ tools.tools }
62+
trajectories: ${ demonstrations }
63+
show_plans: false
