wrap braintrust to get llm usage data #637
🦋 Changeset detected. Latest commit: 142a8af. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
constructor(error?: unknown) {
  if (error instanceof Error || error instanceof StagehandError) {
    super(
      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n\nFull error:\n${error.message}`,
helps us see the stacktrace when StagehandDefaultError is thrown
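A minimal sketch of one way to keep the original stack visible when wrapping an error. `DefaultErrorSketch` is a hypothetical stand-in, not the actual `StagehandDefaultError` from this PR, and appending to `stack` is just one possible approach.

```typescript
// Hypothetical stand-in for the error class discussed in this thread.
class DefaultErrorSketch extends Error {
  constructor(error?: unknown) {
    // Work out the message first so super() is called exactly once on every path.
    const original = error instanceof Error ? error : undefined;
    super(`Full error:\n${original ? original.message : String(error)}`);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "DefaultErrorSketch";
    // Append the original stack so the root cause stays visible in logs.
    if (original && original.stack) {
      this.stack = `${this.stack ?? ""}\nCaused by: ${original.stack}`;
    }
  }
}
```

Logging `new DefaultErrorSketch(err).stack` then shows both the wrapper's stack and the stack of `err`.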
@@ -3,7 +3,7 @@ import dotenv from "dotenv";
 dotenv.config();

 const StagehandConfig: ConstructorParams = {
-  verbose: 1 /* Verbosity level for logging: 0 = silent, 1 = info, 2 = all */,
+  verbose: 2 /* Verbosity level for logging: 0 = silent, 1 = info, 2 = all */,
can change back but nice to have
cd6e068 to 3bc4a7c
"@browserbasehq/stagehand": patch
---

Fix: forward along the stack trace in StagehandDefaultError
now it's throwing the error multiple times, need a fast follow PR to only throw StagehandDefaultError once
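One possible shape for that follow-up, sketched under assumptions: wrap at each layer through a helper that passes an already-wrapped error through unchanged. `WrappedOnceError`, `wrapOnce`, `innerStep`, and `outerStep` are hypothetical names, not code from this repository.

```typescript
// Hypothetical wrapper error, standing in for the class discussed above.
class WrappedOnceError extends Error {
  constructor(error?: unknown) {
    super(error instanceof Error ? error.message : String(error));
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "WrappedOnceError";
  }
}

// Wrap an unknown failure exactly once: an already-wrapped error is returned as-is.
function wrapOnce(error: unknown): WrappedOnceError {
  return error instanceof WrappedOnceError ? error : new WrappedOnceError(error);
}

// Inner layer: the first place the failure is caught and wrapped.
function innerStep(): void {
  try {
    throw new Error("element not found");
  } catch (e) {
    throw wrapOnce(e);
  }
}

// Outer layer: catches again, but wrapOnce returns the same instance.
function outerStep(): void {
  try {
    innerStep();
  } catch (e) {
    throw wrapOnce(e);
  }
}
```

With this guard, the message and stack come from the first wrap only, no matter how many layers catch and rethrow.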
@@ -1,19 +1,22 @@
import { Stagehand } from "@/dist";
Currently not running these in CI. Will add in a fast-follow PR
what is the run command for these? what is the benefit of taking them out of the tasks directory? Also make sure you remove the step from CI, otherwise it will keep failing
evals/args.ts
Outdated
@@ -66,6 +66,8 @@ const DEFAULT_EVAL_CATEGORIES = process.env.EVAL_CATEGORIES
       "regression_llm_providers",
       "regression_text_extract",
       "regression_dom_extract",
+      "llm_clients",
+      "unit",
currently running neither in CI, will add in a fast-follow PR
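For context, a rough sketch of how an environment override with a default category list might be resolved. The diff only shows the start of the `DEFAULT_EVAL_CATEGORIES` expression, so the comma-separated parsing and the `resolveEvalCategories` helper are assumptions, not the code in `evals/args.ts`.

```typescript
// Hypothetical helper: use the override when present, otherwise the defaults.
function resolveEvalCategories(
  raw: string | undefined,
  defaults: readonly string[],
): string[] {
  // No override (unset or blank): fall back to a copy of the defaults.
  if (raw === undefined || raw.trim() === "") {
    return [...defaults];
  }
  // Assumed format: comma-separated names, surrounding whitespace ignored.
  return raw
    .split(",")
    .map((category) => category.trim())
    .filter((category) => category.length > 0);
}
```

A caller would pass the value of the `EVAL_CATEGORIES` environment variable as `raw` and the list above as `defaults`.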
PR Summary
This PR centralizes Stagehand initialization to support Braintrust LLM metrics and standardizes error logging across evaluations.
- /evals/evals.config.json: Reassigned evaluation categories and removed obsolete tasks.
- /evals/initStagehand.ts: Removed modelName support; now requires a pre-initialized llmClient.
- /evals/index.eval.ts: Wrapped LLM client initialization with Braintrust proxy and unified error forwarding.
- /evals/tasks/*: All tasks now accept an externally provided stagehand instance (plus debugUrl/sessionUrl), eliminating internal initStagehand calls.
- /evals/logger.ts & /lib/StagehandPage.ts: Improved error logging with full stack trace and message forwarding.
88 file(s) reviewed, 1 comment(s)
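The pattern in the summary (wrap the LLM client once, then hand it to each task) can be sketched roughly as below. This is not Braintrust's API or Stagehand's `LLMClient` type: `LLMClientSketch`, `withUsageTracking`, and `runTask` are hypothetical names used only to illustrate the idea.

```typescript
// Hypothetical shapes for illustration only.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

interface ChatResult {
  text: string;
  usage: Usage;
}

interface LLMClientSketch {
  complete(prompt: string): Promise<ChatResult>;
}

// Wrap a client so the token usage of every call is reported to a callback.
function withUsageTracking(
  client: LLMClientSketch,
  onUsage: (usage: Usage) => void,
): LLMClientSketch {
  return {
    async complete(prompt: string): Promise<ChatResult> {
      const result = await client.complete(prompt);
      onUsage(result.usage);
      return result;
    },
  };
}

// A task receives its client from the caller instead of constructing one itself.
async function runTask(client: LLMClientSketch): Promise<string> {
  const { text } = await client.complete("extract the page title");
  return text;
}
```

Because the task never builds its own client, the caller decides in one place whether calls are tracked.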
constructor(error?: unknown) {
  if (error instanceof Error || error instanceof StagehandError) {
    super(
      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n\nFull error:\n${error.message}`,
    );
  }
}
logic: Ensure the constructor always calls super, even when error is undefined or not an instance of Error.
constructor(error?: unknown) {
  if (error instanceof Error || error instanceof StagehandError) {
    super(
      `\nHey! We're sorry you ran into an error. \nIf you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack\n\nFull error:\n${error.message}`,
    );
  } else {
    super('An unknown error occurred. If you need help, please open a Github issue or reach out to us on Slack: https://stagehand.dev/slack');
  }
}
evals/initStagehand.ts
Outdated
  sessionUrl: string;
  useTextExtract: boolean;
  stagehandConfig: ConstructorParams;
};
Move StagehandInitResult to types/evals.ts
why delete?
Co-authored-by: Sean McGuire <[email protected]>
2801646 to 78a21b1
why
what changed
unit CI test for testing core features in act
initStagehand in each eval
test plan
this is it