25 changes: 3 additions & 22 deletions src/cleanlab_tlm/utils/chat.py
@@ -755,32 +755,14 @@ def _responses_messages_to_string(messages: list[dict[str, Any]]) -> str:
 )

 if message["action"]["type"] == "search":
Contributor:

{'id': 'msg_0dcf5fddf778d0a40068f295f3e31c819bbb7e6bfa4c8362d5',
 'content': [{'annotations': [{'end_index': 401,
     'start_index': 298,
     'title': 'New Toyota Innova Crysta Review (2.8-AT and 2.4-MT) | Motoroids',
     'type': 'url_citation',
     'url': 'https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/'},
    {'end_index': 1047,
     'start_index': 921,
     'title': 'Mahindra XUV700 Review - Team-BHP',
     'type': 'url_citation',
     'url': 'https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai'},
    {'end_index': 1400,
     'start_index': 1297,
     'title': 'New Toyota Innova Crysta Review (2.8-AT and 2.4-MT) | Motoroids',
     'type': 'url_citation',
     'url': 'https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/'},
    {'end_index': 1648,
     'start_index': 1517,
     'title': 'Toyota Innova Crysta ownership: Observation after 2 years & 20,000 km | Team-BHP',
     'type': 'url_citation',
     'url': 'https://www.team-bhp.com/news/toyota-innova-crysta-ownership-observation-after-2-years-20000-km?utm_source=openai'},
    {'end_index': 1978,
     'start_index': 1852,
     'title': 'Mahindra XUV700 Review - Team-BHP',
     'type': 'url_citation',
     'url': 'https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai'},
    {'end_index': 2317,
     'start_index': 2232,
     'title': 'Mahindra is Mahindra',
     'type': 'url_citation',
     'url': 'https://www.reddit.com/r/CarsIndia/comments/175g7ai?utm_source=openai'},
    {'end_index': 2620,
     'start_index': 2517,
     'title': 'New Toyota Innova Crysta Review (2.8-AT and 2.4-MT) | Motoroids',
     'type': 'url_citation',
     'url': 'https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/'},
    {'end_index': 3047,
     'start_index': 2921,
     'title': 'Mahindra XUV700 Review - Team-BHP',
     'type': 'url_citation',
     'url': 'https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai'}],
   'text': 'Short summary (quick answer)\n- Most professional reviews and long‑term owner reports say the Innova Crysta diesel is the more refined/quieter car in day‑to‑day and long‑distance use — diesel clatter is well contained at idle/cruise, but the engine becomes noticeably audible when you push it hard. ([motoroids.com](https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/))  \n- The XUV700 (diesel) is praised for performance, features and overall composure, but its NVH record is mixed: professional tests find acceptable refinement at cruise, yet many owners and ownership threads report rattles, suspension thuds and door/trim vibrations (i.e., perceived poorer cabin fit‑and‑finish vs the Innova). If NVH/long‑haul serenity is the priority, reviewers/owners generally prefer the Innova; if you want features/performance and can tolerate (or fix) some cabin noise, the XUV700 is compelling. ([team-bhp.com](https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai))\n\nWhat the reviews say (key points + sources)\n- Innova Crysta (diesel)\n  - “Feels like a tomb” at idle / cruise — diesel clatter well contained; excellent highway ride and isolation; engine noise rises under heavy throttle. (first drives & reviews). ([motoroids.com](https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/))  \n  - Long‑term/owner writeups confirm improved NVH over older Innova generations and strong long‑distance comfort. ([team-bhp.com](https://www.team-bhp.com/news/toyota-innova-crysta-ownership-observation-after-2-years-20000-km?utm_source=openai))\n\n- XUV700 (diesel)\n  - Professional reviews note generally good refinement for the 2.0 engines and a composed ride, but also mention wind/road noise at higher speeds and a “trucky” feel in some reviews. 
([team-bhp.com](https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai))  \n  - Numerous owner/ownership threads report rattles, creaks, suspension thuds and occasional audible vibrations (especially on rough roads or from door panels) — these real‑world reports are the main reason XUV700’s NVH is called “mixed” in practice. ([reddit.com](https://www.reddit.com/r/CarsIndia/comments/175g7ai?utm_source=openai))\n\nBottom line / recommendation\n- If your priority is the quietest, most comfortable diesel cabin for long trips and you want proven long‑term refinement, the Innova Crysta (diesel) is the safer pick. ([motoroids.com](https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/))  \n- If you prioritise tech, performance and value but can accept / mitigate some cabin rattles (or are ready to apply aftermarket sound‑deadening), the XUV700 offers more features and punch — just check the specific car carefully at delivery and look for trim/rattle issues on a thorough test drive. ([team-bhp.com](https://www.team-bhp.com/forum/official-new-car-reviews/240680-mahindra-xuv700-review.html?utm_source=openai))\n\nIf you want I can:\n- Pull the most recent head‑to‑head reviews/videos (with exact publish dates) from autosites and YouTube and list them (so you can watch/compare NVH samples).  \n- Compile a short table of quoted NVH observations (idle, cruise, under acceleration, trim rattles) with exact publication dates and direct links.  \n\nWhich would you prefer?',
   'type': 'output_text',
   'logprobs': []}],
 'role': 'assistant',
 'status': 'completed',
 'type': 'message'}

Could you cover this?
Code:

from openai import OpenAI

client = OpenAI()

response5 = client.responses.create(
    model="gpt-5-mini",
    input=[{"role": "user", "content": "Find recent reviews comparing Innova Crysta vs XUV700 diesel NVH and cite sources."}],
    tools=[{"type": "web_search"}],
    instructions="Use web search and cite sources."
)

wds = form_response_string_responses_api(response5)

The response has:

'content': [{'annotations': [{'end_index': 401,
'start_index': 298,
'title': 'New Toyota Innova Crysta Review (2.8-AT and 2.4-MT) | Motoroids',
'type': 'url_citation',
'url': 'https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/'},

type: url_citation

@aditya1503 (Contributor), Oct 17, 2025:

[Screenshot] These URLs are not being detected in the web search call, regardless of whether 4.1 or 5 mini is used.

Contributor Author:

@aditya1503 You forgot to pass include=["web_search_call.action.sources"] in your client.responses.create call.
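For reference, a minimal sketch of the request from the earlier comment with the include entry added. The helper name build_web_search_request is hypothetical and exists only for illustration; the model, prompt, and tool entry are copied from the snippet in the review comment. The helper only builds the keyword arguments, so it runs without network access.

```python
from typing import Any


def build_web_search_request(prompt: str, model: str = "gpt-5-mini") -> dict[str, Any]:
    """Build keyword arguments for client.responses.create (hypothetical helper)."""
    return {
        "model": model,
        "input": [{"role": "user", "content": prompt}],
        "tools": [{"type": "web_search"}],
        # Per the reply above, sources must be requested explicitly for the
        # web_search_call items to carry action.sources.
        "include": ["web_search_call.action.sources"],
    }


request_kwargs = build_web_search_request(
    "Find recent reviews comparing Innova Crysta vs XUV700 diesel NVH and cite sources."
)
# With the openai package installed and an API key configured:
#   response = OpenAI().responses.create(**request_kwargs)
```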

@jwmueller (Member), Oct 30, 2025:

@Tonyhrule Are you able to raise an informative error message when that happens? It will probably happen often, especially given that only one person has ever tried your code and they forgot to add it.
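One way such a check could look, as a sketch rather than the PR's actual implementation. The function name get_search_source_urls and the message text are assumptions; the only thing taken from the diff is that URLs are read from message["action"]["sources"] and filtered on type "url". The sketch treats a missing or None "sources" entry as the sign that the include option was omitted.

```python
from typing import Any

# Hypothetical wording; the PR does not specify an error message.
MISSING_SOURCES_HINT = (
    'Web search call has no "sources". Pass '
    'include=["web_search_call.action.sources"] when creating the response.'
)


def get_search_source_urls(message: dict[str, Any]) -> list[str]:
    """Return the URL sources of a web_search_call message, or raise a descriptive error."""
    action = message.get("action", {})
    sources = action.get("sources")
    if sources is None:
        # Covers both a missing key and an explicit None value.
        raise ValueError(MISSING_SOURCES_HINT)
    return [source["url"] for source in sources if source.get("type") == "url"]
```

A caller that forgot the include option would then see the hint instead of a bare KeyError.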

-    if "annotations" in message:
-        annotations = message["annotations"]
-    else:
-        next_text_message = adjusted_messages[i + 1]
-        if next_text_message["type"] != "message":
-            continue
-        next_text_content = next_text_message["content"][0]
-        if next_text_content["type"] != "output_text":
-            continue
-        annotations = next_text_content["annotations"]
-
-    urls = list(
-        {
-            (annotation["url"], annotation["title"])
-            for annotation in annotations
-            if annotation["type"] == "url_citation"
-        }
-    )
+    urls = list({source["url"] for source in message["action"]["sources"] if source["type"] == "url"})
Member:

Could you confirm that you tested this update on both gpt-4.1 series and gpt-5 series models?

Contributor Author:


Yes
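A side note on the added urls line: it deduplicates through a set, so the order of the resulting list is not guaranteed. If a stable order matters (an assumption; the PR does not say it does), dict.fromkeys gives the same deduplication while keeping first-seen order. The name unique_source_urls is hypothetical, and the sample URLs are the ones cited in the review comment.

```python
from typing import Any


def unique_source_urls(sources: list[dict[str, Any]]) -> list[str]:
    """Deduplicate URL-type sources while preserving first-seen order."""
    return list(dict.fromkeys(s["url"] for s in sources if s["type"] == "url"))


sample_sources = [
    {"type": "url", "url": "https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/"},
    {"type": "url", "url": "https://www.reddit.com/r/CarsIndia/comments/175g7ai?utm_source=openai"},
    {"type": "url", "url": "https://www.motoroids.com/reviews/new-toyota-innova-crysta-review-2-8-at-and-2-4-mt/"},
]
deduplicated = unique_source_urls(sample_sources)
```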


 with ThreadPoolExecutor() as executor:

-    def extract_text(pair: tuple[str, str]) -> str:
+    def extract_text(url: str) -> str:
         fallback_text = "Response is not shown, but the LLM can still access it. Assume that whatever the LLM references in this URL is true."

         try:
-            url = pair[0]
             if url in _url_cache:
                 return _url_cache[url]


@@ -814,10 +796,9 @@ def extract_text(pair: tuple[str, str]) -> str:
 websites = [
     {
         "url": url,
-        "title": title,
         "content": data,
     }
-    for (url, title), data in zip(urls, requests)
+    for url, data in zip(urls, requests)
 ]

 tool_call = {
14 changes: 13 additions & 1 deletion tests/test_chat.py
@@ -11,6 +11,7 @@
 )
 from openai.types.responses.response_function_web_search import (
     ActionSearch,
+    ActionSearchSource,
     ResponseFunctionWebSearch,
 )
 from openai.types.responses.response_output_message import ResponseOutputMessage
@@ -1308,6 +1309,12 @@ def test_form_prompt_string_responses_web_search() -> None:
 action=ActionSearch(
     query="Give me a positive news story from today",
     type="search",
+    sources=[
+        ActionSearchSource(
+            type="url",
+            url="https://www.podego.com/insights/august-2025-good-news-ai-pfas-stories?utm_source=openai",
+        ),
+    ],
 ),
 status="completed",
 type="web_search_call",
@@ -1394,7 +1401,6 @@ def test_form_prompt_string_responses_web_search() -> None:
 "output": [
     {
         "url": "https://www.podego.com/insights/august-2025-good-news-ai-pfas-stories?utm_source=openai",
-        "title": "Positive News Highlights | AI, PFAS Breakthroughs & More \\u2014 August 2025 \\u2014 Podego",
         "content": "MOCK CONTENT"
     }
 ]
@@ -2342,6 +2348,12 @@ def test_form_response_string_responses_web_search() -> None:
 action=ActionSearch(
     query="Give me a positive news story from today",
     type="search",
+    sources=[
+        ActionSearchSource(
+            type="url",
+            url="https://www.podego.com/insights/august-2025-good-news-ai-pfas-stories?utm_source=openai",
+        ),
+    ],
 ),
 status="completed",
 type="web_search_call",