
Conversation

@gn00295120 (Contributor) commented Oct 21, 2025

Fixes #889

Summary

When an input guardrail triggers a tripwire, tools (especially hosted tools like FileSearchTool) should not execute. Previously, input guardrails ran in parallel with model requests, creating a race condition where tools could execute before guardrails completed.

Changes

Modified src/agents/run.py to execute input guardrails sequentially before model requests (see the sketch after this list):

  1. Run input guardrails first and wait for completion
  2. If any guardrail triggers, raise InputGuardrailTripwireTriggered
  3. Only proceed with model request if all guardrails pass
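
For illustration, a minimal, self-contained sketch of that ordering (all names below are hypothetical and simplified; this is not the actual src/agents/run.py code):

import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class GuardrailOutcome:
    tripwire_triggered: bool
    reason: str = ""

class TripwireTriggered(Exception):
    """Stand-in for InputGuardrailTripwireTriggered in this sketch."""

async def run_turn(
    guardrails: list[Callable[[str], Awaitable[GuardrailOutcome]]],
    send_model_request: Callable[[str], Awaitable[str]],
    user_input: str,
) -> str:
    # 1. Run all input guardrails and wait for every result.
    outcomes = await asyncio.gather(*(g(user_input) for g in guardrails))

    # 2. If any guardrail tripped, stop before any model traffic is generated.
    for outcome in outcomes:
        if outcome.tripwire_triggered:
            raise TripwireTriggered(outcome.reason)

    # 3. Only now send the model request (which may invoke hosted tools server-side).
    return await send_model_request(user_input)

The guardrails can still run concurrently with one another; the key change is that the model request is not started until all of them have returned.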

Benefits

  • No model requests sent if guardrails trigger
  • Hosted tools (FileSearchTool) don't execute unnecessarily
  • Token consumption prevented when input is blocked
  • Slow guardrails (e.g., calling external services) work correctly

Performance Impact

This change introduces sequential execution instead of parallel execution:

Before (parallel):

Total time = max(guardrail_time, model_request_time)
Example: max(50ms, 500ms) = 500ms

After (sequential):

Total time = guardrail_time + model_request_time
Example: 50ms + 500ms = 550ms (+10%)
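
The arithmetic above can be reproduced with a toy asyncio timing script (illustrative only; the 50ms and 500ms delays are the example figures, not measurements of the SDK):

import asyncio
import time

async def guardrail_check():
    await asyncio.sleep(0.05)   # ~50ms guardrail

async def model_request():
    await asyncio.sleep(0.5)    # ~500ms model round-trip

async def main():
    start = time.perf_counter()
    await asyncio.gather(guardrail_check(), model_request())   # old: parallel
    print(f"parallel:   {time.perf_counter() - start:.3f}s")   # ~0.50s

    start = time.perf_counter()
    await guardrail_check()                                     # new: sequential
    await model_request()
    print(f"sequential: {time.perf_counter() - start:.3f}s")   # ~0.55s

asyncio.run(main())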

Trade-off Analysis:

Typical impact is small (2-5% latency increase):

  • Most guardrails are fast: 10-50ms (regex, keyword checks, simple logic)
  • Model requests dominate: 500-2000ms (API round-trip + LLM generation)
  • Added latency: ~10-50ms

For slow guardrails (200-500ms, e.g., external API calls), impact is larger (10-25%).

Why sequential execution is necessary:

  • Hosted tools (like FileSearchTool) execute on OpenAI servers
  • Once the API request is sent, it cannot be cancelled
  • Parallel execution causes token consumption even when guardrails block the input
  • The only correct solution is to not send the request until guardrails pass (see the sketch below)
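
A small standalone script shows the race this list describes: with parallel execution, the request side effect has already fired by the time a slow guardrail trips (names and delays are illustrative, not SDK code):

import asyncio

request_sent = False

async def fake_model_request():
    global request_sent
    request_sent = True          # past this point, a real HTTP request would already be on the wire
    await asyncio.sleep(0.5)
    return "model output"

async def slow_guardrail():
    await asyncio.sleep(0.1)     # e.g. an external moderation service
    raise RuntimeError("tripwire triggered")

async def main():
    try:
        # Parallel execution (the old behavior, in sketch form).
        await asyncio.gather(slow_guardrail(), fake_model_request())
    except RuntimeError:
        pass
    print(f"request sent despite tripwire: {request_sent}")     # prints True

asyncio.run(main())

Awaiting the guardrail before creating the model-request coroutine is the only ordering that keeps request_sent False in this sketch.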

Cost savings outweigh latency:

  • Prevents unnecessary token consumption when input is blocked
  • Ensures guardrails work correctly in all cases (fast and slow)
  • Prioritizes correctness and reliability over marginal performance gains

Testing

Comprehensive test file attached: tests/test_issue_889_guardrail_tool_execution.py

Tests verify:

  • Fast guardrails prevent model execution
  • Slow guardrails prevent model execution (race condition fix)
  • Multiple guardrails work correctly
  • Normal flow (guardrails pass) still works

All existing tests pass (810 tests).

Fixes openai#889

When an input guardrail triggers a tripwire, tools (especially hosted
tools like FileSearchTool) should not execute. Previously, input
guardrails ran in parallel with model requests, creating a race
condition where tools could execute before guardrails completed.

This change makes input guardrail execution sequential:
1. Run input guardrails first and wait for completion
2. If any guardrail triggers, raise InputGuardrailTripwireTriggered
3. Only proceed with model request if all guardrails pass

This ensures that:
- No model requests are sent if guardrails trigger
- Hosted tools (FileSearchTool) don't execute unnecessarily
- Token consumption is prevented when input is blocked
- Slow guardrails (e.g., calling external services) work correctly
@gn00295120 (Contributor, Author) commented:

Test File

Attaching comprehensive test file for Issue #889:

File: tests/test_issue_889_guardrail_tool_execution.py

This test file includes 4 test cases that verify the fix:

  1. test_input_guardrail_prevents_model_request_when_triggered - Core test verifying model requests are blocked when guardrails trigger
  2. test_input_guardrail_allows_model_request_when_safe - Baseline test ensuring normal flow still works
  3. test_multiple_input_guardrails_block_on_first_trigger - Tests multiple guardrails scenario
  4. test_slow_guardrail_still_blocks_model_request - Critical test for race condition with slow guardrails (this test failed before the fix, passes after)

All tests pass after the fix. Test #4 specifically demonstrates the race condition bug and verifies the sequential execution fix.

@gn00295120 (Contributor, Author) commented:

"""
Tests for Issue #889: FileSearchTool runs despite InputGuardrailTripwireTriggered

When an input guardrail triggers a tripwire, tools (especially hosted tools like FileSearchTool)
should not execute at all. Currently, there's a race condition where the model request is sent
in parallel with input guardrail checks, causing tools to execute before guardrails complete.

Issue: #889
"""

from __future__ import annotations

import asyncio
from typing import Any

import pytest

from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrail,
    RunContextWrapper,
    Runner,
    TResponseInputItem,
)
from agents.exceptions import InputGuardrailTripwireTriggered

from .fake_model import FakeModel
from .test_responses import get_text_message

class TestInputGuardrailBlocksToolExecution:
"""Test that input guardrails completely block tool execution when triggered."""

@pytest.fixture
def triggered_guardrail(self):
    """Guardrail that always triggers"""
    def guardrail_func(
        context: RunContextWrapper[Any],
        agent: Agent[Any],
        input: str | list[TResponseInputItem]
    ):
        return GuardrailFunctionOutput(
            output_info={"reason": "Input blocked by guardrail"},
            tripwire_triggered=True
        )
    return InputGuardrail(guardrail_function=guardrail_func, name="block_all")

@pytest.fixture
def safe_guardrail(self):
    """Guardrail that never triggers"""
    def guardrail_func(
        context: RunContextWrapper[Any],
        agent: Agent[Any],
        input: str | list[TResponseInputItem]
    ):
        return GuardrailFunctionOutput(
            output_info={"reason": "Input is safe"},
            tripwire_triggered=False
        )
    return InputGuardrail(guardrail_function=guardrail_func, name="allow_all")

@pytest.mark.asyncio
async def test_input_guardrail_prevents_model_request_when_triggered(
    self, triggered_guardrail
):
    """
    CORE TEST: When input guardrail triggers, no model request should be sent.

    Current behavior (BROKEN):
    - Input guardrail runs in parallel with model request
    - Model request is sent immediately
    - Hosted tools (FileSearchTool) execute on OpenAI servers
    - Tokens are consumed even though guardrail triggered

    Expected behavior (FIX):
    - Input guardrail runs first
    - If triggered, model request is never sent
    - No tools execute (hosted or local)
    - No tokens consumed
    """
    # Create a fake model that tracks if get_response was called
    model = FakeModel()
    model.set_next_output([get_text_message("This should not be returned")])

    agent = Agent(
        name="test_agent",
        instructions="Test agent",
        model=model,
        input_guardrails=[triggered_guardrail]
    )

    # Run should raise InputGuardrailTripwireTriggered
    with pytest.raises(InputGuardrailTripwireTriggered):
        await Runner.run(
            starting_agent=agent,
            input="Test input that should be blocked"
        )

    # CRITICAL: Model's get_response should never be called
    # The output should still be in the queue (not consumed)
    assert len(model.turn_outputs) == 1, (
        "Model request was sent even though input guardrail triggered! "
        "This causes unnecessary token consumption and tool execution."
    )

@pytest.mark.asyncio
async def test_input_guardrail_allows_model_request_when_safe(
    self, safe_guardrail
):
    """
    BASELINE TEST: When input guardrail doesn't trigger, model request should proceed.

    This ensures our fix doesn't break the normal flow.
    """
    model = FakeModel()
    model.set_next_output([get_text_message("Safe response")])

    agent = Agent(
        name="test_agent",
        instructions="Test agent",
        model=model,
        input_guardrails=[safe_guardrail]
    )

    # Should complete without raising exception
    result = await Runner.run(
        starting_agent=agent,
        input="Test input that is safe"
    )

    # Model request SHOULD be sent when guardrail passes
    # The output should be consumed (queue should be empty)
    assert len(model.turn_outputs) == 0, (
        "Model request was not sent even though input guardrail passed! "
        "Our fix broke the normal flow."
    )
    assert result.final_output == "Safe response"

@pytest.mark.asyncio
async def test_multiple_input_guardrails_block_on_first_trigger(
    self, triggered_guardrail, safe_guardrail
):
    """
    Test that if any input guardrail triggers, execution stops immediately.
    """
    model = FakeModel()
    model.set_next_output([get_text_message("Should not be returned")])

    agent = Agent(
        name="test_agent",
        instructions="Test agent",
        model=model,
        input_guardrails=[safe_guardrail, triggered_guardrail]  # Second one triggers
    )

    with pytest.raises(InputGuardrailTripwireTriggered):
        await Runner.run(
            starting_agent=agent,
            input="Test input"
        )

    # Model should not be called
    assert len(model.turn_outputs) == 1

@pytest.mark.asyncio
async def test_slow_guardrail_still_blocks_model_request(self):
    """
    Test that even if a guardrail is slow, the model request waits for it.

    This is important to ensure we don't have a race condition where fast
    model requests bypass slow guardrail checks.
    """
    async def slow_triggered_guardrail(
        context: RunContextWrapper[Any],
        agent: Agent[Any],
        input: str | list[TResponseInputItem]
    ):
        # Simulate a slow guardrail (e.g., calling an external service)
        await asyncio.sleep(0.1)
        return GuardrailFunctionOutput(
            output_info={"reason": "Blocked after analysis"},
            tripwire_triggered=True
        )

    model = FakeModel()
    model.set_next_output([get_text_message("Should not be returned")])

    agent = Agent(
        name="test_agent",
        instructions="Test agent",
        model=model,
        input_guardrails=[InputGuardrail(guardrail_function=slow_triggered_guardrail)]
    )

    with pytest.raises(InputGuardrailTripwireTriggered):
        await Runner.run(
            starting_agent=agent,
            input="Test input"
        )

    # Even with a slow guardrail, model request should not be sent
    assert len(model.turn_outputs) == 1

Test Summary:

  1. test_input_guardrail_prevents_model_request_when_triggered
     • Core test that will FAIL before the fix, PASS after the fix
     • Verifies no model request is sent when the guardrail triggers
  2. test_input_guardrail_allows_model_request_when_safe
     • Baseline test that should PASS before and after the fix
     • Ensures the normal flow still works
  3. test_multiple_input_guardrails_block_on_first_trigger
     • Tests the multiple-guardrails scenario
     • Will FAIL before the fix, PASS after the fix
  4. test_slow_guardrail_still_blocks_model_request
     • Tests the race condition with slow guardrails
     • Will FAIL before the fix, PASS after the fix
     • Critical for ensuring sequential execution
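
To run just this file locally (assuming it is saved at the path above), one option is to invoke pytest from Python:

import pytest

# Runs only the new test file; assumes it is saved as
# tests/test_issue_889_guardrail_tool_execution.py in the repo checkout.
pytest.main(["-v", "tests/test_issue_889_guardrail_tool_execution.py"])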

@seratch (Member) commented Oct 21, 2025

Thanks for sending this patch, but we don't accept this for the same reason as #1914 (comment)

@seratch closed this Oct 21, 2025
@gn00295120 (Contributor, Author) commented:

> Thanks for sending this patch, but we don't accept this for the same reason as #1914 (comment)

PR #1622 is better, but it is closed; should we continue with it?

