fix: add batching support for BanCompetitors to handle long input text #272
Change Description
Previously, the `BanCompetitors` scanner failed on long text because it attempted to process the entire input in a single request. When the input exceeded the model's token limit, the text was truncated and processing was incomplete. This PR introduces batching logic that splits the input text into manageable chunks before sending it to the model, ensuring that `BanCompetitors` works correctly even with long inputs and with models that have lower token limits. It also adds a test case with long text that failed with the old code and now passes.
Assumptions:
For fast processing, chunk sizes are estimated with an approximate token count (also noted in a code comment): 1 word ≈ 4 characters ≈ 2 tokens.
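
For reference, here is a minimal sketch of the batching idea under the heuristic above. The function name `split_text_into_chunks`, the `max_tokens` parameter, and the word-aligned character budget are illustrative assumptions, not the exact implementation in this PR:

```python
# Sketch only: word-aligned chunking under the stated heuristic
# (1 word ~ 4 characters ~ 2 tokens, i.e. roughly 1 token per 2 characters),
# so a model limit of `max_tokens` maps to a budget of max_tokens * 2 characters.


def split_text_into_chunks(text: str, max_tokens: int = 512) -> list[str]:
    """Split text into word-aligned chunks that fit an approximate token budget."""
    max_chars = max_tokens * 2  # heuristic: 1 token ~ 2 characters
    chunks: list[str] = []
    current_words: list[str] = []
    current_len = 0

    for word in text.split():
        # +1 accounts for the joining space; flush the chunk if adding
        # this word would exceed the character budget.
        if current_words and current_len + len(word) + 1 > max_chars:
            chunks.append(" ".join(current_words))
            current_words, current_len = [], 0
        current_words.append(word)
        current_len += len(word) + (1 if current_len else 0)

    if current_words:
        chunks.append(" ".join(current_words))
    return chunks


# Each chunk is then scanned independently and the per-chunk detections
# are merged, so no part of a long input is silently truncated.
```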
Issue reference
N/A – discovered during usage with large text inputs. Please let me know if you'd like me to open a tracking issue.
Checklist