-
Notifications
You must be signed in to change notification settings - Fork 57
Description
Summary
There are inconsistencies around the distance threshold scale in SemanticCache between sync and async code paths, and the docstring, which can lead to surprising behavior.
Details
- Docstring in set_threshold says: "ValueError: If the threshold is not between 0 and 1.", but the validation actually enforces 0 <= distance_threshold <= 2 (Redis COSINE native units). This is confusing to users reading the docstring.
- Async check path constructs VectorRangeQuery with normalize_vector_distance=true, which treats the provided distance_threshold as normalized [0,1] and denormalizes it internally. The sync path (and the rest of SemanticCache) appears to treat thresholds as native Redis COSINE distances [0,2]. This causes the same numeric threshold to be interpreted differently between sync and async.
Why it matters
Users may set a threshold like 0.15 expecting a tight native COSINE distance, but the async path will interpret it as normalized (and denormalize to 1.85), returning far more results than expected.
Proposed options
-
Keep thresholds in native units (0-2) everywhere
- Set normalize_vector_distance=false in the async VectorRangeQuery
- Update docstrings to consistently say 0-2
-
Adopt normalized thresholds (0-1) everywhere
- Keep normalize_vector_distance=true in async and set the same for sync
- Normalize on input before constructing queries in all paths
- Update docs accordingly
-
Make it configurable and explicit
- Add a distance_scale flag ("redis" vs "normalized") on SemanticCache, default to current behavior, and convert as needed.
Recommendation
Option 1 for least surprise and parity with Redis search defaults: use native COSINE [0-2] everywhere and normalize only when explicitly requested. At minimum, align sync/async and fix the docstring.
Action items
- Decide desired global convention
- Align sync and async VectorRangeQuery usage accordingly
- Update docstrings and README/examples to clarify the scale
Context
Related changes were explored in LangCacheWrapper to make threshold scale explicit. Tracking this issue separately for SemanticCache to avoid unintentional behavior changes.