@Fottas Fottas commented Aug 20, 2025

Related to #36454

@Fottas Fottas force-pushed the feature/shardingkey-in-rewrite branch 2 times, most recently from 315b144 to 9e872cf Compare August 21, 2025 07:01
@Fottas Fottas force-pushed the feature/shardingkey-in-rewrite branch from 0e2c819 to a9802f8 Compare August 29, 2025 08:23
Member

@strongduanmu strongduanmu left a comment

Hi @Fottas, this PR looks very complicated. Could you please submit an issue first to describe your design?

Author

Fottas commented Sep 2, 2025

> Hi @Fottas, this PR looks very complicated. Could you please submit an issue first to describe your design?

Hi @strongduanmu:
Issue Description: IN Predicate Sharding Optimization

Problem: ShardingSphere currently sends every value in an IN predicate to every shard, which causes unnecessary data transfer and computation. For example, WHERE id IN (1,2,3,4) is sent unchanged to all shards, even though the sharding key tells us that values 1 and 3 belong to shard A and values 2 and 4 belong to shard B.
Solution: Route-aware IN predicate splitting, which sends each shard only the values that the sharding algorithm assigns to it.

Technical Design
Implementation Principle: Parse each value in IN expressions, invoke existing sharding algorithms to calculate target shards, then rewrite IN clauses for each RouteUnit to contain only values belonging to that shard.
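
To make the grouping step concrete, here is a minimal sketch of how the values of WHERE id IN (1,2,3,4) could be partitioned by target shard. It is illustrative only and not the code in this PR: the class name, the `ds_` shard names, and the modulo rule standing in for the configured sharding algorithm are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not ShardingSphere code: groups IN values by target shard.
final class InValueGroupingSketch {

    // Stand-in for a real sharding algorithm: modulo on the key (an assumption).
    static String targetShard(long value, int shardCount) {
        return "ds_" + Math.floorMod(value, shardCount);
    }

    // Maps each shard name to the subset of IN values that route to it,
    // preserving the order in which shards are first encountered.
    static Map<String, List<Long>> groupByShard(List<Long> values, int shardCount) {
        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (long value : values) {
            result.computeIfAbsent(targetShard(value, shardCount), key -> new ArrayList<>()).add(value);
        }
        return result;
    }
}
```

With two shards, `groupByShard` places 1 and 3 under `ds_1` and 2 and 4 under `ds_0`, so each route only needs to carry its own subset.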

Core Components:

  1. ShardingInPredicateValue - Encapsulates IN values with their target route information
  2. ShardingInPredicateToken - SQL token that rewrites IN clauses per route, filtering values and parameters
  3. ShardingInPredicateTokenGenerator - Analyzes IN expressions and calculates shard distribution using existing sharding algorithms
  4. ParameterFilterable - Cross-module decoupling interface that enables parameter filtering capability between modules, since ShardingInPredicateToken is in the sharding module while RouteSQLRewriteEngine is in the infra module
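
The decoupling idea behind ParameterFilterable can be sketched roughly as below. The interface shape and method name here are hypothetical (the signature in the PR may well differ); the point is only that the engine-side helper checks for an interface and never references sharding classes directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical shape of the decoupling interface; the PR's real signature may differ.
interface ParameterFilterableSketch {
    // Returns the indexes of parameters that should NOT be sent to the given route.
    Set<Integer> removedParameterIndexes(String routeName);
}

// Hypothetical engine-side helper: depends only on the interface above.
final class ParameterFilterSketch {

    static List<Object> filter(List<Object> parameters, Object token, String routeName) {
        if (!(token instanceof ParameterFilterableSketch)) {
            // Tokens that do not opt in leave the parameter list untouched.
            return parameters;
        }
        Set<Integer> removed = ((ParameterFilterableSketch) token).removedParameterIndexes(routeName);
        List<Object> result = new ArrayList<>();
        for (int i = 0; i < parameters.size(); i++) {
            if (!removed.contains(i)) {
                result.add(parameters.get(i));
            }
        }
        return result;
    }
}
```

For `id IN (?, ?, ?, ?)` bound to 1, 2, 3, 4, a token that reports indexes 1 and 3 as removed for a given route leaves only the parameters 1 and 3 for that route.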

Key Features:

  • Supports both standard and complex sharding strategies
  • Handles parameter markers and literals
  • Optimizes single-value cases to equality comparisons
  • Achieves module decoupling through interface abstraction, allowing the core engine to work without direct dependencies on sharding token implementations
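
As an illustration of the single-value optimization, the following sketch renders a per-route predicate from literal values. The class and method names are made up for this example, and rejecting an empty value list is an assumption, since the description above does not say how a route with no matching values is handled.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: renders the predicate text for one route's literal values.
final class InPredicateRenderSketch {

    static String render(String column, List<Long> values) {
        if (values.isEmpty()) {
            // Assumption: a route with no matching values should not reach this point.
            throw new IllegalArgumentException("no values route to this shard");
        }
        if (values.size() == 1) {
            // A single remaining value collapses to an equality comparison.
            return column + " = " + values.get(0);
        }
        return column + " IN ("
                + values.stream().map(String::valueOf).collect(Collectors.joining(", "))
                + ")";
    }
}
```

A route that keeps 1 and 3 is rendered as an IN list, while a route that keeps only 2 is rendered as `id = 2`.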

@Fottas Fottas requested a review from strongduanmu September 2, 2025 06:47
Author

Fottas commented Sep 2, 2025

Hi @strongduanmu, I've also created issue #36454 to track this design discussion as requested.
