Description
Materializations are currently split into non-distributed (standard) and distributed (non-standard) versions: table vs. distributed_table and incremental vs. distributed_incremental. Because the two implementations are redundant and have diverged, we often run into issues: distributed materializations lack full feature support or contain bugs in duplicated sections of the codebase.
Some examples:
- Schema change validation for contracted incremental models in dbt-core is only applied to "incremental" materializations. A related issue exists here, which prevents microbatches from running when using the distributed_incremental strategy.
- Significant code duplication between the incremental and distributed_incremental materializations makes maintenance harder and introduces subtle bugs when switching between them.
Suggested Solution
Consolidate the distributed logic into the existing incremental and table materializations, controlled via a new model configuration flag (e.g., a boolean is_distributed). This would:
- Simplify the codebase by reducing redundancy.
- Ensure consistent feature support and behavior across both modes.
- Enable more comprehensive integration testing for both distributed and non-distributed materializations.
The existing distributed_incremental and distributed_table materializations could be retained as aliases (with deprecation warnings) for backwards compatibility.
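As a rough illustration of the alias idea, here is a minimal Python sketch of how the legacy names could be mapped onto the standard materializations plus the proposed flag. It is not taken from the adapter's code: the function name resolve_materialization, the LEGACY_ALIASES mapping, and the is_distributed config key are all hypothetical, and the real change would most likely live in the Jinja materialization macros.

```python
import warnings
from typing import Mapping, Tuple

# Hypothetical mapping from the legacy materialization names to the
# standard materialization they would become an alias for.
LEGACY_ALIASES = {
    "distributed_table": "table",
    "distributed_incremental": "incremental",
}


def resolve_materialization(materialized: str, config: Mapping) -> Tuple[str, bool]:
    """Return (base materialization, is_distributed) for a model.

    Legacy distributed_* names resolve to their base materialization with
    the distributed flag forced on, and emit a deprecation warning.
    Any other name is passed through with the flag read from the config.
    """
    if materialized in LEGACY_ALIASES:
        base = LEGACY_ALIASES[materialized]
        warnings.warn(
            f"'{materialized}' is deprecated; use materialized='{base}' "
            "with is_distributed=True instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return base, True
    # 'is_distributed' is the flag name proposed above; it defaults to False.
    return materialized, bool(config.get("is_distributed", False))
```

With this shape, both the old and new spellings end up on the same code path, so checks such as the contract validation that keys on "incremental" would apply to distributed models as well.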