
Can't serialize example ExecutionPlan to substrait #9299

@alamb

Is your feature request related to a problem or challenge?

While working on an example for serializing substrait plans (see PR #9260), I found I could not write an example for serializing an execution plan: the feature is not yet complete enough.

Describe the solution you'd like

I would like to add the following example (or something like it) to datafusion/substrait/src/lib.rs and have it work:

//! # Example: Serializing [`ExecutionPlan`]s
//!
//! This functionality is still under development and only works for a small subset of plans
//!
//! ```
//! # use datafusion::prelude::*;
//! # use std::collections::HashMap;
//! # use std::sync::Arc;
//! # use datafusion::error::Result;
//! # use datafusion::arrow::array::Int32Array;
//! # use datafusion::arrow::record_batch::RecordBatch;
//! # use datafusion_substrait::physical_plan;
//! # #[tokio::main(flavor = "current_thread")]
//! # async fn main() -> Result<()>{
//! // Create a plan that scans table 't'
//!  let ctx = SessionContext::new();
//!  let batch = RecordBatch::try_from_iter(vec![("x", Arc::new(Int32Array::from(vec![42])) as _)])?;
//!  ctx.register_batch("t", batch)?;
//!  let df = ctx.sql("SELECT x from t").await?;
//!  let physical_plan = df.create_physical_plan().await?;
//!
//!  // Convert the plan into a substrait (protobuf) Rel
//!  let mut extension_info = (vec![], HashMap::new());
//!  let substrait_plan = physical_plan::producer::to_substrait_rel(physical_plan.as_ref(), &mut extension_info)?;
//!
//!  // Convert the substrait Rel back to an ExecutionPlan (in practice the Rel
//!  // would first be encoded to bytes and sent somewhere, e.g. over the network)
//!  let physical_round_trip = physical_plan::consumer::from_substrait_rel(
//!     &ctx, &substrait_plan, &HashMap::new()
//!  ).await?;
//!  assert_eq!(format!("{:?}", physical_plan), format!("{:?}", physical_round_trip));
//! # Ok(())
//! # }
//! ```

When you run this test today, it fails with an error along the lines of "mem provider not implemented".

Describe alternatives you've considered

No response

Additional context

I think making this work would mostly be a matter of implementing serialization for MemTable / MemoryExec.
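
To illustrate the shape of the gap, here is a toy model, not DataFusion's actual code. Every name in it (`ToyPlan`, `ToyRel`, `ToyError`, `to_rel`, `from_rel`, `round_trip`, and the `support_memory` flag) is hypothetical and uses only the standard library. It sketches a producer that dispatches on the plan node type and returns a "not implemented" error for an in-memory scan until that case is handled:

```rust
/// Toy stand-in for a physical plan node (hypothetical, not `ExecutionPlan`).
#[derive(Debug, Clone, PartialEq)]
enum ToyPlan {
    /// Scan of files on disk, identified by path.
    FileScan { paths: Vec<String> },
    /// Scan of in-memory values (similar in spirit to an in-memory table scan).
    MemoryScan { rows: Vec<i32> },
}

/// Toy stand-in for a serialized relation (hypothetical, not a substrait `Rel`).
#[derive(Debug, Clone, PartialEq)]
enum ToyRel {
    ReadFiles(Vec<String>),
    ReadValues(Vec<i32>),
}

#[derive(Debug, PartialEq)]
enum ToyError {
    NotImplemented(String),
}

/// Producer: plan -> relation. `support_memory` is an illustrative switch:
/// `false` models the current gap, `true` models the proposed fix.
fn to_rel(plan: &ToyPlan, support_memory: bool) -> Result<ToyRel, ToyError> {
    match plan {
        ToyPlan::FileScan { paths } => Ok(ToyRel::ReadFiles(paths.clone())),
        ToyPlan::MemoryScan { rows } if support_memory => {
            Ok(ToyRel::ReadValues(rows.clone()))
        }
        ToyPlan::MemoryScan { .. } => Err(ToyError::NotImplemented(
            "in-memory scan serialization".to_string(),
        )),
    }
}

/// Consumer: relation -> plan.
fn from_rel(rel: &ToyRel) -> ToyPlan {
    match rel {
        ToyRel::ReadFiles(paths) => ToyPlan::FileScan { paths: paths.clone() },
        ToyRel::ReadValues(rows) => ToyPlan::MemoryScan { rows: rows.clone() },
    }
}

/// Round trip a plan through the toy relation, as the doc example above
/// tries to do with the real producer and consumer.
fn round_trip(plan: &ToyPlan, support_memory: bool) -> Result<ToyPlan, ToyError> {
    let rel = to_rel(plan, support_memory)?;
    Ok(from_rel(&rel))
}
```

In this sketch, `round_trip(&ToyPlan::MemoryScan { rows: vec![42] }, false)` returns the `NotImplemented` error, while passing `true` returns a plan equal to the input. The real fix would presumably need matching producer and consumer support for the in-memory case.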

Labels: documentation, enhancement, substrait
