Skip to content

Conversation

@kylebarron
Copy link
Member

@kylebarron kylebarron commented Oct 23, 2024

To preface: not expecting this to be merged imminently, as it may be a while before object-store's next breaking release, and I figure there may be a cleaner way to change this. But it was useful to learn a bit more about streams, and I'll probably use this patch in my Python-facing object-store bindings.

Which issue does this PR close?

Closes #6587.

Rationale for this change

Easier wrapping of ObjectStore::list in bindings, e.g. to Python, that don't allow lifetime references.

What changes are included in this PR?

Change the return type of ObjectStore::list and ObjectStore::list_with_offset to

BoxStream<'static, Result<ObjectMeta>>

Are there any user-facing changes?

Yes.

@kylebarron
Copy link
Member Author

kylebarron commented Oct 23, 2024

In developmentseed/obstore#35 I verified that this does enable streaming list in Python bindings:

Screen.Recording.2024-10-23.at.12.25.19.PM.mov

@tustvold tustvold added api-change Changes to the arrow API next-major-release the PR has API changes and it waiting on the next major version labels Oct 23, 2024
@alamb alamb requested a review from tustvold November 7, 2024 22:46
@alamb
Copy link
Contributor

alamb commented Nov 7, 2024

Looks like @tustvold had a conversation on #6587 so I defer to him. However, I had a quick look at the code and it looks pretty good to me. Thank you @kylebarron

delimiter: bool,
offset: Option<&Path>,
) -> BoxStream<'_, Result<ListResult>>;
) -> BoxStream<'static, Result<ListResult>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the implications of this stream having static lifetime? I don't understand the implications on downstream crates

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my use case, Python bindings with pyo3 don't allow you to have any lifetime parameters shorter than 'static: https://pyo3.rs/v0.22.5/class.html#no-lifetime-parameters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right -- I guess I am wondering does this effectively mean that the returned streams can't be borrowed (aka they effectively have to be Arc'd or something like that where the stream is owned 🤔 )

Basically I am trying to think about what usecases, if any, that this additional constrait would be break (I realize some people may need to update the type signatures, I am thinking of more fundamental changes)

One thing maybe we could do is try to update DataFusion (and our internal INfluxDB code) with this change and see what, if anything, breaks for us 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would only effect implementers of the trait, its a stronger contract on what the implementation chooses to return. The result is consumers of object_store shouldn't notice much

@tustvold
Copy link
Contributor

tustvold commented Nov 8, 2024

This change makes sense to me, and is definitely something we should consider when looking to make a breaking releas. However, I think we want to get a few more such changes lined up before taking the plunge, object_store breaking changes are quite disruptive

@alamb
Copy link
Contributor

alamb commented Nov 22, 2024

For whatever it is worth, I was fighting with the lifetimes of list in

@alamb
Copy link
Contributor

alamb commented Jan 6, 2025

main is open for breaking changes for inclusion in 0.12.0

I merged up from main to resolve a conflict

@alamb alamb merged commit 74499c0 into apache:main Jan 8, 2025
14 checks passed
@alamb
Copy link
Contributor

alamb commented Jan 8, 2025

For anyone else following along, the practical impact of this change is that implementations of ObjectStore::list can't return something with references inside.

The rationale is that anything with references can't be sent safely across threads, etc

CurtHagenlocher pushed a commit to CurtHagenlocher/arrow-rs that referenced this pull request Jan 13, 2025
svencowart pushed a commit to elastiflow/arrow-rs that referenced this pull request Jan 14, 2025
totoroyyb pushed a commit to totoroyyb/arrow-rs that referenced this pull request Jan 20, 2025
alamb added a commit to alamb/arrow-rs that referenced this pull request Mar 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API next-major-release the PR has API changes and it waiting on the next major version

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use 'static lifetime in BoxFuture for all object-store APIs

3 participants