-
-
Notifications
You must be signed in to change notification settings - Fork 8
feat: Implement the SortBy stream adapter
#43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The motivation behind this patch is to replace `arrayvec` by `smallvec`. Why? Because `arrayvec` cannot grow, it's an array-baked vector. Once the full capacity is reached, it cannot grow any more. That was fine up until now because it was only used by the `Limit` stream adapter which needs only 2 values in a buffer that is typed by an `ArrayVec`. But for other stream adapters, it's a limitation. What if a stream adapter needs a buffer of more than 2 values? Ideally, we want to preserve the stack-allocated property for obvious performance reasons (cache locality, reducing allocator traffic and so on). However, we also want to be able to fallback to the heap for larger allocations. `smallvec` provides this feature. Hence, it does what `arrayvec` does, with the addition of heap-allocated fallback when it's needed.
This type alias is renamed to `VectorDiffContainerStreamBuffer`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few general comments before I dive into the implementation:
- I don't like that
Limitnow usesSmallVec. I'd prefer to rename the associatedBuffertype back toLimitBufand introduce a separateSortBufthat's either aSmallVecor just a regularVecfor the non-batching subscriber. If you consider theSmallVecusage important I don't mind the extra dependency, butLimitshould continue usingArrayVecIMO. In that case I have a slight preference for usingSmallVec<[T; 4]>overSmallVec<[T; 2]>but that's just my personal feeling about what might be a reasonable cut-off for stack vs. heap, feel free to ignore it. - The name
sortis used instdand other libraries to sort based on the element type'sOrdimplementation. The closure-taking form is calledsort_by, so the same name should be used here. We can keepsortfor the module name though and reuse most of the implementation for aSortadapter that does not take a sorting function (andSortByKeythat takes a key function) later.
jplatte
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rough first review
| VectorDiffContainerStreamElement, | ||
| }; | ||
|
|
||
| type UnsortedIndex = usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Filter adapter, the "index of the element in the input vector" is called the original index. Would you mind using the same name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe the naming should be identical in all stream adapters. The semantics take the priority over the consistency in some cases, and I reckon this is one of the cases. What matters —to me— is that we have facing an index that represented the unsorted position of the value. Whether it is the original or not the original index isn't really important. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the semantics are what's important definitely. I still think it's a clearer name. The input / original list could already be sorted ;)
I'd also be okay with input / output index naming (for all adapters that re-order or filter items), how do you feel about that option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want all stream adapters to use the same terminology?
`Sort` is `VectorDiff` stream adapter that presents a sorted view of the
underlying `ObservableVector` items.
```rust
use eyeball_im::{ObservableVector, VectorDiff};
use eyeball_im_util::vector::VectorObserverExt;
use imbl::vector;
use std::cmp::Ordering;
use stream_assert::{assert_closed, assert_next_eq, assert_pending};
// A comparison function that is used to sort our
// `ObservableVector` values.
fn cmp<T>(left: &T, right: &T) -> Ordering
where
T: Ord,
{
left.cmp(right)
}
// Our vector.
let mut ob = ObservableVector::<char>::new();
let (values, mut sub) = ob.subscribe().sort(cmp);
// ^^^
// | our comparison function
assert!(values.is_empty());
assert_pending!(sub);
// Append multiple unsorted values.
ob.append(vector!['d', 'b', 'e']);
// We get a `VectorDiff::Append` with sorted values!
assert_next_eq!(sub, VectorDiff::Append { values: vector!['b', 'd', 'e'] });
// Let's recap what we have. `ob` is our `ObservableVector`,
// `sub` is the “sorted view”/“sorted stream” of `ob`:
// | `ob` | d b e |
// | `sub` | b d e |
// Append other multiple values.
ob.append(vector!['f', 'g', 'a', 'c']);
// We get three `VectorDiff`s!
assert_next_eq!(sub, VectorDiff::PushFront { value: 'a' });
assert_next_eq!(sub, VectorDiff::Insert { index: 2, value: 'c' });
assert_next_eq!(sub, VectorDiff::Append { values: vector!['f', 'g'] });
// Let's recap what we have:
// | `ob` | d b e f g a c |
// | `sub` | a b c d e f g |
// ^ ^ ^^^
// | | |
// | | with `VectorDiff::Append { .. }`
// | with `VectorDiff::Insert { index: 2, .. }`
// with `VectorDiff::PushFront { .. }`
// Technically, `Sort` emits `VectorDiff` that mimics a sorted `Vector`.
drop(ob);
assert_closed!(sub);
```
Sort stream adapterSortBy stream adapter
|
I've written a long comment but it's been deleted… Damn… I'll try to summarize 😛 .
So. Inside Inside
We can make To conclude, I don't see the benefit of using |
|
Just a note that I’m improving the main “find the position” algorithm to use a binary search. It’s coming soon. |
I know. I like the simplicity of a fixed capacity where nothing more is necessary. I still prefer keeping |
This patch uses a binary search when the position of a value must be found. Finding a new value based on the `compare` function can be costly, hence the use of a binary search.
jplatte
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take care of the ArrayVec / SmallVec difference separately from this PR.
|
Released as part of eyeball-im-util 0.5.3. |
SortisVectorDiffstream adapter that presents a sorted view of the underlyingObservableVectoritems. A demonstration:The code above demonstrates how a
VectorDiff::Appendis handled, but all otherVectorDiff's variants produce variousVectorDiff. I've tried to generateVectorDiffthat are as much correct as possible regarding the semantics the user might expect. For example, if aVectorDiff::Insert { index: 42, … }is received but the value will be inserted at the (sorted) index 0, aVectorDiff::PushFrontwill be emitted instead of aVectorDiff::Insert { index: 0, … }.The code is extensively documented. I hope it helps to understand how this is implemented. It's pretty basic actually. The only difficulty is that it breaks your brain because you've to constantly remind: “Is this index for the sorted or the unsorted position”? I've tried to use consistent and clear naming for variables.