Non-propagation, `skipmissing`-related improvements to `missing` handling #35050

There is broad consensus that `missing` handling could be improved. Many discussions focus on making propagation of `missing`s easier, and those discussions are worth having, but I also want to look at how `skipmissing` handling could be improved. Here are my suggestions. There is a lot of overlap with #30596, but this discussion should focus on building ideas and a roadmap rather than on a specific implementation.

1. Make `skipmissing` work for multiple iterators, returning a `Tuple` of iterators. I am working on a [PR](https://github.com/JuliaData/Missings.jl/pull/111) in Missings.jl to make this work. This will make it easier to work with vectors whose `missing`s sit in different positions, which is especially useful for plotting.
2. Make many more functions in Base, such as `cor`, accept any iterator rather than only vectors, so that we don't need to `collect` the result of `skipmissing`.
3. Overload `zip` so that we can zip together two vectors with missing elements and iterate over the non-missing pairs. Unlike `skipmissings` in (1), this returns an iterator of tuples. Both are useful.
4. Change broadcasting so that we can go from a `skipmissing` back to a vector with `missing`s in the same locations. It would be nice for `skipmissing` to have some kind of persistence so that you don't lose the locations of the `missing`s when you collect. This would allow you to, say, de-mean a vector (or apply a more complicated function to it) with respect to its non-missing entries.
5. Use dispatch for DataFrames (or Tables, NamedTuples, etc.) to simulate Stata's `if` syntax, where the new data frame is a view into the non-missing elements and uses (4) above to fill in `missing`s where needed. R doesn't have this feature, so I don't think it's obvious to everyone that this is an option. Stata is really great with this; you can do

```
egen zbar = mean(z) if !missing(v)
gen x = y - zbar if !missing(v)
```

Stata then applies the `if` condition as a filter on everything at the *start* of the function.
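To make items 1 to 3 concrete, here is a rough sketch using only Base and the Statistics standard library. The names `complete_pairs` and `jointly_skip` are made up for illustration; they are not the API of Base or of the Missings.jl PR, and `jointly_skip` returns copies rather than lazy iterators to keep the sketch short.

```julia
using Statistics

x = [1.0, missing, 3.0, 4.0]
y = [2.0, 5.0, missing, 8.0]

# Item 3 (hypothetical helper): lazily iterate over the tuples in which
# no element is missing.
complete_pairs(xs...) = Iterators.filter(t -> !any(ismissing, t), zip(xs...))

kept = collect(complete_pairs(x, y))   # the pairs (1.0, 2.0) and (4.0, 8.0)

# Item 1 (hypothetical helper): drop every index at which any input is
# missing and return one vector per input, as a Tuple.
function jointly_skip(xs::AbstractVector...)
    keep = [!any(v -> ismissing(v[i]), xs) for i in eachindex(xs...)]
    # `identity.` narrows the element type, e.g. to Float64
    return map(v -> identity.(v[keep]), xs)
end

xc, yc = jointly_skip(x, y)

# Item 2: `cor` is called on materialized vectors here, which is the
# extra step the proposal would like to avoid.
r = cor(xc, yc)
```

The tuple-of-vectors form is convenient for functions that take separate arguments (plotting, `cor`), while the iterator-of-tuples form suits a plain `for (a, b) in ...` loop.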

These are concrete changes that can be made without relying on propagation of `missing`s. They would lead to a workflow where you take a vector, filter it to remove `missing`s in whatever way you like, do things with the filtered vector (the hard part), and then put the `missing`s back in the correct locations.
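As a minimal sketch of that workflow, assuming the function being applied returns one value per non-missing input, something like the following works today. `apply_skipping` is a hypothetical name, not an existing function in Base or Missings.jl.

```julia
using Statistics

# Hypothetical helper: apply `f` to the non-missing entries of `v`, then
# write the results back so that `missing`s stay where they were.
# Assumes `f` returns a vector with one element per non-missing input.
function apply_skipping(f, v::AbstractVector)
    keep = .!ismissing.(v)            # positions of non-missing entries
    vals = f(identity.(v[keep]))      # work on a vector free of `missing`
    out = Vector{Union{Missing, eltype(vals)}}(missing, length(v))
    out[keep] = vals                  # restore the original positions
    return out
end

v = [1.0, missing, 3.0, 5.0]

# De-mean with respect to the non-missing entries (their mean is 3.0).
demeaned = apply_skipping(z -> z .- mean(z), v)
```

Here `demeaned` is `[-2.0, missing, 0.0, 2.0]`: the `missing` keeps its position, and the mean is computed over the remaining three values only.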

