-
Notifications
You must be signed in to change notification settings - Fork 53
Feature/quant refactor #181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Feature/quant refactor #181
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall - only complaints are the additional traits.
| } | ||
| } | ||
|
|
||
| pub trait QuantifiableFeature: Clone { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there plans to implement this trait for other data types? I understand the desire to add some polymorphism, but if there is only a single implementer it seems a bit unnecessary to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in Sage itself, but in our case we call Sage as a library and implement the trait on our own features. Our internal features have several different fields than those in Sage (as they can use other identification algorithms). Rather than copying our internal features into sage features and having to mock multiple columns as we do not have data for them (e.g. poisson scores), the Quant and Align trait allows us to use Sage as natively as possible on our own data. See also the last section in the PR description above. The let mut features = experiment.get_features(); is where we of course have implemented the trait on our own custom features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a pretty exotic use-case, considering that Sage's LFQ implementation is not particularly advanced. I would prefer for you to mock multiple columns and just impl Into<Feature> etc on your own structs, rather than introducing some extra complexity and abstraction into the codebase.
| } | ||
|
|
||
| alignments | ||
| pub trait AlignableFeature: Send + Sync { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment about QuantifiableFeature
| log::info!("performing LFQ without MBR"); | ||
| let mut final_areas = FnvHashMap::default(); | ||
| // MS1Spectra, features and aligments are assumed to come from the same files and all be not empty | ||
| for (file_id, (local_ms1_spectra, local_features)) in ms1_spectra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an elegant way of doing it while still maintaining back-compatibility (I am also OK with breaking it).
I think we can do this without cloning features and collecting into intermediate Vecs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced about elegant, it is a clunky implementation but it is indeed backwards compatible (I purposefully set it up like that) and shoul be fully functional. I agree we can probably do this far more efficiently. Especially for Bruker data I would/could/should change up the way this is called significantly, but I can imagine that you want to retain some neutrality in sage with regards to vendor. That said, If you want me to look into improving this for at least the Bruker case, I am more than happy to provide you with a reference implementation that can serve as inspiration for mzml as well
Brief summary:
predict_rtandlfqflags. Here we, disentangled and added a specificalign_rtandpredict_imflag.lfqandrt_alignmentmodules to work with FeatureTraits rather than Features, such that they can easily be called seperately from outside of sage.mbrif required.New/changed input parameters
predict_rt: This previously controlled alignment, im prediction and rt prediction. Furthermore it was a prerequisite for lfq. This has been refactored to be a stand alone option. It purely triggers rt prediction now, which internally only ends up being used for FDR calculations (due todelta_rt)predict_im: Same as the newpredict_rtbut then for im. Also only ends up being used in FDR calculations due todelta_imalign_rt: Purely responsible to trigger rt alignment. Note that if disabled, anrt_alignedis still calculated for every feature, but this is done on a per run basis and as such only rescales rt values to be aligned_rt` values within the range [0,1]. As such, this is essentially just a per run normalization of the rt.mbr(within the/quant/lfq_settingsgroup: This does quantification on a per run basis only, i.e. only quantifies a feature if it is identified within a runpredict_rtis enabled,align_rtwill always/automatically be enabled too!mbrandlfqare enabled,align_rtwill always/automatically be enabled too!New traits:
AlignableFeature: A minimal set of fields required to be able to perform rt alignments.QuantifiableFeature: A minimal set of fields required to be able to perform lfq quantification.CompositionDatabaseA convenience tratit to be able to calculate aCompositionfor aQuantifiableFeature.New functionality:
An example of what we want to achieve with this PR is to do e.g. the following outside of sage with a custom set of identified features: