Feature/quant refactor #181

sander-willems-bruker · 2025-04-17T08:55:00Z

Brief summary:

This PR is mostly a refactor of feature alignment and (lfq) quantification. Previously, this was all triggered by the predict_rt and lfq flags. Here we disentangled these and added specific align_rt and predict_im flags.
Furthermore, we allow the lfq and rt_alignment modules to work with FeatureTraits rather than Features, such that they can easily be called separately from outside of sage.
Finally, we added the option to disable mbr if required.

New/changed input parameters

changed predict_rt: This previously controlled alignment, im prediction and rt prediction. Furthermore, it was a prerequisite for lfq. It has been refactored into a standalone option that purely triggers rt prediction, which internally only ends up being used for FDR calculations (through delta_rt).
new predict_im: Same as the new predict_rt, but for im. It also only ends up being used in FDR calculations, through delta_im.
new align_rt: Purely responsible for triggering rt alignment. Note that if disabled, an rt_aligned is still calculated for every feature, but this is done on a per-run basis and as such only rescales rt values to aligned_rt values within the range [0,1]. This is essentially just a per-run normalization of the rt.
new mbr (within the /quant/lfq_settings group): When disabled, quantification is done on a per-run basis only, i.e. a feature is only quantified if it is identified within that run.
NOTES:
- When predict_rt is enabled, align_rt will always/automatically be enabled too!
- When mbr and lfq are enabled, align_rt will always/automatically be enabled too!
- The above always/automatically enabling is not strictly required, but not doing so will likely lead to poor(er) results.
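
The flag implications in the notes above could be sketched roughly as follows. This is only an illustration: the struct Flags and the function resolve are hypothetical names chosen for this example and are not Sage's actual configuration types.

```rust
/// Hypothetical stand-in for the user-facing flags discussed above
/// (NOT Sage's real configuration struct).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Flags {
    pub predict_rt: bool,
    pub predict_im: bool,
    pub align_rt: bool,
    pub lfq: bool,
    pub mbr: bool,
}

/// Apply the implications from the notes:
/// - predict_rt implies align_rt
/// - (mbr && lfq) implies align_rt
/// All other flags pass through unchanged.
pub fn resolve(requested: Flags) -> Flags {
    let align_rt = requested.align_rt
        || requested.predict_rt
        || (requested.mbr && requested.lfq);
    Flags { align_rt, ..requested }
}
```

With this sketch, requesting predict_rt alone, or mbr together with lfq, yields a resolved set of flags in which align_rt is on; requesting only predict_im or only mbr leaves align_rt untouched.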

New traits:

AlignableFeature: A minimal set of fields required to be able to perform rt alignments.
QuantifiableFeature: A minimal set of fields required to be able to perform lfq quantification.
CompositionDatabase: A convenience trait to be able to calculate a Composition for a QuantifiableFeature.
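
To illustrate the idea of a minimal trait that a caller implements on its own feature type, here is a rough sketch. The trait RtFeature, its method names, and the struct MyFeature are assumptions made up for this example; they are not the actual definition of AlignableFeature. The function shows the per-run rescaling of rt to [0,1] that the description mentions for the case where align_rt is disabled, assuming a plain min-max normalization.

```rust
/// Hypothetical minimal trait, loosely modelled on the idea behind AlignableFeature.
/// The method names are assumptions, not Sage's real trait definition.
pub trait RtFeature {
    fn file_id(&self) -> usize;
    fn rt(&self) -> f32;
    fn set_aligned_rt(&mut self, value: f32);
}

/// Rescale rt to [0, 1] independently within each run (min-max normalization).
/// Panics if any feature has a file_id >= n_files.
pub fn normalize_rt_per_run<F: RtFeature>(features: &mut [F], n_files: usize) {
    // Track (min, max) rt per file.
    let mut bounds = vec![(f32::INFINITY, f32::NEG_INFINITY); n_files];
    for f in features.iter() {
        let b = &mut bounds[f.file_id()];
        b.0 = b.0.min(f.rt());
        b.1 = b.1.max(f.rt());
    }
    for f in features.iter_mut() {
        let (lo, hi) = bounds[f.file_id()];
        let span = hi - lo;
        // A run with a single rt value has zero span; map it to 0.0
        // instead of dividing by zero.
        let scaled = if span > 0.0 { (f.rt() - lo) / span } else { 0.0 };
        f.set_aligned_rt(scaled);
    }
}

/// Hypothetical feature type owned by code outside the library.
#[derive(Debug, Clone)]
pub struct MyFeature {
    pub file_id: usize,
    pub rt: f32,
    pub aligned_rt: f32,
}

impl RtFeature for MyFeature {
    fn file_id(&self) -> usize {
        self.file_id
    }
    fn rt(&self) -> f32 {
        self.rt
    }
    fn set_aligned_rt(&mut self, value: f32) {
        self.aligned_rt = value;
    }
}
```

Because normalize_rt_per_run is generic over the trait, it only sees the three accessors and never needs the other fields of the caller's feature type.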

New functionality:

Option to disable match between runs. This is implemented somewhat inefficiently, to make sure that if MBR is enabled it is 100% backwards compatible with previous sage.
The new traits make it easier to control alignment and quant from outside of sage.

An example of what we want to achieve with this PR is to do e.g. the following outside of sage with a custom set of identified features:

    let mut features = experiment.get_features();
    let database = database::Database::from_features(&features);
    let alignments =
        sage_core::ml::retention_alignment::global_alignment(&mut features, experiment.len());
    let mut areas = sage_core::lfq::quantify(
        &lfq_settings,
        precursor_charge,
        &database,
        &features,
        &ms1_spectra,
        alignments,
    );

lazear

Looks good overall - only complaints are the additional traits.

lazear · 2025-05-06T17:07:30Z

crates/sage/src/lfq.rs

+    }
+}
+
+pub trait QuantifiableFeature: Clone {


Are there plans to implement this trait for other data types? I understand the desire to add some polymorphism, but if there is only a single implementer it seems a bit unnecessary to me.

sander-willems-bruker · 2025-05-07

Not in Sage itself, but in our case we call Sage as a library and implement the trait on our own features. Our internal features have several fields that differ from those in Sage (as they can use other identification algorithms). Rather than copying our internal features into sage features and having to mock multiple columns because we do not have data for them (e.g. poisson scores), the Quant and Align traits allow us to use Sage as natively as possible on our own data. See also the last section in the PR description above. The line let mut features = experiment.get_features(); is where we have, of course, implemented the trait on our own custom features.

lazear · 2025-05-12

I think it's a pretty exotic use-case, considering that Sage's LFQ implementation is not particularly advanced. I would prefer for you to mock multiple columns and just impl Into<Feature> etc on your own structs, rather than introducing some extra complexity and abstraction into the codebase.
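
For comparison, the conversion-based alternative suggested in this thread might look roughly like the sketch below. The Feature struct here is a heavily simplified stand-in with a few illustrative fields, not Sage's real Feature, and VendorFeature is a hypothetical external type.

```rust
/// Simplified stand-in for a library feature type (NOT Sage's actual Feature struct).
#[derive(Debug, Clone, PartialEq)]
pub struct Feature {
    pub file_id: usize,
    pub rt: f32,
    pub charge: u8,
    /// A score that the external pipeline cannot provide.
    pub poisson: f64,
}

/// Hypothetical feature type owned by an external pipeline.
#[derive(Debug, Clone)]
pub struct VendorFeature {
    pub run_index: usize,
    pub retention_time: f32,
    pub charge: u8,
}

impl From<VendorFeature> for Feature {
    fn from(v: VendorFeature) -> Self {
        Feature {
            file_id: v.run_index,
            rt: v.retention_time,
            charge: v.charge,
            // Mocked column: no real value exists, so a placeholder is filled in.
            poisson: 0.0,
        }
    }
}

/// Convert a whole batch; Into<Feature> comes for free from the From impl.
pub fn to_features<T: Into<Feature>>(items: Vec<T>) -> Vec<Feature> {
    items.into_iter().map(Into::into).collect()
}
```

The trade-off discussed in the thread is visible here: the conversion keeps the library free of extra traits, but every external feature is copied and the missing columns carry placeholder values.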

lazear · 2025-05-06T17:08:36Z

crates/sage/src/ml/retention_alignment.rs

+}

-    alignments
+pub trait AlignableFeature: Send + Sync {


See above comment about QuantifiableFeature

lazear · 2025-05-06T17:12:26Z

crates/sage/src/lfq.rs

+        log::info!("performing LFQ without MBR");
+        let mut final_areas = FnvHashMap::default();
+        // MS1Spectra, features and aligments are assumed to come from the same files and all be not empty
+        for (file_id, (local_ms1_spectra, local_features)) in ms1_spectra


This is an elegant way of doing it while still maintaining back-compatibility (I am also OK with breaking it).
I think we can do this without cloning features and collecting into intermediate Vecs.

sander-willems-bruker · 2025-05-07

I am not convinced about elegant: it is a clunky implementation, but it is indeed backwards compatible (I purposefully set it up like that) and should be fully functional. I agree we can probably do this far more efficiently. Especially for Bruker data I would/could/should change the way this is called significantly, but I can imagine that you want to retain some vendor neutrality in sage. That said, if you want me to look into improving this for at least the Bruker case, I am more than happy to provide you with a reference implementation that can serve as inspiration for mzml as well.
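
One possible way to process features per file without cloning them is to borrow sub-slices of an already sorted feature list. The sketch below is not the PR's implementation; Feat is a simplified stand-in type, and it assumes the features can be sorted by file_id up front.

```rust
/// Simplified stand-in for a feature (not Sage's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct Feat {
    pub file_id: usize,
    pub rt: f32,
}

/// Split a slice that is already sorted by file_id into one borrowed
/// sub-slice per file. No feature is cloned; only small (file_id, slice)
/// pairs are produced.
pub fn per_file_slices(sorted: &[Feat]) -> Vec<(usize, &[Feat])> {
    let mut out = Vec::new();
    let mut rest = sorted;
    while let Some(first) = rest.first() {
        let id = first.file_id;
        // Index of the first element that belongs to a later file.
        let end = rest.partition_point(|f| f.file_id <= id);
        let (head, tail) = rest.split_at(end);
        out.push((id, head));
        rest = tail;
    }
    out
}
```

Each returned slice could then be handed to a per-run quantification step by reference, so the only allocation is the small vector of slice handles.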

sander-willems-bruker added 12 commits April 16, 2025 14:21

CHORE: remove warnings

e2d56de

CHORE: added parquet files to gitignore

136865f

FEAT: introduce align_rt and mbr option to deconvolute the predict_rt option

2cc0b37

FEAT: updated runner to take align_rt, and predict_rt into account

bc440ee

FEAT: opened up mz1/msn tdf reader for outside usage

0a4874e

FEAT: made rt_alignment work with a feature trait rather thatn feature itself

1185edf

CHORE: type in brukerms1centroidingconfig

c06b696

FEAT: made quantifiabletrait for lfq in combination with dbtrait

c6fdef9

CHORE: made spectrumprocesser with mobility easier available

d633205

FEAT: dirty implementation for no mbr option

c9ad708

CHORE: cleanup outcommented code

2dcd12e

FEAT: separate predict_im trigger

4f1c660

jspaezp mentioned this pull request Apr 21, 2025

refactor: Move Peaks from AOS to SOA #178

Draft

lazear reviewed May 6, 2025
