Add support for single-cell batch correction #1472

@arteymix

Various methods exist for correcting batch effects on single-cell data.

Batch correction can either be done externally, with the resulting data re-imported into Gemma as a preferred set of single-cell vectors, or be done entirely within Gemma. If done within Gemma, I'd advocate for an external R process with temporary scratch files instead of using REngine.
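A minimal sketch of the "external process with scratch files" idea, written in Python for brevity. The script path, its two positional arguments (input file, output file) and the `runner` parameter are assumptions made for illustration, not an existing Gemma interface.

```python
import subprocess
import tempfile
from pathlib import Path


def run_external_correction(matrix_text, script_path, runner=subprocess.run):
    """Run a (hypothetical) batch-correction R script on scratch files.

    The script is assumed to take an input path and an output path as its
    two positional arguments; that convention is illustrative only.
    """
    # The scratch directory and everything in it is deleted on exit.
    with tempfile.TemporaryDirectory(prefix="batch-correction-") as scratch:
        in_file = Path(scratch) / "input.tsv"
        out_file = Path(scratch) / "corrected.tsv"
        in_file.write_text(matrix_text)
        # check=True makes subprocess.run raise on a non-zero exit status.
        runner(["Rscript", str(script_path), str(in_file), str(out_file)], check=True)
        # Read the result before the scratch directory is cleaned up.
        return out_file.read_text()
```

Keeping the runner injectable makes the file handling testable without R installed, and a crash in the R process cannot take down the calling process.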

In 1.32.2, I added a CLI tool to generate Cell Browser-compatible metadata. Combined with the MEX output, this makes it possible to apply pretty much any existing method. It might be preferable, though, to have our own format that separates the assay ID from the cell ID.

  • support exporting the experimental design with cell-level characteristics
  • support loading vectors into an existing single-cell dimension
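To illustrate what a format that separates the assay ID from the cell ID could look like, here is a rough sketch of a tab-separated export with one row per cell. The column names (`assay_id`, `cell_id`) and the function itself are hypothetical, not an existing Gemma format.

```python
import csv
import io


def write_cell_metadata(cells, out):
    """Write one row per cell with the assay ID and cell ID in separate columns.

    `cells` is an iterable of (assay_id, cell_id, characteristics) tuples,
    where `characteristics` is a dict of cell-level characteristics.
    Column names are illustrative only.
    """
    cells = list(cells)
    # Union of all characteristic names, sorted for a stable column order.
    extra = sorted({key for _, _, chars in cells for key in chars})
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["assay_id", "cell_id"] + extra)
    for assay_id, cell_id, chars in cells:
        # Cells lacking a characteristic get an empty field.
        writer.writerow([assay_id, cell_id] + [chars.get(k, "") for k in extra])
```

With separate columns, the same barcode appearing in two assays stays distinguishable without having to parse a concatenated identifier.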

We can reuse the ComBat implementation, but we have to convert the expression data matrix and the design matrix into baseCode's sparse matrices.
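As a rough illustration of the design-matrix side of that conversion, the sketch below builds a one-hot cells-by-batches indicator matrix and stores only its non-zero entries as (row, column, value) triplets. It is plain Python and does not use baseCode's classes; the function name and return layout are assumptions for illustration.

```python
def batch_design_triplets(cell_batches):
    """Build a sparse one-hot design matrix from per-cell batch labels.

    Returns (shape, batches, triplets), where `batches` gives the column
    order and `triplets` lists the non-zero (row, column, value) entries.
    """
    # Sort the distinct labels so the column order is deterministic.
    batches = sorted(set(cell_batches))
    column = {batch: j for j, batch in enumerate(batches)}
    # Exactly one non-zero entry per cell: the column of its batch.
    triplets = [(i, column[batch], 1.0) for i, batch in enumerate(cell_batches)]
    return (len(cell_batches), len(batches)), batches, triplets
```

Each cell contributes a single non-zero entry, so storage grows with the number of cells rather than cells times batches.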

There are other methods available, such as Harmony and LIGER. We might have to evaluate which is best suited to our workflow.
