Releases: Integrative-Transcriptomics/MUSIAL
MUSIAL-v2.4
v2.4.2 (Minor Update, 13.10.2025)
- Update Gradle to version 9.1.0.
- Improvements to the sequence task: Sequences can now be stored per sample, per feature, or for both. The performance of sequence export has been improved when gaps should not be added (no alignment case).
- Multiple minor bug fixes:
- Added check for missing alternative in SnpEff annotation.
- Handling of DP4 coverage annotations is now fixed.
- The temporary directory is now correctly deleted if no task is executed.
v2.4.2 (Minor Update, 27.09.2025)
- Performance and Memory Optimization: String operations were streamlined and a faster navigable map implementation (btreemap) introduced, leading to reduced runtime and memory usage. For large-scale datasets, MUSIAL now integrates a local caching system (Ehcache) to back memory-heavy variant call processing.
- Support for HDBSCAN* clustering of alleles and proteoforms (introduced in v2.4.1 via Tribuo) has been rolled back due to performance and usability constraints. This functionality is planned for a dedicated future workflow.
- The handling of complex variant calls was refined for improved analysis accuracy. In addition, handling of additional file formats was implemented.
- Data Export Enhancements: Storage-to-table exports were re-implemented using Tablesaw, providing a more flexible and performant tabular output. Additionally, sequence and variant profile exports per sample have been separated into a distinct task, profile, for improved usability and workflow clarity.
- Codebase Restructuring: The software was reorganized into distinct packages to improve maintainability and logical separation of responsibilities. Key components now cover: CLI parsing and validation, genomics data model, task execution, operations on the model independent of individual tasks, utility methods.
v2.4.1 (Minor Update, 23.05.2025)
- Implementation of HDBSCAN* clustering of alleles and proteoforms of features with Tribuo: After the allele and proteoform sequences have been inferred per sample, they are clustered with the Tribuo library's HDBSCAN* algorithm to make the data easier to interpret, i.e., samples that fall into the same cluster for a feature can be considered similar even if they do not carry exactly the same set of variants. Clustering uses the L1 distance on binary features, one per available variant (position and alternative content) of the feature. In particular, this means that the clustering is not stable across different sets of variants.
- The clustering results are used to generate informative names for alleles and proteoforms: these names have been adapted to be used in the different output formats.
- Improved naming convention for output files.
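
The L1 distance on binary variant features described for v2.4.1 can be illustrated with a minimal sketch. This is not MUSIAL's or Tribuo's code: the class name, the method names, and the "position:alternative" key format are assumptions made for illustration only. For binary vectors, the L1 distance is the number of variants present in exactly one of the two samples, and it depends on which variants are used as dimensions.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of MUSIAL: L1 distance on binary variant features.
final class VariantL1Distance {

    // Encodes one variant as "position:alternative" (assumed key format).
    static String key(int position, String alternative) {
        return position + ":" + alternative;
    }

    // Each sample is given as the set of variants it carries (the 1-entries of its
    // binary vector). Every variant present in exactly one set contributes 1.
    static int l1Distance(Set<String> a, Set<String> b) {
        Set<String> onlyInOne = new HashSet<>(a);
        onlyInOne.addAll(b);                 // union
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);                 // intersection
        onlyInOne.removeAll(shared);         // symmetric difference
        return onlyInOne.size();
    }

    // Same distance, but only variants contained in featureSet count as dimensions.
    static int l1Distance(Set<String> a, Set<String> b, Set<String> featureSet) {
        Set<String> restrictedA = new HashSet<>(a);
        restrictedA.retainAll(featureSet);
        Set<String> restrictedB = new HashSet<>(b);
        restrictedB.retainAll(featureSet);
        return l1Distance(restrictedA, restrictedB);
    }

    public static void main(String[] args) {
        Set<String> sample1 = Set.of(key(10, "A"), key(25, "T"));
        Set<String> sample2 = Set.of(key(10, "A"), key(40, "G"));
        System.out.println(l1Distance(sample1, sample2));          // prints 2

        // Dropping variant 40:G from the feature set changes the distance.
        Set<String> reduced = Set.of(key(10, "A"), key(25, "T"));
        System.out.println(l1Distance(sample1, sample2, reduced)); // prints 1
    }
}
```

The second call shows one way in which results can shift with the variant set: the same two samples are closer once a variant is no longer part of the feature space.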
MUSIAL-v2.3.10
v2.3.0
- Added container classes to increase abstraction.
- Increased efficiency of computations and memory usage.
- All variant calls are now stored in the build output, including reference. This reduces reference bias by rejecting calls.
- Simplified input parameters (call quality is no longer considered).
- Removed deprecated/redundant code.
v2.3.1 (Minor Update, 21.05.2024)
- Improved compatibility with bcftools called variants.
- Switched to gzip for compression to remove OS dependent errors.
- Removed deprecated/redundant code.
v2.3.2 (Minor Update, 03.06.2024)
- Added option to the build task to specify temporary working directory used by SnpEff.
- Rejected variants are now indicated by a ? symbol instead of !.
v2.3.3 (Minor Update, 05.06.2024)
- The option -u of the build task can now be used to write uncompressed storage files.
- Improved resolution of complex InDels.
- Non-haploid samples are no longer skipped; however, all input files are still treated as haploid.
v2.3.4 (Minor Update, 14.06.2024)
- Bugfix for amino acid variant inference.
v2.3.5 (Minor Update, 26.06.2024)
- Bugfix for amino acid and nucleotide variant inference, i.e., in some scenarios InDels were treated as reference calls and ignored in downstream processing.
v2.3.6 (Minor Update, 28.06.2024)
- Bugfix for processing deletions that exceed feature lengths during proteoform inference.
v2.3.7 (Minor Update, 10.07.2024)
- Added an option to exclude explicit variants, in addition to positions, from the analysis (to tackle reference errors).
v2.3.8 (Minor Update, 26.07.2024)
- Reference allele/proteoform is now stored independent of occurrence in samples.
- Separator symbol ";" replaced by ",".
- Extended nucleotide-variant storage logic to comprise variants that pass the filter criteria but are not the most frequent in one sample:
- Nucleotide variants stored for a feature are, in this sense, either (i) the most frequent allele passing filter criteria, (ii) the most frequent allele failing filter criteria, i.e., an ambiguous call, or (iii) not the most frequent allele, but passing filter criteria, i.e., non-primary variants.
- This is reflected by a primary attribute in the variant information.
- Sequence and table export still only consider primary/most frequent variants.
- Primary variants (derived from the most frequent allele of a call) that do not pass filter criteria are considered ambiguous variants and are now stored as SNVs with content N, instead of Ns spanning the length of possible InDels, to reduce bias towards actual variants.
- Actual variant content is now stored for ambiguous variants.
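
The three storage cases listed for v2.3.8 can be summarized as a small decision function. The sketch below is not taken from the MUSIAL codebase: the class, enum, and method names are hypothetical, and the fourth case (neither most frequent nor passing the filter) is assumed to be not stored, which the notes imply but do not state.

```java
// Hypothetical illustration of the v2.3.8 storage logic, not MUSIAL's implementation.
final class VariantClassifier {

    enum Category {
        PRIMARY_PASSING, // (i) most frequent allele, passes filter criteria
        AMBIGUOUS,       // (ii) most frequent allele, fails filter criteria
        NON_PRIMARY,     // (iii) not most frequent, but passes filter criteria
        NOT_STORED       // assumption: neither most frequent nor passing
    }

    static Category classify(boolean mostFrequent, boolean passesFilter) {
        if (mostFrequent && passesFilter) return Category.PRIMARY_PASSING;
        if (mostFrequent) return Category.AMBIGUOUS;
        if (passesFilter) return Category.NON_PRIMARY;
        return Category.NOT_STORED;
    }

    // Models the "primary" attribute: set for variants derived from the most
    // frequent allele, whether or not they pass the filter.
    static boolean isPrimary(Category category) {
        return category == Category.PRIMARY_PASSING || category == Category.AMBIGUOUS;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, true));   // prints PRIMARY_PASSING
        System.out.println(classify(true, false));  // prints AMBIGUOUS
        System.out.println(classify(false, true));  // prints NON_PRIMARY
    }
}
```

Under this reading, sequence and table export would consider only variants for which `isPrimary` returns true.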
v2.3.9 (Minor Update, 16.08.2024)
- Bug fixes:
- The .fai index files for the reference sequence are now overwritten on each run to avoid errors caused by changes in the file contents.
- All do-while loops have been replaced by while loops to avoid errors due to empty iterators; this only happened in the case that no reference allele was present in the sample input.
- Fixed an error in the logical operation for filtering variants.
- SnpEff now annotates ambiguous variants, i.e., filtered variants, with respect to their actual alternative nucleotide content, and the annotation has been extended to variants that are not on coding genes.
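
The do-while fix listed under v2.3.9 addresses a general Java pitfall, shown here in a minimal sketch. The class and method names are hypothetical and the loops are not MUSIAL's actual code: a do-while body runs once before the condition is checked, so calling `next()` on an empty iterator throws, whereas a while loop checks `hasNext()` first.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration of the do-while pitfall with empty iterators.
final class EmptyIteratorLoops {

    // Body executes before the check: fails if the iterator has no elements.
    static int countWithDoWhile(Iterator<String> it) {
        int count = 0;
        do {
            it.next(); // throws NoSuchElementException on an empty iterator
            count++;
        } while (it.hasNext());
        return count;
    }

    // Check comes first: an empty iterator simply yields zero iterations.
    static int countWithWhile(Iterator<String> it) {
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithWhile(List.of("A", "C").iterator()));       // prints 2
        System.out.println(countWithWhile(Collections.<String>emptyIterator())); // prints 0
        try {
            countWithDoWhile(Collections.<String>emptyIterator());
        } catch (NoSuchElementException e) {
            System.out.println("do-while failed on empty iterator");
        }
    }
}
```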
v2.3.10 (Minor Update, 10.09.2024)
- SnpEff has been updated to the latest version and minor parameter changes have been made to increase the efficiency of the SnpEff runtime.