ENH: Improve Mask-Based Patch Extraction Speed & Add Units Conversion #217

mostafajahanifar · 2021-12-14T11:06:35Z

In the current implementation, the filter_coordinates function filters the generated coordinates against the input mask. It does this by reading each mask region from the mask_reader object (a VirtualWSI) at the same resolution and region requested for patch extraction. This can take a very long time, especially if the whole-slide mask is large and the user requests small patches at the highest resolution (not to mention the extra time caused by overlap in patch extraction). For example, if you want to extract 256x256 patches at 40x from the WSI below, filtering the coordinates might take up to 10 minutes.

In this PR, I have introduced filter_coordinates_fast, which uses the overview of the mask_reader object (a numpy array) at a fixed low resolution (mpp=8 or power=1.25) to speed up mask reading, and hence coordinate filtering and the whole patch extraction process. For example, the above image takes about 0.5 seconds (at least a 1,000x speed-up over the previous method).
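As a rough illustration of the idea only (not the code in this PR), filtering against a low-resolution overview can be sketched as below. The function name `filter_coordinates_on_overview`, its signature, and the `scale` parameter are hypothetical; the example also assumes 40x corresponds to roughly 0.25 mpp, so a mask overview at 8 mpp gives `scale = 0.25 / 8`.

```python
import numpy as np


def filter_coordinates_on_overview(mask, coordinates, scale):
    """Flag boxes whose footprint on a low-resolution mask has any foreground.

    Hypothetical helper, for illustration only.
    mask: 2D array (H, W); non-zero pixels mark regions to keep.
    coordinates: (N, 4) array of [x_start, y_start, x_end, y_end] given at
        the patch-extraction resolution.
    scale: mask pixels per extraction-resolution pixel (e.g. 0.25 / 8).
    """
    mask = np.asarray(mask)
    coords = np.asarray(coordinates, dtype=float) * scale
    # Floor the start and ceil the end so each box covers at least one pixel.
    starts = np.floor(coords[:, :2]).astype(int)
    ends = np.ceil(coords[:, 2:]).astype(int)
    ends = np.maximum(ends, starts + 1)
    height, width = mask.shape[:2]
    keep = np.zeros(len(coords), dtype=bool)
    for i, ((x0, y0), (x1, y1)) in enumerate(zip(starts, ends)):
        # Clip to the mask; a box fully outside yields an empty slice.
        x0, x1 = np.clip([x0, x1], 0, width)
        y0, y1 = np.clip([y0, y1], 0, height)
        keep[i] = mask[y0:y1, x0:x1].any()
    return keep


# Toy usage: a 4x4 overview in which only pixel (row 1, col 1) is foreground.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1, 1] = 1
boxes = np.array([[0, 0, 32, 32], [32, 32, 64, 64], [1000, 1000, 1032, 1032]])
keep = filter_coordinates_on_overview(mask, boxes, scale=0.25 / 8)
```

The overview is read once and every box is tested with array slicing, so no region is read from the mask reader per patch.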

To be able to do this, I have also added a method to the WSIReader class that converts a resolution from one unit to all other units, because filter_coordinates_fast expects its input mask resolution to be in mpp, power, or baseline units.
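A minimal sketch of such a conversion, under stated assumptions, is given below. It is not the method added in this PR: the name `convert_resolution_sketch` is hypothetical, mpp is treated as a single scalar, "baseline" is taken to mean the scale relative to the baseline image (1.0 is full resolution), and the level unit is left out because it depends on per-image metadata.

```python
def convert_resolution_sketch(value, unit, baseline_mpp, baseline_power):
    """Express one resolution in mpp, power, and baseline-scale units.

    Hypothetical helper, for illustration only. The input is first
    normalised to a baseline scale, then every unit is derived from it.
    """
    if unit == "mpp":
        scale = baseline_mpp / value
    elif unit == "power":
        scale = value / baseline_power
    elif unit == "baseline":
        scale = value
    else:
        raise ValueError(f"Unsupported unit: {unit!r}")
    return {
        "mpp": baseline_mpp / scale,
        "power": baseline_power * scale,
        "baseline": scale,
    }


# Assuming a slide scanned at 40x with 0.25 microns per pixel at baseline.
converted = convert_resolution_sketch(8.0, "mpp", baseline_mpp=0.25, baseline_power=40.0)
```

Under these assumed baseline values, 8 mpp corresponds to a power of 1.25, which matches the fixed low resolution quoted above.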

Implement fast coordinate filtering function
Implement resolution conversion function in wsireader
Writing tests
Improve coverage


codecov · 2021-12-16T20:47:07Z

Codecov Report

Merging #217 (855880b) into develop (5f98e5b) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #217      +/-   ##
===========================================
+ Coverage    99.81%   99.85%   +0.03%     
===========================================
  Files           53       53              
  Lines         4870     5346     +476     
  Branches       800      924     +124     
===========================================
+ Hits          4861     5338     +477     
  Misses           2        2              
+ Partials         7        6       -1

Impacted Files	Coverage Δ
tiatoolbox/tools/patchextraction.py	`100.00% <100.00%> (ø)`
tiatoolbox/wsicore/wsireader.py	`99.60% <100.00%> (+0.47%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f08062f...855880b.

John-P · 2021-12-21T15:13:24Z

tiatoolbox/utils/transforms.py

    return np.add(bounds, padding * signs)
+
+
+def convert_resolution(input_res, input_unit, baseline_mpp=None, baseline_power=None):


Something doesn't seem right about this, but I need a while to go through it properly. Also, returning conversions to all units seems like a lot of unnecessary work if only one conversion is being used. Surely a conversion from one unit to another would be better.

Thanks @John-P , I hope you find what's not right about this sooner :-D

Re returning conversions to all units: first off, most of the units have to be calculated while computing any one of them, so why not keep all of them?
Also, if we allow requesting a specific output unit, we add an unnecessary argument to the function, whereas we would still write the function like this (calculate all units) and select the desired unit from the dictionary. Otherwise, the function would comprise many if statements to do the right conversion:

if input_unit is such and requested_unit is such: do this calculation

This part would be repeated 4x4=16 times (imagine writing tests for it)! At least, I don't know a better way to write it.
Not to mention, the whole thing needs very little computation (it took 12.2 ns on my laptop), so I guess we don't need to worry about it.
However, if you insist on returning only one desired unit, I'm happy to add another line of code at the end of the function that extracts the desired unit from the output dictionary.

I notice that the output in terms of levels is not right. Levels do not always decrease by half each time. Some TIFFs have a downsample of 4x (and it can differ between levels in the same image). To do this correctly, you'd need the information in WSIMeta.
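To make the concern concrete, here is a small sketch with a made-up pyramid whose levels shrink by 4x. The helper `nearest_level` and the `level_downsamples` list are hypothetical stand-ins for whatever per-level information the slide metadata provides.

```python
import numpy as np


def nearest_level(target_downsample, level_downsamples):
    """Return the index of the stored level closest to the target (log scale).

    Hypothetical helper, for illustration only.
    """
    downsamples = np.asarray(level_downsamples, dtype=float)
    distance = np.abs(np.log2(downsamples) - np.log2(target_downsample))
    return int(np.argmin(distance))


# Made-up pyramid with 4x steps between levels.
level_downsamples = [1.0, 4.0, 16.0, 64.0]
target = 16.0

# Assuming each level halves the previous one gives log2(16) = 4,
# which is not even a valid index into this four-level pyramid.
naive_level = int(np.log2(target))

# Looking the value up in the actual downsample list gives level 2.
actual_level = nearest_level(target, level_downsamples)
```

The two answers disagree as soon as the pyramid is not a strict power-of-two sequence, which is why the lookup needs the real per-level downsamples.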


vqdang

Address the changes before merging. Resolution thingy needs to be dealt with by WSIMetadata. Also reading mask should be done as a different mode for coord_space (receiving coordinates at WSI target resolution but actual read done only at thumb resolution w/o any rescaling). These will need to be done in another PR, given the urgency of finishing the paper.

vqdang · 2021-12-21T15:56:05Z

tiatoolbox/utils/transforms.py

+        "level": output_level,
+        "baseline": unit_baseline_scale,


Return a warning to strictly denote that these values can be total garbage. Or set them to None so that it crashes outright.

Default values for the parameters have been set to None. User/developer can check the values when they call the function according to their application.


vqdang · 2021-12-21T15:58:32Z

tiatoolbox/utils/transforms.py

    return np.add(bounds, padding * signs)
+
+
+def convert_resolution(input_res, input_unit, baseline_mpp=None, baseline_power=None):


baseline => target to make it more clear, also use dictionary

Please refer to my response to John above.




mostafajahanifar · 2021-12-22T11:04:09Z

Thanks @vqdang for the review.

Re:

Resolution thingy needs to be dealt with by WSIMetadata.

I believe this function can be called in a WSIMeta method, where there is access to the WSI metadata and we can control what data is fed into convert_resolution_units and check whether the returned values are as expected. For example, in WSIMeta we should check whether the level information stored in WSIMeta is ordered (and raise a warning otherwise, to inform the user that some processes may produce wrong output).

Re:

Also reading mask should be done as a different mode for coord_space (receiving coordinates at WSI target resolution but actual read done only at thumb resolution w/o any rescaling)

I agree with you, and it would be good to have such an option in the future. However, for that we need to consider scaling the rect parameters in the read_rect function for VirtualWSI, and that needs further information from the user about the thumb resolution, whether the resolutions are in the same unit, how to avoid extracting the thumbnail several times, etc. So, as you said, it's a problem for the future!


mostafajahanifar added 4 commits December 14, 2021 10:40

[skip travis] NEW: add convert_resolution before fast coordinate filt…

db37d6d

…ering

[skip travis] BUG: correct the situation with different resolution in…

aabd17f

… x,y

MAINT: simplify if statement

17b712d

Merge branch 'develop' of https://github.com/TIA-Lab/tiatoolbox into …

2450b0c

…enhance-patchextraction

shaneahmed requested review from John-P and vqdang and removed request for John-P December 17, 2021 09:27

shaneahmed assigned John-P, vqdang and mostafajahanifar Dec 17, 2021

shaneahmed requested a review from John-P December 17, 2021 09:28

John-P added the enhancement New feature or request label Dec 17, 2021

John-P changed the title ~~ENH: Improve mask-based patch extraction speed and add units conversion~~ ENH: Improve Mask-Based Patch Extraction Speed and Add Units Conversion Dec 17, 2021

John-P changed the title ~~ENH: Improve Mask-Based Patch Extraction Speed and Add Units Conversion~~ ENH: Improve Mask-Based Patch Extraction Speed & Add Units Conversion Dec 17, 2021

John-P reviewed Dec 21, 2021


Merge branch 'develop' of https://github.com/TIA-Lab/tiatoolbox into …

5620b9c

…enhance-patchextraction

vqdang suggested changes Dec 21, 2021


shaneahmed reviewed Dec 21, 2021

tiatoolbox/tools/patchextraction.py

shaneahmed reviewed Dec 21, 2021

tiatoolbox/utils/transforms.py (outdated)

mostafajahanifar added 4 commits December 22, 2021 09:32

Merge branch 'develop' of https://github.com/TIA-Lab/tiatoolbox into …

9bcd73b

…enhance-patchextraction

MAINT: refactor filter_coorinate opeerations with numpy vectorization

d2d9ff2

API: change convert_resolution to convert_resolution_units and comple…

44defb0

…te docstrings

MAINT: improve coverage

c4f3471

mostafajahanifar force-pushed the enhance-patchextraction branch from b17bcf8 to c4f3471 Compare December 22, 2021 10:48

shaneahmed reviewed Dec 22, 2021

tiatoolbox/utils/transforms.py (outdated)

mostafajahanifar added 2 commits December 22, 2021 12:03

Merge branch 'develop' of https://github.com/TIA-Lab/tiatoolbox into …

3b7a48a

…enhance-patchextraction

API: move _convert_resolution_units to wsireader

82f1089

mostafajahanifar added 6 commits December 22, 2021 18:41

TST: add tests for _convert_resolution_units to test_wsireader

85ad057

ENH: add support of fast cordinate filtering with various resolution …

e9164d4

…units

MAINT: remove redundant test scripts

a409fdf

MAINT: correct DeepSource errors

250bd36

Merge branch 'develop' of https://github.com/TIA-Lab/tiatoolbox into …

3a94526

…enhance-patchextraction

BUG: correct unncessary import in wsireader

855880b

John-P approved these changes Dec 23, 2021


shaneahmed merged commit b4976c8 into develop Dec 23, 2021

shaneahmed deleted the enhance-patchextraction branch December 23, 2021 10:18
