Conversation

@mostafajahanifar
Collaborator

@mostafajahanifar mostafajahanifar commented Dec 14, 2021

In the current implementation, the filter_coordinates function filters the generated coordinates based on the input mask. It does this by reading the mask region from the mask_reader object (which is a VirtualWSI) at the same resolution and region requested for patch extraction. This can take a very long time, especially if the whole-slide mask region is big and the user requests small patches at the highest resolution (not to mention the extra time when patches overlap). For example, if you want to extract 256x256 patches at 40x from the WSI below, filtering the coordinates might take up to 10 minutes.

[Images: output, output_]

In this PR, I have introduced filter_coordinates_fast, which uses the overview of the mask_reader object (a numpy array) at a fixed low resolution (mpp=8 or power=1.25) to speed up reading the mask region, and hence speeds up coordinate filtering and the whole patch extraction process. For example, the above image takes about 0.5 seconds (at least a 1,000x speed-up compared with the previous method).

[Image: output__]

To be able to do this, I have also introduced a method in the WSIReader class that converts a resolution from one unit to all other units, because filter_coordinates_fast expects its input mask resolution to be in mpp, power, or baseline units.

  • Implement fast coordinate filtering function
  • Implement resolution conversion function in wsireader
  • Write tests
  • Improve coverage
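
As an illustration of the idea behind the fast filtering described above, here is a minimal sketch, not the code in this PR. It assumes the mask overview is a 2D numpy array and that patch coordinates are [x_start, y_start, x_end, y_end] boxes at the extraction resolution. The name filter_coordinates_sketch, the scale argument and the min_mask_ratio threshold are all hypothetical.

```python
import numpy as np


def filter_coordinates_sketch(mask, coordinates, scale, min_mask_ratio=0.0):
    """Keep boxes that overlap the foreground of a low-resolution mask.

    Hypothetical helper for illustration only.

    mask: 2D array; non-zero pixels are foreground (low-resolution overview).
    coordinates: (N, 4) boxes [x_start, y_start, x_end, y_end] given at the
        patch extraction resolution.
    scale: factor mapping extraction-resolution pixels to mask pixels
        (assumed to be known by the caller, typically much smaller than 1).
    min_mask_ratio: a box is kept if its foreground fraction exceeds this.
    """
    mask = np.asarray(mask) > 0
    height, width = mask.shape
    # Bring every box into the coordinate frame of the small mask.
    scaled = np.asarray(coordinates, dtype=float) * scale
    keep = np.zeros(len(scaled), dtype=bool)
    for i, (x0, y0, x1, y1) in enumerate(scaled):
        # Round outwards and clip so that slicing stays inside the mask.
        xs = int(np.clip(np.floor(x0), 0, width))
        ys = int(np.clip(np.floor(y0), 0, height))
        xe = int(np.clip(np.ceil(x1), 0, width))
        ye = int(np.clip(np.ceil(y1), 0, height))
        region = mask[ys:ye, xs:xe]
        keep[i] = bool(region.size > 0 and region.mean() > min_mask_ratio)
    return keep
```

Because the mask is read once as a small array and only sliced afterwards, the cost per box no longer depends on the extraction resolution.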

@codecov

codecov bot commented Dec 16, 2021

Codecov Report

Merging #217 (855880b) into develop (5f98e5b) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #217      +/-   ##
===========================================
+ Coverage    99.81%   99.85%   +0.03%     
===========================================
  Files           53       53              
  Lines         4870     5346     +476     
  Branches       800      924     +124     
===========================================
+ Hits          4861     5338     +477     
  Misses           2        2              
+ Partials         7        6       -1     
| Impacted Files | Coverage Δ | |
|---|---|---|
| tiatoolbox/tools/patchextraction.py | 100.00% <100.00%> (ø) | |
| tiatoolbox/wsicore/wsireader.py | 99.60% <100.00%> (+0.47%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f08062f...855880b.

@shaneahmed shaneahmed requested review from John-P and vqdang and removed request for John-P December 17, 2021 09:27
@shaneahmed shaneahmed requested a review from John-P December 17, 2021 09:28
@John-P John-P added the enhancement New feature or request label Dec 17, 2021
@John-P John-P changed the title from "ENH: Improve mask-based patch extraction speed and add units conversion" to "ENH: Improve Mask-Based Patch Extraction Speed and Add Units Conversion" Dec 17, 2021
@John-P John-P changed the title from "ENH: Improve Mask-Based Patch Extraction Speed and Add Units Conversion" to "ENH: Improve Mask-Based Patch Extraction Speed & Add Units Conversion" Dec 17, 2021
return np.add(bounds, padding * signs)


def convert_resolution(input_res, input_unit, baseline_mpp=None, baseline_power=None):
Contributor

Something doesn't seem right about this, but I need a while to go through it properly. Also, returning conversions to all units seems like a lot of unnecessary work if only one conversion is being used. Surely a conversion from one unit to another would be better.

Collaborator Author

Thanks @John-P, I hope you find what's not right about this soon :-D

Re returning conversions to all units: first off, most of the units have to be calculated anyway while computing any one of them, so why not keep all of them?
Also, if we allow requesting a specific output unit, we add an extra argument to the function, yet we should still write the function like this (calculate all units) and select the desired unit from the dictionary. Otherwise, the function would comprise many if statements to do the right conversion!

if input_unit is such and requested_unit is such:
    do this calculation

This part would have to be repeated 4x4=16 times (imagine writing tests for it)! At least, I don't know a better way to write it.
Not to mention, the whole thing needs only a small amount of computation (it took 12.2 ns on my laptop), so I guess we don't need to worry about it.
However, if you insist on returning only one desired unit, I'm happy to add another line of code at the end of the function that extracts the desired unit from the output dictionary.
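
To make the "calculate everything, then pick from the dictionary" argument concrete, here is a minimal sketch under stated assumptions. It is not the function in this PR: convert_resolution_sketch is a hypothetical name, "baseline" is assumed to mean a scale factor relative to the baseline image (1.0 = baseline), and the level unit is left out on purpose because it cannot be derived from the baseline values alone.

```python
def convert_resolution_sketch(input_res, input_unit,
                              baseline_mpp=None, baseline_power=None):
    """Convert a resolution to mpp, power and baseline units in one go.

    Hypothetical helper for illustration only. Every input unit is first
    reduced to a single scale factor relative to baseline, and all outputs
    are derived from that factor, so no per-pair branching is needed.
    """
    if input_res <= 0:
        raise ValueError("input_res must be positive")
    if input_unit == "baseline":
        scale = float(input_res)
    elif input_unit == "mpp":
        if baseline_mpp is None:
            raise ValueError("baseline_mpp is required for 'mpp' input")
        scale = baseline_mpp / input_res
    elif input_unit == "power":
        if baseline_power is None:
            raise ValueError("baseline_power is required for 'power' input")
        scale = input_res / baseline_power
    else:
        raise ValueError(f"unsupported unit: {input_unit!r}")
    # Units that cannot be computed from the given baselines stay None.
    return {
        "baseline": scale,
        "mpp": None if baseline_mpp is None else baseline_mpp / scale,
        "power": None if baseline_power is None else baseline_power * scale,
    }


# Selecting a single unit is then one dictionary lookup.
out = convert_resolution_sketch(8.0, "mpp", baseline_mpp=0.25, baseline_power=40.0)
print(out["power"])  # 1.25
```

With a 0.25 mpp, 40x baseline, mpp=8 corresponds to power=1.25, which matches the pair of values quoted in the PR description.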

Contributor

@John-P John-P Dec 22, 2021

I notice that the output in terms of levels is not right. Levels do not always decrease by half each time. Some TIFFs have a 4x downsample between levels (and the factor can differ between levels in the same image). To do this correctly, you'd need the information in WSIMeta.
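
A small sketch of why this matters, assuming the per-level downsample factors are available as a list. The level_downsamples argument is an assumption about what the metadata provides, and both function names are hypothetical.

```python
import math


def naive_level(downsample):
    """Level under the (not always valid) assumption of 2x per level."""
    return math.log2(downsample)


def level_from_downsamples(downsample, level_downsamples):
    """Look the level up in the downsample factors the slide reports.

    Returns the matching level index, or None when no stored level has
    the requested downsample.
    """
    for level, stored in enumerate(level_downsamples):
        if math.isclose(stored, downsample, rel_tol=1e-6):
            return level
    return None


# A pyramid that shrinks by 4x per level.
downsamples = [1.0, 4.0, 16.0]
print(naive_level(16.0))                          # 4.0, not a real level here
print(level_from_downsamples(16.0, downsamples))  # 2
```

The halving assumption points at a level that does not exist in this pyramid, while the lookup against the stored factors finds the right one.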

Contributor

@vqdang vqdang left a comment

Address the changes before merging. The resolution conversion needs to be handled by WSIMeta. Also, mask reading should be done as a different mode for coord_space (receiving coordinates at the WSI target resolution, but doing the actual read only at thumbnail resolution without any rescaling). These will need to be done in another PR, given the urgency of finishing the paper.

Comment on lines 378 to 397
"level": output_level,
"baseline": unit_baseline_scale,
Contributor

Raise a warning to state explicitly that these values can be complete garbage, or set them to None so that misuse crashes outright.

Collaborator Author

Default values for the parameters have been set to None. The user/developer can check the values when calling the function, according to their application.

return np.add(bounds, padding * signs)


def convert_resolution(input_res, input_unit, baseline_mpp=None, baseline_power=None):
Contributor

Rename baseline to target to make it clearer; also use a dictionary.

Collaborator Author

Please refer to my response to John above.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

@mostafajahanifar
Collaborator Author

Thanks @vqdang for the review.

Re:

The resolution conversion needs to be handled by WSIMeta.

I believe this function can be called from a WSIMeta method, where we have access to the WSI metadata, can control what data is fed into convert_resolution_units, and can check whether the returned values are as expected. For example, in WSIMeta we should check whether the level information stored there is ordered (and raise a warning otherwise, to inform the user that some processes may produce wrong output).
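
A minimal sketch of such a check, assuming "ordered" means that the downsample factor strictly increases from one level to the next. check_level_order and its level_downsamples argument are hypothetical names, not existing WSIMeta API.

```python
import warnings


def check_level_order(level_downsamples):
    """Warn when level downsample factors are not strictly increasing.

    Hypothetical helper for illustration only. Returns True when the
    levels are ordered and False (after emitting a warning) otherwise.
    """
    values = list(level_downsamples)
    ordered = all(a < b for a, b in zip(values, values[1:]))
    if not ordered:
        warnings.warn(
            "Level downsamples are not strictly increasing; "
            "level-based resolution conversions may be wrong.",
            UserWarning,
            stacklevel=2,
        )
    return ordered
```

A caller could run this once when the metadata is created and treat any level-based result as unreliable when it returns False.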

Re:

Also, mask reading should be done as a different mode for coord_space (receiving coordinates at the WSI target resolution, but doing the actual read only at thumbnail resolution without any rescaling)

I agree with you, and it would be good to have such an option in the future. However, for that we need to consider scaling the rect parameters in the read_rect function for VirtualWSI, which needs further input from the user about the thumbnail resolution, keeping the resolutions in the same unit, avoiding extracting the thumbnail several times, etc. So, as you said, that's a problem for the future!

@shaneahmed shaneahmed merged commit b4976c8 into develop Dec 23, 2021
@shaneahmed shaneahmed deleted the enhance-patchextraction branch December 23, 2021 10:18
Labels

enhancement New feature or request

5 participants