Patch extraction pool classes¶
dplabtools includes a set of Pool classes providing a parallel execution interface for classes
described in Patch Extraction. Using pool classes for extracting patches allows for the processing of
multiple WSIs at the same time and combining the results either in memory or by saving to disk. Patch extraction
pool classes have been designed to accept Patch locations/sampling pool classes as their input; however, they can also
extract patches calculated by other sources.
Warning
Currently MemPatchExtractorPool and MultiResMemPatchExtractorPool suffer from a performance degradation
related to the default inter-process communication present in Python. While both classes deliver correct results,
their use in large scale experiments is not recommended. This part of the package is currently pending a rewrite using a
shared memory model.
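As a rough illustration of the overhead the warning refers to: with Python's standard multiprocessing, results returned from a worker process are pickled, sent through a pipe, and unpickled in the parent, so every in-memory patch is copied across the process boundary. The sketch below uses only the standard library; `fake_patch` and `serialized_size` are hypothetical helpers standing in for extracted pixel data, and nothing here measures dplabtools itself.

```python
import pickle


def fake_patch(width=256, height=256, channels=3):
    """Return zeroed bytes standing in for one RGB patch (hypothetical helper)."""
    return bytes(width * height * channels)


def serialized_size(patches):
    """Number of bytes that would be pickled to send these patches to another process."""
    return len(pickle.dumps(patches, protocol=pickle.HIGHEST_PROTOCOL))


if __name__ == "__main__":
    patches = [fake_patch() for _ in range(100)]
    raw = sum(len(p) for p in patches)
    # The pickled payload is at least as large as the raw pixel data.
    print(f"raw pixel data: {raw} bytes, pickled payload: {serialized_size(patches)} bytes")
```

The payload grows linearly with the number of patches, which is consistent with the recommendation to avoid the in-memory pool classes for large-scale experiments.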
In memory patch extraction pool¶
MemPatchExtractorPool is a class for parallel extraction of patches which will reside in memory.
Basic usage¶
Assuming that patches_pool represents an object from one of the Patch locations/sampling pool classes, the following
example will create a list of in-memory patches from multiple WSIs and will print them on the screen:
from dplabtools.slides.patches import MemPatchExtractorPool
extractor_pool = MemPatchExtractorPool(
    patches_pool=patches_pool,
    thread_num_workers=2,
    proc_num_workers=3,
)
for patch in extractor_pool.patch_list:
    print(patch)
Class details¶
In memory patch extraction pool (MRP)¶
MultiResMemPatchExtractorPool is a class for parallel extraction of multi resolution patches which will reside in memory.
Basic usage¶
Assuming that patches_pool represents an object from one of the Patch locations/sampling pool classes, the following
example will create a list of in-memory multi resolution patches from multiple WSIs and will print them on the screen:
from dplabtools.slides.patches import MultiResMemPatchExtractorPool
extractor_pool = MultiResMemPatchExtractorPool(
    patches_pool=patches_pool,
    levels_or_mpps=[2, 1, 0],
    thread_num_workers=2,
    proc_num_workers=3,
)
for multires_patch in extractor_pool.patch_list:
    for patch in multires_patch:
        print(patch)
Class details¶
- class dplabtools.slides.patches.MultiResMemPatchExtractorPool(...)¶
Extractor pool implementation for MultiResMemPatchExtractor.
- property patch_count¶
Return the number of extracted patches.
- property patch_list¶
Return the extracted patches stored in memory.
- property patchset_count¶
Return the number of patch sets created during the extraction.
- property pids¶
Return the IDs of the executed processes.
Parameters specific to MultiResMemPatchExtractorPool:
- Parameters:
levels_or_mpps (list of level_or_mpp values) – Int or Float numbers representing WSI levels or MPP values for multi resolution patches.
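One plausible reading of this description is that integer entries are treated as WSI levels and float entries as MPP values. The sketch below illustrates that reading only: `describe_levels_or_mpps` is a hypothetical helper, and the type-based rule is an assumption rather than confirmed dplabtools behaviour.

```python
def describe_levels_or_mpps(levels_or_mpps):
    """Label each entry as a WSI level (int) or an MPP value (float)."""
    described = []
    for value in levels_or_mpps:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool):
            raise TypeError(f"unsupported value: {value!r}")
        if isinstance(value, int):
            described.append(("level", value))
        elif isinstance(value, float):
            described.append(("mpp", value))
        else:
            raise TypeError(f"unsupported value: {value!r}")
    return described


print(describe_levels_or_mpps([2, 1, 0]))
print(describe_levels_or_mpps([0.25, 0.5]))
```

Under this assumption, `[2, 1, 0]` from the basic usage example would request three WSI levels per patch set.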
To disk patch extraction pool¶
DiskPatchExtractorPool is a class for parallel extraction of patches which will be saved to disk.
Basic usage¶
Assuming that patches_pool represents an object from one of the Patch locations/sampling pool classes, the following
example will save extracted patches from multiple WSIs into the /tmp directory:
from dplabtools.slides.patches import DiskPatchExtractorPool
extractor_pool = DiskPatchExtractorPool(
    patches_pool=patches_pool,
    output_dir="/tmp",
    image_type="png",
    thread_num_workers=2,
    proc_num_workers=3,
)
Class details¶
To disk patch extraction pool (MRP)¶
MultiResDiskPatchExtractorPool is a class for parallel extraction of multi resolution patches which will be saved to disk.
Basic usage¶
Assuming that patches_pool represents an object from one of the Patch locations/sampling pool classes, the following
example will save sets of extracted patches from multiple WSIs into the /tmp directory:
from dplabtools.slides.patches import MultiResDiskPatchExtractorPool
extractor_pool = MultiResDiskPatchExtractorPool(
    patches_pool=patches_pool,
    levels_or_mpps=[0, 1],
    output_dir="/tmp",
    image_type="png",
    thread_num_workers=2,
    proc_num_workers=3,
)
Class details¶
- class dplabtools.slides.patches.MultiResDiskPatchExtractorPool(...)¶
Extractor pool implementation for MultiResDiskPatchExtractor.
- property manifest_ids¶
Return the IDs of the created manifests.
- property patch_count¶
Return the number of extracted patches.
- property patchset_count¶
Return the number of patch sets created during the extraction.
- property pids¶
Return the IDs of the executed processes.
Parameters specific to MultiResDiskPatchExtractorPool:
- Parameters:
levels_or_mpps (list of level_or_mpp values) – Int or Float numbers representing WSI levels or MPP values for multi resolution patches.
global_counter (int, default=1) – Initial counter value for enumerating patch set directories (set1, set2, set3, …) for an entire collection of WSIs. Setting this value to None will cause the patch set counter to be reset for each WSI (wsi1_set1, wsi1_set2, … wsi2_set1, wsi2_set2, …).
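The two numbering modes of global_counter can be sketched as follows. This is an illustration of the naming pattern described above, not the package's implementation: `patch_set_dir_names` is a hypothetical helper, and the `wsi1`/`wsi2` prefixes are taken literally from the description (the real prefixes may be derived differently).

```python
def patch_set_dir_names(sets_per_wsi, global_counter=1):
    """Return patch set directory names for a collection of WSIs.

    sets_per_wsi maps a WSI name to the number of patch sets extracted from it.
    """
    names = []
    if global_counter is None:
        # The counter restarts at 1 for every WSI; the WSI name is the prefix.
        for wsi_name, count in sets_per_wsi.items():
            names.extend(f"{wsi_name}_set{i}" for i in range(1, count + 1))
    else:
        # A single counter runs across the whole collection.
        counter = global_counter
        for count in sets_per_wsi.values():
            names.extend(f"set{i}" for i in range(counter, counter + count))
            counter += count
    return names


print(patch_set_dir_names({"wsi1": 2, "wsi2": 2}))
# ['set1', 'set2', 'set3', 'set4']
print(patch_set_dir_names({"wsi1": 2, "wsi2": 2}, global_counter=None))
# ['wsi1_set1', 'wsi1_set2', 'wsi2_set1', 'wsi2_set2']
```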
Parameters common to disk patch extraction pool classes¶
- Parameters:
output_dir (str) – Directory name or path for saving the extracted patches.
image_type (str) – Image type of the saved files (PNG, JPG, etc.).
filename_comment (str, optional) – Comment to be added to the saved file names.
filename_separator (str, default="_") – Separator used in the saved file names.
create_subdirs (bool, default=False) – Whether to create label-specific subdirectories inside output_dir.
Parameters common to all patch extraction pool classes¶
- Parameters:
patches_pool (object) – Object representing one of the patch location pool classes.
proc_num_workers (int) – Number of processes in the pool. This value corresponds directly to the number of WSIs to be processed simultaneously.
thread_num_workers (int) – Number of threads per worker process. This value indicates how many threads will be used to extract patches from a single WSI.
proc_mp_chunksize (int, default=1) – Data chunk size used in process-level parallelization.
thread_mp_chunksize (int, default=1) – Data chunk size used in thread-level parallelization.
resampling_mode (str, optional) – One of two supported down/up-sampling methods: wsi or tile.
included_labels (list of str, optional) – Polygon labels included in patch extraction, all other labels will be ignored.
excluded_labels (list of str, optional) – Polygon labels excluded from patch extraction.
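The relationship between proc_num_workers and thread_num_workers described above (one worker process per WSI, several threads inside each process) can be sketched with the standard library. This is a minimal illustration of the two-level scheme under stated assumptions, not the dplabtools implementation: `extract_patch`, `process_wsi`, `extract_all` and the `executor_cls` argument are hypothetical names, and patch reading is replaced by a string stand-in.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def extract_patch(location):
    """Stand-in for reading one patch; returns a label for the location."""
    wsi_name, x, y = location
    return f"{wsi_name}@({x},{y})"


def process_wsi(task):
    """Runs inside one worker process: extract all patches of one WSI with threads."""
    locations, thread_num_workers = task
    with ThreadPoolExecutor(max_workers=thread_num_workers) as threads:
        return list(threads.map(extract_patch, locations))


def extract_all(locations_per_wsi, proc_num_workers, thread_num_workers,
                executor_cls=ProcessPoolExecutor):
    """Process up to proc_num_workers WSIs at once, one WSI per task."""
    tasks = [(locations, thread_num_workers) for locations in locations_per_wsi]
    with executor_cls(max_workers=proc_num_workers) as procs:
        per_wsi = procs.map(process_wsi, tasks)
        # Flatten the per-WSI results while the pool is still open.
        return [patch for patches in per_wsi for patch in patches]


if __name__ == "__main__":
    wsis = [[(f"wsi{i}", x, 0) for x in range(3)] for i in range(1, 4)]
    print(extract_all(wsis, proc_num_workers=3, thread_num_workers=2))
```

With the values used in the basic usage examples (proc_num_workers=3, thread_num_workers=2), up to three WSIs are handled concurrently and up to six extraction threads are active in total.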