Patch Extraction¶
Note
For high volume patch extraction, dplabtools offers dedicated Pool classes.
dplabtools provides a set of patch extraction classes, which integrate with the Patch locations/sampling
classes. Extracted patches can be saved to disk or stored in memory.
Other class features include:
Parallel patch extraction from a single WSI using multi-threading.
Support for multi resolution patches (MRP).
Patch filtering using arbitrary labels.
For patches saved to disk: automated manifest file creation and built-in file count checks.
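The multi-threaded extraction listed above can be pictured with the standard library alone. The following is a minimal sketch, not dplabtools code: `read_region` and `extract_parallel` are hypothetical names, and the thread pool only mirrors the role a worker count plays when several patches are read from one slide at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def read_region(location):
    """Hypothetical stand-in for reading one patch at (x, y) from a WSI."""
    x, y = location
    return {"x": x, "y": y, "pixels": f"patch@{x},{y}"}


def extract_parallel(locations, num_workers=4):
    """Read all locations with a pool of threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map returns results in the order of the input iterable.
        return list(pool.map(read_region, locations))


locations = [(0, 0), (256, 0), (0, 256), (256, 256)]
patches_in_memory = extract_parallel(locations, num_workers=2)
print(len(patches_in_memory))
```

Threads suit this kind of work when the per-patch cost is dominated by file I/O; whether dplabtools schedules its workers this way is not stated here.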
In memory patch extraction¶
MemPatchExtractor is a class designed to perform the extraction of patches which will reside in memory.
Basic usage¶
Assuming that patches represents an object of one of the Patch locations/sampling classes, the following
example will create a stream of in-memory patches and will print them on the screen:
from dplabtools.slides.patches import MemPatchExtractor

extractor = MemPatchExtractor(
    patches=patches,
    num_workers=4,
)

for patch in extractor.patch_stream:
    print(patch)
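Because patch_stream is described as an iterable, it can be consumed lazily, for example in fixed-size batches. The sketch below assumes nothing about dplabtools beyond that: `batched` is a hypothetical helper, and `fake_patch_stream` is a stand-in generator used in place of a real extractor so the example runs on its own.

```python
from itertools import islice


def batched(stream, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for extractor.patch_stream: any iterable works here.
fake_patch_stream = (f"patch_{i}" for i in range(5))

batches = list(batched(fake_patch_stream, batch_size=2))
print(batches)
```

Whether a patch stream can be iterated more than once is not documented here, so a single pass is the safer assumption.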
Class details¶
- class dplabtools.slides.patches.MemPatchExtractor(...)¶
Class for extracting in-memory patches.
- property patch_count¶
Return the number of extracted patches.
- property patch_data¶
Return the patch data used in the patch extraction process.
- property patch_labels¶
Return the distinct patch labels used in the patch extraction process.
- property patch_stream¶
Return a stream of memory images (an iterable object).
In memory patch extraction (MRP)¶
MultiResMemPatchExtractor is a class designed to perform the extraction of multi resolution patches which will
reside in memory.
Basic usage¶
Assuming that patches represents an object of one of the Patch locations/sampling classes, the following example
will create a stream of sets of in-memory patches and will print them on the screen:
from dplabtools.slides.patches import MultiResMemPatchExtractor

extractor = MultiResMemPatchExtractor(
    patches=patches,
    levels_or_mpps=[0, 1],
    num_workers=4,
)

for multires_patch in extractor.patch_stream:
    for patch in multires_patch:
        print(patch)
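When working with sets of patches, it can help to pair each patch with the resolution it was extracted at. The sketch below assumes that the patches in a set follow the order of levels_or_mpps, which is not confirmed here; `label_by_resolution` is a hypothetical helper and `fake_multires_stream` is a stand-in for a real stream.

```python
levels_or_mpps = [0, 1]

# Stand-in for extractor.patch_stream: each item is one set of patches.
fake_multires_stream = [
    ("patch_a_level0", "patch_a_level1"),
    ("patch_b_level0", "patch_b_level1"),
]


def label_by_resolution(multires_patch, resolutions):
    """Map each resolution to its patch; assumes matching order and length."""
    if len(multires_patch) != len(resolutions):
        raise ValueError("patch set size does not match the resolution list")
    return dict(zip(resolutions, multires_patch))


labelled = [label_by_resolution(s, levels_or_mpps) for s in fake_multires_stream]
print(labelled[0])
```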
Class details¶
- class dplabtools.slides.patches.MultiResMemPatchExtractor(...)¶
Class for extracting in-memory multi resolution patches.
- property patch_count¶
Return the number of extracted patches.
- property patch_data¶
Return the patch data used in the patch extraction process.
- property patch_labels¶
Return the distinct patch labels used in the patch extraction process.
- property patch_stream¶
Return a stream of memory images (an iterable object).
Parameters specific to MultiResMemPatchExtractor:
- Parameters:
levels_or_mpps (list of level_or_mpp values) – Integer or float values representing WSI levels or MPP values for multi resolution patches.
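The parameter description suggests, without spelling it out, that integers denote WSI levels and floats denote MPP values. Under that assumption, a small validation helper might look like the sketch below; `describe_resolution` is a hypothetical name and is not part of dplabtools.

```python
def describe_resolution(value):
    """Classify one levels_or_mpps entry (assumed rule: int = level, float = MPP)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("expected int or float, got bool")
    if isinstance(value, int):
        return ("level", value)
    if isinstance(value, float):
        return ("mpp", value)
    raise TypeError(f"expected int or float, got {type(value).__name__}")


print([describe_resolution(v) for v in [0, 1, 0.5]])
```

Under this rule, `1` and `1.0` would mean different things (level 1 versus 1.0 microns per pixel), so it is worth being deliberate about the literal type.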
To disk patch extraction¶
DiskPatchExtractor is a class designed to perform the extraction of patches which will be saved to disk.
Basic usage¶
Assuming that patches represents an object of one of the Patch locations/sampling classes, the following
example will save extracted patches into the /tmp directory:
from dplabtools.slides.patches import DiskPatchExtractor

extractor = DiskPatchExtractor(
    patches=patches,
    output_dir="/tmp",
    image_type="png",
    num_workers=4,
)
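After an extraction run, an independent file count can be compared against the extractor's patch_count property. The sketch below uses only the standard library; `count_images` is a hypothetical helper, the file names are placeholders rather than the naming scheme dplabtools uses, and a one-file-per-patch layout is an assumption.

```python
import tempfile
from pathlib import Path


def count_images(output_dir, image_type="png"):
    """Count files with the given extension anywhere under output_dir."""
    return sum(1 for p in Path(output_dir).rglob(f"*.{image_type}") if p.is_file())


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        (Path(tmp) / f"patch_{i}.png").touch()
    (Path(tmp) / "notes.txt").touch()
    found = count_images(tmp, "png")

print(found)
```

The recursive search also covers label-specific subdirectories. A shared location such as /tmp may hold unrelated images, so a dedicated output directory makes this kind of check more reliable.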
Class details¶
- class dplabtools.slides.patches.DiskPatchExtractor(...)¶
Class for extracting patches to disk.
- property manifest_id¶
Return the current manifest ID.
- property patch_count¶
Return the number of extracted patches.
- property patch_data¶
Return the patch data used in the patch extraction process.
- property patch_labels¶
Return the distinct patch labels used in the patch extraction process.
To disk patch extraction (MRP)¶
MultiResDiskPatchExtractor is a class designed to perform the extraction of multi resolution patches which
will be saved to disk.
Note
Each set of patches will be saved into a dedicated subdirectory.
Basic usage¶
Assuming that patches represents an object of one of the Patch locations/sampling classes, the following
example will save sets of extracted patches into the /tmp directory:
from dplabtools.slides.patches import MultiResDiskPatchExtractor

extractor = MultiResDiskPatchExtractor(
    patches=patches,
    levels_or_mpps=[0, 1],
    output_dir="/tmp",
    image_type="png",
    num_workers=4,
)
Class details¶
- class dplabtools.slides.patches.MultiResDiskPatchExtractor(...)¶
Class for extracting multi resolution patches to disk.
- property manifest_id¶
Return the current manifest ID.
- property patch_count¶
Return the number of extracted patches.
- property patch_data¶
Return the patch data used in the patch extraction process.
- property patch_labels¶
Return the distinct patch labels used in the patch extraction process.
- property patchset_counter¶
Return the last used value for the global patch counter.
Parameters specific to MultiResDiskPatchExtractor:
- Parameters:
levels_or_mpps (list of level_or_mpp values) – Integer or float values representing WSI levels or MPP values for multi resolution patches.
global_counter (int, default=1) – Initial counter value for enumerating patch set directories (set1, set2, set3, …) for an entire collection of WSIs. Setting this value to None will cause the patch set counter to be reset for each WSI (wsi1_set1, wsi1_set2, … wsi2_set1, wsi2_set2, …).
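The two numbering schemes described for global_counter can be illustrated with a short sketch. It reproduces only the directory name patterns quoted above; `patch_set_dirs` is a hypothetical function, and the literal `wsi1`, `wsi2` prefixes are taken from the description rather than from real slide names.

```python
def patch_set_dirs(sets_per_wsi, global_counter=1):
    """Return directory names for patch sets across several WSIs.

    sets_per_wsi: number of patch sets for each WSI, in processing order.
    global_counter: starting value of a shared counter, or None to restart
    numbering at 1 for every WSI.
    """
    names = []
    counter = global_counter
    for wsi_index, n_sets in enumerate(sets_per_wsi, start=1):
        for set_index in range(1, n_sets + 1):
            if global_counter is None:
                # Per-WSI numbering: the counter restarts for each slide.
                names.append(f"wsi{wsi_index}_set{set_index}")
            else:
                # Global numbering: one counter spans the whole collection.
                names.append(f"set{counter}")
                counter += 1
    return names


print(patch_set_dirs([2, 2]))
print(patch_set_dirs([2, 2], global_counter=None))
```

Passing a starting value other than 1 continues the global numbering from that value, which is presumably how a later run would pick up from the last value reported by patchset_counter.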
Parameters common to disk patch extraction classes¶
- Parameters:
output_dir (str) – Directory name or path for saving the extracted patches.
image_type (str) – Image type of the saved files (PNG, JPG, etc.).
filename_comment (str, optional) – Comment to be added to the saved file names.
filename_separator (str, default="_") – Separator used in the saved file names.
create_subdirs (bool, default=False) – Whether to create label-specific subdirectories inside output_dir.
pool_mode (bool, default=False) – Internal flag for integration with the pool classes, not to be set by the user.
Parameters common to all patch extraction classes¶
- Parameters:
patches (object) – Object representing one of the patch location classes.
num_workers (int) – Number of thread workers in parallel processing.
mp_chunksize (int, default=1) – Data chunk size used in parallel processing.
resampling_mode (str, optional) – One of two supported down/up-sampling methods: wsi or tile.
included_labels (list of str, optional) – Polygon labels included in patch extraction; all other labels will be ignored.
excluded_labels (list of str, optional) – Polygon labels excluded from patch extraction.
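The effect of the included_labels and excluded_labels parameters can be sketched as a simple predicate. This is an illustration of the filtering idea only: `keep_patch` is a hypothetical function, the label names are made up, and how dplabtools behaves when both lists are given is not documented here.

```python
def keep_patch(label, included_labels=None, excluded_labels=None):
    """Decide whether a patch with this label passes the filters."""
    if included_labels is not None and label not in included_labels:
        return False
    if excluded_labels is not None and label in excluded_labels:
        return False
    return True


labels = ["tumor", "stroma", "necrosis", "tumor"]
kept_included = [l for l in labels if keep_patch(l, included_labels=["tumor"])]
kept_excluded = [l for l in labels if keep_patch(l, excluded_labels=["necrosis"])]
print(kept_included)
print(kept_excluded)
```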