Annotations

dplabtools provides classes for reading annotations created with Sedeen, ASAP, and QuPath. The annotations are converted to a polygon-like format that integrates with other classes in the package. They can also be saved as standardized JSON files for reuse or further processing.

Important notes:

  • All non-polygon annotation objects (arrows, rulers, points, pure text labels, etc.) will be ignored during the annotation reading process.

  • Since annotation labels can be expressed by different entities (text, color, object property, etc.), the annotation reader classes accept a helper function called get_label_fn. This function lets the user decide what constitutes a label.
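As a minimal sketch of this idea, the helper below maps annotation colors to class names. It assumes that the reader passes annotation properties as keyword arguments and that one of them is color, as in the usage examples in this document. The color format (hex strings) and the COLOR_TO_LABEL mapping are illustrative assumptions, not part of dplabtools.

```python
# Hypothetical mapping from annotation color to class name; the actual
# color representation depends on the annotation tool and file format.
COLOR_TO_LABEL = {
    "#ff0000": "tumor",
    "#00ff00": "stroma",
}


def get_label(**kwargs):
    """Return a class name for an annotation based on its color."""
    color = str(kwargs.get("color", "")).lower()
    # Fall back to a generic label for colors not listed in the mapping.
    return COLOR_TO_LABEL.get(color, "unknown")
```

A function like this could then be passed as get_label_fn=get_label when creating a reader.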

Sedeen annotations

class dplabtools.slides.annotations.SedeenReader(...)

Class for reading Sedeen annotation files.

Basic usage

from dplabtools.slides.annotations import SedeenReader

xml_file = "/tmp/sedeen1.xml"

def get_label(**kwargs):
    return kwargs["color"]

reader = SedeenReader(data_file=xml_file, get_label_fn=get_label)
reader.save_json("sedeen1.json")

Property reader.polygons can be used for in-memory annotation processing.

Class details

SedeenReader inherits all of its parameters, methods, and properties from the base reader class.

ASAP annotations

class dplabtools.slides.annotations.AsapReader(...)

Class for reading ASAP annotation files.

Basic usage

from dplabtools.slides.annotations import AsapReader

xml_file = "/tmp/asap1.xml"

def get_label(**kwargs):
    return kwargs["color"]

reader = AsapReader(data_file=xml_file, get_label_fn=get_label)
reader.save_json("asap1.json")

Property reader.polygons can be used for in-memory annotation processing.

Class details

AsapReader inherits all of its parameters, methods, and properties from the base reader class.

Parameters, methods and properties of the base reader class

class dplabtools.slides.annotations.readers.base.BaseReader(...)

Create a base object for derived annotation reader classes.

Parameters:
  • data_file (str) – Raw annotation data file’s name or path.

  • get_label_fn (function) – Function returning a value to be used as the label.

save_json(json_file)

Save the annotations as a JSON file with serialized AnnotationPolygon objects.

Parameters:

json_file (str) – JSON file name or path.
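The schema of the saved file is not described here, so the following sketch only parses the file with Python's standard json module and makes no assumption about its structure. The helper name load_annotation_json is hypothetical and not part of dplabtools.

```python
import json
from pathlib import Path


def load_annotation_json(json_file):
    """Parse a saved annotation file and return the decoded JSON data."""
    with Path(json_file).open(encoding="utf-8") as handle:
        return json.load(handle)


# Possible usage after reader.save_json("sedeen1.json"):
# data = load_annotation_json("sedeen1.json")
# print(type(data))
```

Inspecting the decoded data this way can help when the JSON files are consumed by tools outside the package.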

property polygons

Return the annotations as a list of AnnotationPolygon objects.
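For in-memory processing, a small sketch such as the one below could summarize the returned list. The attributes of AnnotationPolygon are not documented in this section, so the attribute name label is an assumption, and count_by_label is a hypothetical helper rather than a dplabtools function.

```python
from collections import Counter


def count_by_label(polygons, label_attr="label"):
    """Count annotation polygons per label.

    `label_attr` is an assumed attribute name, not confirmed by this
    documentation; adjust it to match AnnotationPolygon.
    """
    # Objects without the attribute are counted under the key None.
    return Counter(getattr(polygon, label_attr, None) for polygon in polygons)


# Possible usage with a reader instance:
# print(count_by_label(reader.polygons))
```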

QuPath annotations

Annotations saved by QuPath are not stored in an easily accessible format, so the whole QuPath project must be read to extract them. This task also requires a working installation of QuPath.

QuPathProjectReader is a class dedicated to processing saved QuPath projects and extracting WSI annotations in bulk.

Basic usage

from dplabtools.slides.annotations import QuPathProjectReader

reader = QuPathProjectReader(qupath_install_dir="/opt/QuPath/", qupath_project_dir="/data/project1/")
reader.save_json("/tmp")

Output: the annotations for all WSIs present in the QuPath project will be saved as individual JSON files:

file1.svs.json
file2.svs.json
file3.svs.json

It is also possible to process them directly in memory by using the reader.project_data property.

Class details

class dplabtools.slides.annotations.QuPathProjectReader(...)

Create an object for reading QuPath annotations.

Parameters:
  • qupath_install_dir (str) – Directory where QuPath is installed.

  • qupath_project_dir (str) – Directory with a saved QuPath project.

save_json(save_dir)

Save the extracted annotations for all images in the project as JSON files.

Parameters:

save_dir (str) – Directory for saving JSON files.
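Since one JSON file is written per image, it can be convenient to collect the saved files afterwards. The sketch below uses only the Python standard library; list_annotation_files is a hypothetical helper, and it assumes that the output files carry a .json extension, as in the example output shown earlier.

```python
from pathlib import Path


def list_annotation_files(save_dir):
    """Return the JSON files found in `save_dir`, sorted by name."""
    return sorted(Path(save_dir).glob("*.json"))


# Possible usage after reader.save_json("/tmp"):
# for path in list_annotation_files("/tmp"):
#     print(path.name)
```

Note that this picks up every JSON file in the directory, so a dedicated output directory avoids mixing in unrelated files.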

property project_data

Return the project data as a list of tuples (file_name, list of AnnotationPolygon objects).
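Given this tuple structure, a quick overview of a project could be sketched as follows. The helper annotation_counts is hypothetical and relies only on the (file_name, list) layout described above.

```python
def annotation_counts(project_data):
    """Map each file name to the number of annotations extracted for it."""
    return {file_name: len(polygons) for file_name, polygons in project_data}


# Possible usage with a QuPathProjectReader instance:
# for file_name, count in annotation_counts(reader.project_data).items():
#     print(file_name, count)
```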

Integration with patch locations classes

For the SedeenReader and AsapReader classes, extracted annotations can be passed to the classes calculating patches on polygon regions, either as a saved JSON file:

from dplabtools.slides.annotations import SedeenReader
from dplabtools.slides.patches import PolygonRegionGridPatches

wsi_file = "/tmp/wsi1.svs"
mask_file = "/tmp/wsi1_mask.png"
xml_file = "/tmp/sedeen1.xml"
json_file = "sedeen1.json"

def get_label(**kwargs):
    return kwargs["color"]

reader = SedeenReader(data_file=xml_file, get_label_fn=get_label)
reader.save_json(json_file)

grid_patches = PolygonRegionGridPatches(
    wsi_file=wsi_file,
    mask_data=mask_file,
    patch_size=500,
    patch_stride=1,
    polygon_data=json_file,
)

or as an in-memory object via the polygons property:

from dplabtools.slides.annotations import SedeenReader
from dplabtools.slides.patches import PolygonRegionGridPatches

wsi_file = "/tmp/wsi1.svs"
mask_file = "/tmp/wsi1_mask.png"
xml_file = "/tmp/sedeen1.xml"

def get_label(**kwargs):
    return kwargs["color"]

reader = SedeenReader(data_file=xml_file, get_label_fn=get_label)

grid_patches = PolygonRegionGridPatches(
    wsi_file=wsi_file,
    mask_data=mask_file,
    patch_size=500,
    patch_stride=1,
    polygon_data=reader.polygons,
)

For the QuPathProjectReader class, the property holding all annotations in memory (reader.project_data) covers every WSI in the project, so the annotations of a single WSI must be selected from it before they can be passed to the polygon_data argument.
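One way to do this selection is sketched below. It assumes only that project_data is a list of (file_name, list of AnnotationPolygon objects) tuples, as documented above. The helper polygons_for_wsi is hypothetical, and the exact form of the file names stored in the tuples (for example "file1.svs") is an assumption that should be checked against real project data.

```python
def polygons_for_wsi(project_data, file_name):
    """Return the annotation list stored for `file_name`."""
    for name, polygons in project_data:
        if name == file_name:
            return polygons
    raise KeyError(f"No annotations found for {file_name!r}")


# Possible usage (file name format is an assumption):
# polygons = polygons_for_wsi(reader.project_data, "file1.svs")
# grid_patches = PolygonRegionGridPatches(
#     wsi_file=wsi_file,
#     mask_data=mask_file,
#     patch_size=500,
#     patch_stride=1,
#     polygon_data=polygons,
# )
```

Raising an error for a missing file name makes a mismatch between the WSI and the project visible early, instead of silently producing no patches.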