Data Model
-
class src.data_model.data_model.LeafSequence(folder_path=None, filename_pattern=None, file_list=None, creation_mode=False)
A sequence of full-size Leaf images.
-
extract_changed_leaves(output_path, dif_len=1, overwrite=False, shift_256=False, combination_function=<function 'subtract_modulo'>)
Extracts and saves changed leaf images. This uses the file path list created when the leaf sequence is instantiated.
- Parameters
output_path (str) – where the differenced leaves should be saved
dif_len (int) – the step size between the leaves to be differenced
overwrite (bool) – whether images that exist at the same file path should be overwritten
shift_256 (bool) – whether images should be shifted by 256; this also means that images will be saved as uint16
combination_function – the combination function to be used; the default is to difference leaves
- Return type
None
- Returns
None
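The role of dif_len and the combination function can be illustrated with a minimal sketch. It is not the library's implementation: the body of subtract_modulo below is an assumption based only on its name (pixel-wise subtraction modulo 256), the subtraction order is a guess, difference_pairs is a hypothetical helper, and nested lists stand in for image arrays so the snippet runs without third-party packages.

```python
from typing import List, Tuple

Image = List[List[int]]  # stand-in for a 2-D uint8 image array


def subtract_modulo(later: Image, earlier: Image, modulus: int = 256) -> Image:
    """Assumed behaviour: pixel-wise (later - earlier) modulo `modulus`."""
    return [
        [(a - b) % modulus for a, b in zip(row_later, row_earlier)]
        for row_later, row_earlier in zip(later, earlier)
    ]


def difference_pairs(n_images: int, dif_len: int = 1) -> List[Tuple[int, int]]:
    """Hypothetical helper: index pairs (earlier, later) that are dif_len apart."""
    return [(i, i + dif_len) for i in range(n_images - dif_len)]


# Three tiny 2x2 "leaf images" taken at successive time steps.
frames = [
    [[10, 10], [10, 10]],
    [[10, 12], [10, 10]],
    [[10, 12], [9, 10]],
]

for earlier, later in difference_pairs(len(frames), dif_len=1):
    changed = subtract_modulo(frames[later], frames[earlier])
    print(f"frame {later} - frame {earlier}: {changed}")
```

With dif_len=2 the same three frames would yield a single pair, (0, 2). Note that a modulo difference wraps negative changes around to large values, which may be one reason a shift option such as shift_256 exists.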
-
load_extracted_images(load_image=False, disable_pb=False, shift_256=False, transform_uint8=False)
Instantiates Leaf objects using the file_list attribute and appends these objects to the image_objects attribute.
- Parameters
load_image (bool) – whether to load the image array belonging to the Leaf being created
disable_pb (bool) – whether the progress bar should be disabled
shift_256 (bool) – whether images should be shifted by 256; applies if load_image is True
transform_uint8 (bool) – whether images should be transformed to a uint8 format; applies if load_image is True
- Return type
None
- Returns
None
-
load_image_array(disable_pb=False, shift_256=False, transform_uint8=False)
Loads all image arrays belonging to the Leaf objects in the sequence.
- Parameters
disable_pb (bool) – whether the progress bar should be disabled
shift_256 (bool) – whether images should be shifted by 256
transform_uint8 (bool) – whether images should be transformed to a uint8 format
- Return type
None
- Returns
None
-
load_tile_sequence(load_image=False, folder_path=None, filename_pattern=None, shift_256=False, transform_uint8=False)
Loads all tile objects belonging to the Leaf objects in the sequence.
- Parameters
load_image (bool) – whether the tile arrays should also be loaded
folder_path (Optional[str]) – the folder path of the tiles
filename_pattern (Optional[str]) – the filename pattern of the tiles
shift_256 (bool) – whether images should be shifted by 256; applies if load_image is True
transform_uint8 (bool) – whether images should be transformed to a uint8 format; applies if load_image is True
- Return type
None
- Returns
None
-
predict_leaf_sequence(model, x_tile_length=None, y_tile_length=None, memory_saving=True, overwrite=False, save_prediction=True, shift_256=False, transform_uint8=False, threshold=0.5, **kwargs)
Predicts segmentation maps using the Leaves in the sequence. The model used should implement a predict tile method. If memory_saving is set to False, a prediction array is assigned to each Leaf object in the sequence.
- Parameters
model (Model) – a model which inherits from Model and hence implements a predict tile method
x_tile_length (Optional[int]) – the x length of the tile used in the original training
y_tile_length (Optional[int]) – the y length of the tile used in the original training
memory_saving (bool) – if set to True, both the image array and prediction array are set to None; this should only be set to True if the predictions are being saved
overwrite (bool) – whether images that exist at the same file path should be overwritten
save_prediction (bool) – whether the prediction should be saved
shift_256 (bool) – whether images should be shifted by 256
transform_uint8 (bool) – whether images should be transformed to a uint8 format
threshold (float) – the threshold to use when saving predictions; i.e. a pixel is saved as an embolism if p(embolism) > threshold
kwargs – kwargs for the predict tile function
- Return type
None
- Returns
None
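The threshold rule described above (a pixel is saved as an embolism if p(embolism) > threshold) can be sketched as follows. The function name threshold_prediction is hypothetical and nested lists stand in for the prediction array; only the strict ">" comparison comes from the parameter description.

```python
from typing import List


def threshold_prediction(probabilities: List[List[float]],
                         threshold: float = 0.5) -> List[List[int]]:
    """Mark a pixel as embolism (1) only when its probability exceeds the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in probabilities]


# A 2x2 map of per-pixel embolism probabilities.
probability_map = [[0.50, 0.51],
                   [0.90, 0.10]]
print(threshold_prediction(probability_map))
```

Because the comparison is strict, a pixel whose probability equals the threshold exactly is not marked as an embolism.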
-
get_databunch_dataframe(embolism_only=False, csv_name=None)
Extracts a databunch DataFrame using the images in this sequence. The first field is the leaf path and the second field is the mask name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,str]
- Returns
DataBunch DataFrame and sequence root folder path
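The two-field layout described above (leaf path first, mask name second, optionally filtered to leaves with embolisms) can be sketched with the standard library. This is an illustration only: the real method returns a pandas DataFrame, whereas this sketch uses plain tuples and the csv module, and the helper names, column names, and the has_embolism flags are all assumptions.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import List, Sequence, Tuple


def build_databunch_rows(leaf_paths: Sequence[str],
                         mask_paths: Sequence[str],
                         has_embolism: Sequence[bool],
                         embolism_only: bool = False) -> List[Tuple[str, str]]:
    """Pair each leaf path with its mask file name, optionally keeping only embolism cases."""
    rows = []
    for leaf_path, mask_path, flagged in zip(leaf_paths, mask_paths, has_embolism):
        if embolism_only and not flagged:
            continue
        rows.append((leaf_path, PurePosixPath(mask_path).name))
    return rows


def rows_to_csv(rows: Sequence[Tuple[str, str]]) -> str:
    """Serialise the rows as CSV text with hypothetical column names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["leaf_path", "mask_name"])
    writer.writerows(rows)
    return buffer.getvalue()


leaves = ["leaves/leaf_1.png", "leaves/leaf_2.png"]
masks = ["masks/mask_1.png", "masks/mask_2.png"]
flags = [False, True]  # assumed per-image "contains an embolism" flags

rows = build_databunch_rows(leaves, masks, flags, embolism_only=True)
print(rows_to_csv(rows))
```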
-
get_tile_databunch_df(mseq, tile_embolism_only=False, leaf_embolism_only=False, csv_name=None)
Extracts a combined databunch DataFrame using all tiles belonging to the Image objects in the sequence. The first field is the leaf tile path and the second field is the mask tile name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
mseq – a MaskSequence object
tile_embolism_only (bool) – whether only tiles with embolisms should be used
leaf_embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,List[str]]
- Returns
combined DataBunch DataFrame and list of image root folder paths
-
-
class src.data_model.data_model.MaskSequence(mpf_path=None, folder_path=None, filename_pattern=None, file_list=None, creation_mode=False)
A sequence of full-size Mask images.
-
extract_mask_from_multipage(output_path, overwrite=False, binarise=False)
Extracts and saves mask images from a multipage file.
- Parameters
output_path (str) – where the masks should be saved
overwrite (bool) – whether images that exist at the same file path should be overwritten
binarise (bool) – whether the masks should be binarised; i.e. 0 indicates no embolism and 1 indicates embolism
- Return type
None
- Returns
None
-
load_extracted_images(load_image=False, disable_pb=False)
Instantiates Mask objects using the file_list attribute and appends these objects to the image_objects attribute.
- Parameters
load_image (bool) – whether to load the image array belonging to the Mask being created
disable_pb (bool) – whether the progress bar should be disabled
- Return type
None
- Returns
None
-
load_image_array(disable_pb=False)
Loads all image arrays belonging to the Mask objects in the sequence.
- Parameters
disable_pb (bool) – whether the progress bar should be disabled
- Return type
None
- Returns
None
-
load_tile_sequence(load_image=False, folder_path=None, filename_pattern=None)
Loads all tile objects belonging to the Mask objects in the sequence.
- Parameters
load_image (bool) – whether the tile arrays should also be loaded
folder_path (Optional[str]) – the folder path of the tiles
filename_pattern (Optional[str]) – the filename pattern of the tiles
- Return type
None
- Returns
None
-
get_databunch_dataframe(embolism_only=False, csv_name=None)
Extracts a databunch DataFrame using the images in this sequence. The first field is the leaf path and the second field is the mask name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,str]
- Returns
DataBunch DataFrame and sequence root folder path
-
get_tile_databunch_df(lseq, tile_embolism_only=False, leaf_embolism_only=False, csv_name=None)
Extracts a combined databunch DataFrame using all tiles belonging to the Image objects in the sequence. The first field is the leaf tile path and the second field is the mask tile name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
lseq – a LeafSequence object
tile_embolism_only (bool) – whether only tiles with embolisms should be used
leaf_embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,List[str]]
- Returns
combined DataBunch DataFrame and list of image root folder paths
-
-
class src.data_model.data_model.Leaf(path=None, sequence_parent=None, parents=None, folder_path=None, filename_pattern=None, file_list=None)
A full Leaf image.
-
extract_me(filepath, combination_function=<function 'subtract_modulo'>, shift_256=False, overwrite=False)
Extracts and saves a changed leaf image. The extracted image and file path are stored in the image_array and path attributes respectively.
- Parameters
filepath – the file path at which to save the extracted image
combination_function – the combination function to apply to the image's parents
shift_256 – whether the extracted image should be shifted by 256
overwrite (bool) – whether an image that exists at the same file path should be overwritten
- Return type
None
- Returns
None
-
load_extracted_images(load_image=False, disable_pb=False, shift_256=False, transform_uint8=False)
Loads LeafTiles belonging to the Leaf.
- Parameters
load_image (bool) – whether to load the image array belonging to the LeafTile being created
disable_pb (bool) – whether the progress bar should be disabled
shift_256 (bool) – whether images should be shifted by 256; applies if load_image is True
transform_uint8 (bool) – whether images should be transformed to a uint8 format; applies if load_image is True
- Return type
None
- Returns
None
-
tile_me(length_x, stride_x, length_y, stride_y, output_path=None, overwrite=False)
Tiles an image and creates LeafTile objects. These are appended to the image_object attribute.
- Parameters
length_x (int) – the x-length of the tile
stride_x (int) – the size of the x stride
length_y (int) – the y-length of the tile
stride_y (int) – the size of the y stride
output_path (Optional[str]) – output path where the tiles should be saved; if no path is provided, tiles are saved in a default location
overwrite (bool) – whether tiles that exist at the same file path should be overwritten
- Return type
None
- Returns
None
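How a tile length and a stride interact can be shown with a small sketch that computes tile bounding boxes. The helpers tile_origins and tile_boxes are hypothetical, and the edge handling (dropping any partial tile that would extend past the image border) is an assumption; the documentation above does not say how tile_me treats image edges.

```python
from typing import List, Tuple


def tile_origins(size: int, length: int, stride: int) -> List[int]:
    """Start offsets along one axis such that every tile fits inside the image."""
    return list(range(0, size - length + 1, stride))


def tile_boxes(width: int, height: int,
               length_x: int, stride_x: int,
               length_y: int, stride_y: int) -> List[Tuple[int, int, int, int]]:
    """Bounding boxes (left, top, right, bottom) in row-major order."""
    return [
        (x, y, x + length_x, y + length_y)
        for y in tile_origins(height, length_y, stride_y)
        for x in tile_origins(width, length_x, stride_x)
    ]


# A 6x4 image cut into 4x4 tiles with a stride of 2 in both directions.
for box in tile_boxes(width=6, height=4, length_x=4, stride_x=2,
                      length_y=4, stride_y=2):
    print(box)
```

A stride smaller than the tile length produces overlapping tiles, as in this example; a stride equal to the tile length produces a non-overlapping grid.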
-
predict_leaf(model, x_tile_length=None, y_tile_length=None, memory_saving=True, overwrite=False, save_prediction=True, shift_256=False, transform_uint8=False, threshold=0.5, **kwargs)
Predicts segmentation maps using the Leaf object's image_array. The model used should implement a predict tile method. If memory_saving is set to False, a prediction array is assigned to the Leaf object.
- Parameters
model – a model which inherits from Model and hence implements a predict tile method
x_tile_length (Optional[int]) – the x length of the tile used in the original training
y_tile_length (Optional[int]) – the y length of the tile used in the original training
memory_saving (bool) – if set to True, both the image array and prediction array are set to None; this should only be set to True if the predictions are being saved
overwrite (bool) – whether images that exist at the same file path should be overwritten
save_prediction (bool) – whether the prediction should be saved
shift_256 (bool) – whether images should be shifted by 256
transform_uint8 (bool) – whether images should be transformed to a uint8 format
threshold (float) – the threshold to use when saving predictions; i.e. a pixel is saved as an embolism if p(embolism) > threshold
kwargs – kwargs for the predict tile function
- Return type
None
- Returns
None
-
get_databunch_dataframe(embolism_only=False, csv_name=None)
Extracts a databunch DataFrame using the tiles in this Leaf. The first field is the leaf tile path and the second field is the mask tile name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,str]
- Returns
DataBunch DataFrame and sequence root folder path
-
-
class src.data_model.data_model.Mask(path=None, sequence_parent=None, folder_path=None, filename_pattern=None, file_list=None)
A full Mask image.
-
create_mask(filepath, image, overwrite=False, binarise=False)
Saves the PIL image at the provided file path. The image and file path are stored in the image_array and path attributes respectively.
- Parameters
filepath (Union[Path,str]) – the file path at which to save the extracted image (as a Path or string)
image – the mask image (as a PIL image)
overwrite (bool) – whether an image that exists at the same file path should be overwritten
binarise (bool) – whether the mask should be binarised; this assumes that embolisms are indicated by a pixel intensity of 255
- Return type
None
- Returns
None
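The binarise option can be pictured with the following sketch, which maps a pixel intensity of 255 to 1 and everything else to 0. The function binarise_mask is hypothetical and works on nested lists rather than a PIL image; in particular, how the library treats intermediate intensities (neither 0 nor 255) is not documented, so mapping them to 0 here is an assumption.

```python
from typing import List


def binarise_mask(mask: List[List[int]], embolism_value: int = 255) -> List[List[int]]:
    """Return 1 where the pixel equals the embolism intensity, otherwise 0."""
    return [[1 if pixel == embolism_value else 0 for pixel in row] for row in mask]


# A 2x3 mask in which 255 marks embolism pixels.
raw_mask = [[0, 255, 0],
            [255, 255, 0]]
print(binarise_mask(raw_mask))
```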
-
load_extracted_images(load_image=False, disable_pb=False)
Loads MaskTiles belonging to the Mask.
- Parameters
load_image (bool) – whether to load the image array belonging to the MaskTile being created
disable_pb (bool) – whether the progress bar should be disabled
- Return type
None
- Returns
None
-
tile_me(length_x, stride_x, length_y, stride_y, output_path=None, overwrite=False)
Tiles an image and creates MaskTile objects. These are appended to the image_object attribute.
- Parameters
length_x (int) – the x-length of the tile
stride_x (int) – the size of the x stride
length_y (int) – the y-length of the tile
stride_y (int) – the size of the y stride
output_path (Optional[str]) – output path where the tiles should be saved; if no path is provided, tiles are saved in a default location
overwrite (bool) – whether tiles that exist at the same file path should be overwritten
- Return type
None
- Returns
None
-
get_databunch_dataframe(embolism_only=False, csv_name=None)
Extracts a databunch DataFrame using the tiles in this Mask. The first field is the leaf tile path and the second field is the mask tile name. This is useful for Fastai. If a csv name is provided, the DataFrame is saved.
- Parameters
embolism_only (bool) – whether only leaves with embolisms should be used
csv_name (Optional[str]) – the name of the csv, which can also be a path; if this is not provided, the DataFrame will not be saved
- Return type
Tuple[DataFrame,str]
- Returns
DataBunch DataFrame and sequence root folder path
-
-
class src.data_model.data_model.LeafTile(path=None, sequence_parent=None)
A Leaf tile.
-
predict_tile(model, memory_saving=True, **kwargs)
Predicts and returns a segmentation map using the tile image.
- Parameters
model (Model) – a model which inherits from Model and hence implements a predict tile method
memory_saving (bool) – if set to True, the prediction array is not saved
kwargs – kwargs for the predict tile function
- Return type
array
- Returns
the prediction
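The requirement that a model "inherits from Model and hence implements a predict tile method" suggests an abstract base class. The sketch below shows that pattern in isolation; it is not the project's Model class. The method name predict_tile and its signature are assumptions, and ScaledIntensityModel is a toy subclass invented purely for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class Model(ABC):
    """Stand-in base class: subclasses must provide a tile prediction method."""

    @abstractmethod
    def predict_tile(self, tile: List[List[int]], **kwargs) -> List[List[float]]:
        """Return a per-pixel probability map for one tile."""


class ScaledIntensityModel(Model):
    """Toy model that treats scaled pixel intensity as the embolism probability."""

    def predict_tile(self, tile: List[List[int]], scale: float = 255.0,
                     **kwargs) -> List[List[float]]:
        return [[pixel / scale for pixel in row] for row in tile]


model = ScaledIntensityModel()
print(model.predict_tile([[0, 255], [51, 102]]))
```

Extra keyword arguments pass through **kwargs, mirroring the way the documented prediction methods forward kwargs to the predict tile function.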
-