API Reference

This section documents the public API of TRIDENT.

When to use the API vs CLI:

  • Use the CLI (run_batch_of_slides.py / trident batch) for standard reproducible runs.

  • Use the API when embedding TRIDENT in your own Python pipeline, custom loops, or experiments.

  • Start with the CLI, then move to the API once the workflow is validated.

Minimal API usage

Load a slide and read regions

Use load_wsi as a context manager so file handles are released:

from trident import load_wsi

with load_wsi("./wsis/example.svs", lazy_init=False) as wsi:
    print(wsi.dimensions, wsi.mpp)
    patch = wsi.read_region((0, 0), level=0, size=(512, 512))

Run the pipeline with Processor

The CLI entrypoints are thin wrappers around Processor.

from trident import Processor
from trident.segmentation_models.load import segmentation_model_factory
from trident.patch_encoder_models.load import encoder_factory as patch_encoder_factory

processor = Processor(job_dir="./job", wsi_source="./wsis", search_nested=True, skip_errors=True)

seg = segmentation_model_factory("grandqc", confidence_thresh=0.5)
processor.run_segmentation_job(seg, device="cuda:0", batch_size=16)

processor.run_patching_job(target_magnification=20, patch_size=256, overlap=0, min_tissue_proportion=0.0)

enc = patch_encoder_factory("uni_v1")
processor.run_patch_feature_extraction_job(coords_dir="20x_256px_0px_overlap", patch_encoder=enc, device="cuda:0", batch_limit=64)

Outputs and run tracking

In job_dir (same as the CLI):

  • summary.md: appended once per run; compact counts + per-model breakdown + errors

  • runs/<run_id>.json: per-run manifest (args, timestamps, status)

  • wsi_states/<slide>__<hash>.json: per-slide state (attempts, outputs, resume info)
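To inspect run history from Python, the per-run manifests can be read with the standard library. The sketch below is illustrative only: summarize_runs is a hypothetical helper (not part of TRIDENT), and it assumes each runs/<run_id>.json file is a JSON object with a top-level "status" key. The list above names status as part of the manifest, but the exact schema is not documented here.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_runs(job_dir: str) -> Counter:
    """Count the manifests under <job_dir>/runs by their 'status' value.

    Hypothetical helper; the 'status' key name is an assumption.
    """
    counts: Counter = Counter()
    for manifest_path in sorted(Path(job_dir, "runs").glob("*.json")):
        with open(manifest_path, encoding="utf-8") as f:
            manifest = json.load(f)
        # Fall back to "unknown" when the assumed key is absent.
        counts[manifest.get("status", "unknown")] += 1
    return counts
```

Calling summarize_runs("./job") would return a Counter mapping each status string to the number of runs that reported it.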

Notes for power users

  • Nested datasets: search_nested=True uses relative paths under wsi_source (mirrors CLI --search_nested).

  • Subset runs: pass custom_list_of_wsis="subset.csv"; the CSV must have a wsi column.

  • Reader selection: force a backend with reader_type="openslide" | "cucim" | "image" | "sdpc" | "omezarr" | "czi".

  • Slide encoders: slide embeddings require a specific underlying patch encoder. The mapping lives in trident.slide_encoder_models.load.slide_to_patch_encoder_name. If patch features are missing for that encoder, run_slide_feature_extraction_job extracts them on the fly.

  • Resume / idempotency: every job uses self-describing .lock files (PID, host, timestamp). If an output exists and is not actively locked, the job is skipped on re-run. Use trident.IO.clear_dead_locks(job_dir) (or pass --clear_dead_locks to the CLI) to remove orphaned locks safely.

  • Multi-GPU: the CLI handles GPU sharding via --gpus. From Python, run a separate Processor instance per shard, each with a disjoint selected_wsi_paths and its own device="cuda:N" argument to the run_*_job methods.
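For the multi-GPU note above, the slide list has to be split into disjoint shards before any Processor is created. A minimal sketch of that split, using only the standard library, is shown here. shard_slides is a hypothetical helper, and round-robin assignment is just one reasonable choice; how each shard is then handed to a Processor (for example through the selected_wsi_paths mentioned above) is left out because it depends on your setup.

```python
from typing import Dict, List, Sequence


def shard_slides(slide_paths: Sequence[str], gpu_ids: Sequence[int]) -> Dict[str, List[str]]:
    """Split slide paths round-robin into one disjoint list per device string."""
    if not gpu_ids:
        raise ValueError("gpu_ids must not be empty")
    shards: Dict[str, List[str]] = {f"cuda:{gpu}": [] for gpu in gpu_ids}
    devices = list(shards)
    # Sort first so the assignment is reproducible across runs.
    for index, path in enumerate(sorted(slide_paths)):
        shards[devices[index % len(devices)]].append(path)
    return shards


if __name__ == "__main__":
    for device, paths in shard_slides(["a.svs", "b.svs", "c.svs"], [0, 1]).items():
        print(device, paths)
```

Each key is a device string suitable for the device argument, and each value is the list of slides that device would process.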

Trident

Core of TRIDENT with Processor and WSI building.

class trident.AnyToTiffConverter(job_dir: str, bigtiff: bool = False)

Bases: object

A class to convert images to TIFF format with options for resizing and pyramidal tiling.

job_dir

Directory to save converted images.

Type:

str

bigtiff

Flag to enable the creation of BigTIFF files.

Type:

bool

Initializes the Converter with a job directory and BigTIFF support.

Parameters:
  • job_dir (str) – The directory where converted images will be saved.

  • bigtiff (bool, optional) – Enable or disable BigTIFF file creation. Defaults to False.

process_all(input_dir: str, mpp_csv: str, downscale_by: int = 1, num_workers: int = 1) None

Process all eligible image files in a directory to convert them to pyramidal TIFF.

Parameters:
  • input_dir (str) – Directory containing image files to process.

  • mpp_csv (str) – Path to a CSV file with 2 fields: “wsi” (filenames with extensions) and “mpp” (microns per pixel).

  • downscale_by (int, optional) – Factor to downscale images by. For example, to save a 40x image into a 20x one, set downscale_by=2. Defaults to 1.

  • num_workers (int, optional) – Number of parallel workers. Use 1 for sequential mode. Defaults to 1.
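The mpp_csv file described above can be produced with the standard csv module. This is a small sketch: write_mpp_csv is a hypothetical helper, and the file names and pixel sizes in the usage note are made-up values. Only the two column names, wsi and mpp, come from the parameter description.

```python
import csv
from typing import Mapping


def write_mpp_csv(csv_path: str, mpp_by_filename: Mapping[str, float]) -> None:
    """Write a two-column CSV with the header fields 'wsi' and 'mpp'."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["wsi", "mpp"])
        writer.writeheader()
        for filename, mpp in mpp_by_filename.items():
            # 'wsi' holds the file name including its extension.
            writer.writerow({"wsi": filename, "mpp": mpp})
```

For example, write_mpp_csv("mpp.csv", {"slide_a.png": 0.25, "slide_b.jpg": 0.5}) writes a header row followed by one row per image.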

process_file(input_file: str, mpp: float, zoom: float) None

Process a single image file to convert it into TIFF format.

Parameters:
  • input_file (str) – Path to the input image file.

  • mpp (float) – Microns per pixel value for the output image.

  • zoom (float) – Zoom factor for image resizing (e.g., 0.5 reduces the image by a factor of 2).
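The relationship between downscale_by, zoom, and the output pixel size can be worked through with a little arithmetic. The sketch below assumes that zoom is the reciprocal of downscale_by and that the mpp passed to process_file is the pixel size of the output image, which is what the parameter descriptions suggest but do not state outright. zoom_and_output_mpp is a hypothetical helper.

```python
from typing import Tuple


def zoom_and_output_mpp(source_mpp: float, downscale_by: int) -> Tuple[float, float]:
    """Return (zoom, output_mpp) for an integer downscale factor.

    Assumes zoom = 1 / downscale_by; each output pixel then covers
    downscale_by times more tissue, so mpp grows by the same factor.
    """
    if downscale_by < 1:
        raise ValueError("downscale_by must be >= 1")
    zoom = 1.0 / downscale_by
    output_mpp = source_mpp * downscale_by
    return zoom, output_mpp
```

Under these assumptions, saving a 40x image scanned at 0.25 µm/px as a 20x image (downscale_by=2) corresponds to zoom=0.5 and an output of 0.5 µm/px.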

class trident.CZIWSI(slide_path: str, **kwargs: Any)

Bases: WSI

WSI implementation for reading Zeiss CZI slides using pylibCZIrw.

CZI slides may have a non-zero and even negative origin in the global coordinate system. TRIDENT’s WSI interface expects the top-left of the slide to be (0, 0), so this backend translates coordinates by the slide’s total_bounding_rectangle.
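The coordinate translation described above amounts to adding the bounding rectangle's origin to every requested location. The sketch below illustrates the idea in isolation: to_czi_global is a hypothetical helper, and bbox_origin stands for the (x, y) of the slide's total_bounding_rectangle. It is not the backend's actual code.

```python
from typing import Tuple


def to_czi_global(location: Tuple[int, int], bbox_origin: Tuple[int, int]) -> Tuple[int, int]:
    """Map a TRIDENT (x, y), where (0, 0) is the slide's top-left corner,
    to the CZI global coordinate system by adding the bounding-box origin."""
    return location[0] + bbox_origin[0], location[1] + bbox_origin[1]
```

With a bounding rectangle starting at (-1200, 350), the TRIDENT origin (0, 0) maps to (-1200, 350) in CZI coordinates, so callers never need to handle negative coordinates themselves.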

Initialize a CZIWSI instance.

Parameters:
  • slide_path (str) – Path to a .czi file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important keys are:

      - lazy_init (bool, default=True): Whether to defer loading the WSI and its metadata.
      - mpp (float, optional): If provided, overrides the metadata-derived pixel size.

create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher
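To see how patch_size interacts with the source and destination pixel sizes, it helps to compute the footprint of one patch at the source resolution. The sketch below assumes patch_size is expressed in destination pixels, so the footprint scales by dst_pixel_size / src_pixel_size. That interpretation is an assumption, and level0_patch_size is a hypothetical helper rather than a TRIDENT function.

```python
def level0_patch_size(patch_size: int, src_pixel_size: float, dst_pixel_size: float) -> int:
    """Side length, in source pixels, of a patch that is patch_size pixels
    wide at the destination pixel size (assumed interpretation)."""
    if patch_size <= 0 or src_pixel_size <= 0 or dst_pixel_size <= 0:
        raise ValueError("all arguments must be positive")
    return round(patch_size * dst_pixel_size / src_pixel_size)
```

Under this assumption, a 512-pixel patch requested at 0.5 µm/px from a slide scanned at 0.25 µm/px covers a 1024 x 1024 pixel region of the source.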

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)

dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format="jpg". Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str

extract_patch_features(patch_encoder: Module, coords_path: str, save_features: str, device: str = 'cuda:0', saveas: str = 'h5', batch_limit: int = 512, verbose: bool = False) str

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5

extract_slide_features(patch_features_path: str, slide_encoder: Module, save_features: str, device: str = 'cuda') str

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.

extract_tissue_coords(target_mag: int, patch_size: int, save_coords: str, overlap: int = 0, min_tissue_proportion: float = 0.0) str

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str
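The effect of overlap is easiest to see on a plain grid: neighbouring patches start patch_size - overlap pixels apart. The sketch below is a simplified illustration under that assumption. It ignores the tissue mask, min_tissue_proportion, magnification scaling, and partial patches at the slide border, all of which the real method has to handle. grid_coords is a hypothetical helper.

```python
from typing import List, Tuple


def grid_coords(width: int, height: int, patch_size: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Top-left (x, y) of every full patch on a regular grid with the given overlap."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must satisfy 0 <= overlap < patch_size")
    stride = patch_size - overlap  # distance between neighbouring patch origins
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, stride)
        for x in range(0, width - patch_size + 1, stride)
    ]
```

On a 512 x 256 region with 256-pixel patches, overlap=0 yields two patches, while overlap=128 yields three because the stride drops to 128 pixels.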

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5

get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.
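One plausible way to implement this selection is sketched below. It assumes the pyramid's per-level downsample factors are available as a sequence: it first looks for a level that matches within the tolerance, and otherwise picks the level with the largest downsample not exceeding the request, leaving the remainder as a custom factor. The function name and the level_downsamples argument are illustrative, and the library's actual logic may differ in detail.

```python
from typing import Sequence, Tuple


def best_level_and_custom_downsample(
    level_downsamples: Sequence[float], downsample: float, tolerance: float = 0.01
) -> Tuple[int, float]:
    """Return (level, custom_downsample) approximating the requested downsample."""
    # 1. A level that matches within tolerance needs no further resizing.
    for level, level_ds in enumerate(level_downsamples):
        if abs(level_ds - downsample) <= tolerance:
            return level, 1.0
    # 2. Otherwise read from the coarsest level that is still finer than the
    #    request and resize by the remaining factor.
    candidates = [(ds, lvl) for lvl, ds in enumerate(level_downsamples) if ds <= downsample]
    if not candidates:
        raise ValueError(f"No suitable level for downsample {downsample}")
    level_ds, level = max(candidates)
    return level, downsample / level_ds
```

For a pyramid with downsamples (1.0, 4.0, 16.0), a request of 8.0 resolves to level 1 with a custom factor of 2.0, so the region is read at 4x and shrunk by a further 2x.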

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1

get_dimensions() Tuple[int, int]

Return the dimensions (width, height) of the WSI.

get_thumbnail(size: tuple[int, int]) Image

Generate a thumbnail of the slide.

Parameters:

size (tuple[int, int]) – Desired (width, height) of the thumbnail.

Returns:

RGB thumbnail as a PIL Image.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a region from the CZI slide.

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates in TRIDENT level-0 coordinate system (top-left is (0, 0)).

  • level (int) – Pyramid level to read from.

  • size (Tuple[int, int]) – (width, height) of the region to extract at the requested level.

  • read_as ({'pil', 'numpy'}, optional) – Output format.

Returns:

Extracted image region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

release() None

Close the underlying CZI reader and release resources.

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> predictions, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(predictions.shape, downscale)

segment_tissue(segmentation_model: SegmentationModel, target_mag: int = 10, holes_are_tissue: bool = True, job_dir: str | None = None, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None) str | GeoDataFrame

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. The segmented regions are saved as thumbnails and GeoJSON contours.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"

visualize_coords(coords_path: str, save_patch_viz: str) str

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png

class trident.CuCIMWSI(slide_path: str, **kwargs: Any)

Bases: WSI

Initialize a WSI instance using CuCIM as a backend.

Parameters:
  • slide_path (str) – Path to the WSI file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important key is lazy_init (bool, default=True): whether to defer loading the WSI and its metadata.

Please refer to WSI constructor for all parameters.

Example

>>> wsi = CuCIMWSI(slide_path="path/to/wsi.svs", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=CuCIMWSI, mpp=0.25, mag=40>

close()
create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)

dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format="jpg". Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str

extract_patch_features(**kwargs) str

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5

extract_slide_features(**kwargs) str

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.

extract_tissue_coords(**kwargs) str

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5

get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1

get_dimensions() Tuple[int, int]

Return the (width, height) dimensions of the CuCIM-managed WSI.

Returns:

A tuple containing the width and height of the WSI in pixels.

Return type:

Tuple[int, int]

Example

>>> wsi.get_dimensions()
(100000, 80000)

get_thumbnail(size: tuple[int, int]) Image

Generate a thumbnail image of the WSI.

Parameters:

size (tuple[int, int]) – A tuple specifying the desired width and height of the thumbnail.

Returns:

The thumbnail as a PIL Image in RGB format.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a specific region from the whole-slide image (WSI) using CuCIM.

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates of the top-left corner of the region to extract.

  • level (int) – Pyramid level to read from.

  • size (Tuple[int, int]) – (width, height) of the region to extract.

  • read_as ({'pil', 'numpy'}, optional) – Output format for the region:

      - ‘pil’: returns a PIL Image (default).
      - ‘numpy’: returns a NumPy array of shape (H, W, 3).

Returns:

The extracted region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

Raises:

ValueError – If read_as is not one of the supported options.

Example

>>> region = wsi.read_region((1000, 1000), level=0, size=(512, 512), read_as='pil')
>>> region.show()

release() None

Release internal data (CPU/GPU/memory) and clear heavy references in the WSI instance. Call this method after you’re done processing to avoid memory/GPU leaks.

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> predictions, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(predictions.shape, downscale)

segment_tissue(**kwargs) str

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. The segmented regions are saved as thumbnails and GeoJSON contours.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"

visualize_coords(**kwargs) str

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png

class trident.ImageWSI(slide_path, **kwargs)

Bases: WSI

Initialize a WSI object from a standard image file (e.g., PNG, JPEG, etc.).

Parameters:
  • slide_path (str) – Path to the image file.

  • mpp (float) – Microns per pixel. Required since standard image formats do not store this metadata.

  • name (str, optional) – Optional name for the slide.

  • lazy_init (bool, default=True) – Whether to defer initialization until the WSI is accessed.

Raises:

ValueError – If the required ‘mpp’ argument is not provided.

Example

>>> wsi = ImageWSI("path/to/image.png", lazy_init=False, mpp=0.51)
>>> print(wsi)
<width=5120, height=3840, backend=ImageWSI, mpp=0.51, mag=20>
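The printed representation above reports mag=20 for mpp=0.51. As a rough illustration of how a magnification can be estimated from a user-supplied mpp, here is a minimal sketch based on the common rule of thumb mag ≈ 10 / mpp. The helper name estimate_magnification and the list of standard magnifications are assumptions made for this example; TRIDENT's own mapping may use different thresholds.

```python
STANDARD_MAGS = (5, 10, 20, 40, 80)


def estimate_magnification(mpp):
    """Map microns-per-pixel to the nearest standard objective magnification.

    Assumes the rule of thumb mag ~= 10 / mpp (0.25 um/px ~= 40x).
    This is an illustrative helper, not a TRIDENT function.
    """
    if mpp <= 0:
        raise ValueError("mpp must be positive")
    raw = 10.0 / mpp
    # Snap the raw estimate to the closest standard magnification.
    return min(STANDARD_MAGS, key=lambda mag: abs(mag - raw))


print(estimate_magnification(0.51))
print(estimate_magnification(0.25))
```

Under this assumption, a wrong mpp shifts every magnification-dependent step (patching, segmentation), which is why the constructor requires it explicitly.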
__init__(slide_path, **kwargs) None

Initialize a WSI object from a standard image file (e.g., PNG, JPEG, etc.).

Parameters:
  • slide_path (str) – Path to the image file.

  • mpp (float) – Microns per pixel. Required since standard image formats do not store this metadata.

  • name (str, optional) – Optional name for the slide.

  • lazy_init (bool, default=True) – Whether to defer initialization until the WSI is accessed.

Raises:

ValueError – If the required ‘mpp’ argument is not provided.

Example

>>> wsi = ImageWSI("path/to/image.png", lazy_init=False, mpp=0.51)
>>> print(wsi)
<width=5120, height=3840, backend=ImageWSI, mpp=0.51, mag=20>
close()

Close the internal image object to free memory. A fully loaded image can occupy several GB of RAM.

create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)
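To make the relationship between patch_size, the two pixel sizes, and overlap concrete, the following is a minimal sketch of a regular patch grid. It is not TRIDENT's implementation: the helper name grid_coords, the rounding, the dropping of partial edge patches, and the assumption that overlap is expressed in destination pixels are all illustrative choices.

```python
def grid_coords(width, height, patch_size, src_pixel_size, dst_pixel_size, overlap=0):
    """Return level-0 (x, y) top-left corners of a regular patch grid.

    Hypothetical helper: patch_size and overlap are given at the destination
    resolution and rescaled to source (level-0) pixels.
    """
    if overlap >= patch_size:
        raise ValueError("overlap must be smaller than patch_size")
    scale = dst_pixel_size / src_pixel_size        # e.g. 0.5 / 0.25 = 2.0
    patch_src = max(1, round(patch_size * scale))  # patch edge in level-0 pixels
    stride_src = max(1, round((patch_size - overlap) * scale))
    coords = []
    # Partial patches at the right and bottom edges are dropped in this sketch.
    for y in range(0, height - patch_src + 1, stride_src):
        for x in range(0, width - patch_src + 1, stride_src):
            coords.append((x, y))
    return coords, patch_src


coords, patch_src = grid_coords(4096, 2048, patch_size=512,
                                src_pixel_size=0.25, dst_pixel_size=0.5)
print(patch_src, len(coords))
```

With a source pixel size of 0.25 and a destination pixel size of 0.5, each 512-pixel patch covers 1024 level-0 pixels per side, so fewer patches fit than the nominal patch_size suggests.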
dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format=“jpg”. Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str

extract_patch_features(*args, **kwargs)

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5
extract_slide_features(*args, **kwargs)

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.
extract_tissue_coords(*args, **kwargs)

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5
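The effect of min_tissue_proportion can be illustrated with a small stand-alone sketch. It uses a rasterized binary mask as a stand-in for the tissue contours, and the helper names tissue_fraction and filter_patches are hypothetical. The exact comparison rule (here: strictly positive tissue and at least the requested proportion) is an assumption and may differ from TRIDENT's behaviour.

```python
def tissue_fraction(mask, x, y, size):
    """Fraction of pixels equal to 1 inside the square patch at (x, y)."""
    rows = mask[y:y + size]
    covered = sum(sum(row[x:x + size]) for row in rows)
    return covered / float(size * size)


def filter_patches(mask, coords, size, min_tissue_proportion=0.0):
    # Assumption: keep a patch when it contains some tissue and its tissue
    # fraction is at least min_tissue_proportion.
    kept = []
    for x, y in coords:
        frac = tissue_fraction(mask, x, y, size)
        if frac > 0 and frac >= min_tissue_proportion:
            kept.append((x, y))
    return kept


# 4x4 binary mask (1 = tissue) and four 2x2 candidate patches.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
coords = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(filter_patches(mask, coords, size=2))
print(filter_patches(mask, coords, size=2, min_tissue_proportion=0.5))
```

Raising the threshold removes patches that only touch the tissue border, at the cost of losing tissue near the edges.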
get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1
get_dimensions()
get_thumbnail(size)

Generate a thumbnail of the image.

Parameters:

size (tuple[int, int]) – Desired thumbnail size (width, height).

Returns:

RGB thumbnail image.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a specific region from a single-resolution image (e.g., JPEG, PNG, TIFF).

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates of the top-left corner of the region to extract.

  • level (int) – Pyramid level to read from. Only level 0 is supported for non-pyramidal images.

  • size (Tuple[int, int]) – (width, height) of the region to extract.

  • read_as ({'pil', 'numpy'}, optional) – Output format for the region: ‘pil’ returns a PIL Image (default); ‘numpy’ returns a NumPy array of shape (H, W, 3).

Returns:

Extracted image region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

Raises:

ValueError – If level is not 0 or if read_as is not one of the supported options.

Example

>>> region = wsi.read_region((0, 0), level=0, size=(512, 512), read_as='numpy')
>>> print(region.shape)
(512, 512, 3)
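Note that size is given as (width, height) while the returned array is ordered (H, W, 3). The following stand-alone sketch mimics that convention on a tiny image stored as nested lists. The function crop_region is a hypothetical illustration, not the library's reader, and it does not model out-of-bounds handling.

```python
def crop_region(image_rows, location, level, size):
    """Crop a region from a single-resolution image stored as rows of RGB tuples."""
    if level != 0:
        raise ValueError("Only level 0 is available for a single-resolution image")
    x, y = location
    width, height = size
    # Rows are indexed by y (height) first, then columns by x (width).
    return [row[x:x + width] for row in image_rows[y:y + height]]


# A 6-wide, 4-high dummy image whose pixels encode their own (x, y) position.
image = [[(x, y, 0) for x in range(6)] for y in range(4)]
region = crop_region(image, location=(2, 1), level=0, size=(3, 2))
print(len(region), len(region[0]))  # number of rows (height), then columns (width)
print(region[0][0])
```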
release() None

Release internal data (CPU/GPU/memory) and clear heavy references in the WSI instance. Call this method after you’re done processing to avoid memory/GPU leaks.

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> pred, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(pred.shape)  # Downscaled (H, W) array of class predictions
segment_tissue(*args, **kwargs)

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. When job_dir is given, the segmented regions are saved as thumbnails and GeoJSON contours; otherwise the contours are returned directly.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"
visualize_coords(*args, **kwargs)

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png
class trident.OMEZarrWSI(slide_path: str, **kwargs: Any)

Bases: WSI

WSI implementation for reading Zarr files that follow the OME-Zarr specification.

Initialize an OMEZarrWSI instance for OME-Zarr whole-slide images.

Parameters:
  • slide_path (str) – Path to a .zarr OME multiscale file.

  • **kwargs (dict) – Additional keyword arguments forwarded to the base WSI class. The main one is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Example

>>> wsi = OMEZarrWSI(slide_path="path/to/wsi", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=OMEZarrWSI, mpp=0.25, mag=40>
__init__(slide_path: str, **kwargs: Any) None

Initialize an OMEZarrWSI instance for OME-Zarr whole-slide images.

Parameters:
  • slide_path (str) – Path to a .zarr OME multiscale file.

  • **kwargs (dict) – Additional keyword arguments forwarded to the base WSI class. The main one is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Example

>>> wsi = OMEZarrWSI(slide_path="path/to/wsi", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=OMEZarrWSI, mpp=0.25, mag=40>
create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)
dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format=“jpg”. Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str
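As a rough picture of the output layout described above (save_patches_dir/<slide_name>/) and of the max_patches cap, here is a small sketch that only plans file paths and reads no pixels. The helper planned_patch_paths, the x_y file naming, and the choice to keep the first max_patches coordinates are assumptions made for illustration.

```python
from pathlib import Path


def planned_patch_paths(coords, save_patches_dir, slide_name,
                        max_patches=0, image_format="png"):
    """List the files a dump might write, without reading any pixels."""
    if image_format not in {"png", "jpg"}:
        raise ValueError("image_format must be 'png' or 'jpg'")
    if max_patches > 0:
        coords = coords[:max_patches]  # cap the number of patches
    out_dir = Path(save_patches_dir) / slide_name
    # Naming files by their x/y coordinate is an assumption of this sketch.
    return [out_dir / f"{x}_{y}.{image_format}" for x, y in coords]


coords = [(0, 0), (256, 0), (512, 0)]
paths = planned_patch_paths(coords, "debug_patches", "sample_name", max_patches=2)
for path in paths:
    print(path.as_posix())
```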

extract_patch_features(patch_encoder: Module, coords_path: str, save_features: str, device: str = 'cuda:0', saveas: str = 'h5', batch_limit: int = 512, verbose: bool = False) str

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5
extract_slide_features(patch_features_path: str, slide_encoder: Module, save_features: str, device: str = 'cuda') str

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.
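Conceptually, slide-level encoding reduces N patch embeddings of dimension D to a single vector. The sketch below swaps the pretrained slide encoder for plain mean pooling, purely to show that shape change in stdlib Python. It is a stand-in technique, not what a real slide encoder computes, and mean_pool is a hypothetical helper.

```python
def mean_pool(patch_features):
    """Average N patch embeddings of dimension D into one D-dimensional vector."""
    if not patch_features:
        raise ValueError("patch_features is empty")
    dim = len(patch_features[0])
    n = len(patch_features)
    return [sum(vec[i] for vec in patch_features) / n for i in range(dim)]


# Three patches with 4-dimensional embeddings (toy values).
patch_features = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
slide_vector = mean_pool(patch_features)
print(len(slide_vector), slide_vector)
```

A pretrained slide encoder typically also uses the patch coordinates, which is why the patch feature file stores them alongside the embeddings.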
extract_tissue_coords(target_mag: int, patch_size: int, save_coords: str, overlap: int = 0, min_tissue_proportion: float = 0.0) str

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5
get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1
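One plausible way to pick a pyramid level and the residual downsample is sketched below. This is a hypothetical re-implementation for illustration only: the explicit level_downsamples argument, the absolute-tolerance test, and the rule "use the coarsest level that is still finer than the target" are assumptions and may not match the method's actual logic.

```python
def best_level_and_custom_downsample(level_downsamples, downsample, tolerance=0.01):
    """Pick a pyramid level and the residual downsample still to apply."""
    # An exact match (within tolerance) needs no extra resizing.
    for level, ds in enumerate(level_downsamples):
        if abs(ds - downsample) <= tolerance:
            return level, 1.0
    # Otherwise read from the coarsest level that is still finer than the
    # target, then shrink the region by the remaining factor.
    candidates = [(lvl, ds) for lvl, ds in enumerate(level_downsamples) if ds < downsample]
    if not candidates:
        raise ValueError(f"No suitable level for downsample {downsample}")
    level, ds = max(candidates, key=lambda item: item[1])
    return level, downsample / ds


levels = [1.0, 4.0, 16.0]
print(best_level_and_custom_downsample(levels, 4.0))
print(best_level_and_custom_downsample(levels, 8.0))
```

Reading from a coarser stored level and resizing by a small residual factor is usually much cheaper than reading level 0 and resizing by the full factor.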
get_dimensions() Tuple[int, int]

Return the dimensions (width, height) of the WSI.

Returns:

(width, height) in pixels.

Return type:

tuple[int, int]

get_thumbnail(size: tuple[int, int]) Image

Generate a thumbnail of the WSI.

Parameters:

size (tuple[int, int]) – Desired (width, height) of the thumbnail.

Returns:

RGB thumbnail as a PIL Image.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a specific region from the whole-slide image (WSI).

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates of the top-left corner of the region to extract.

  • level (int) – Pyramid level to read from.

  • size (Tuple[int, int]) – (width, height) of the region to extract.

  • read_as ({'pil', 'numpy'}, optional) – Output format for the region: ‘pil’ returns a PIL Image (default); ‘numpy’ returns a NumPy array of shape (H, W, 3).

Returns:

Extracted image region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

Raises:

ValueError – If read_as is not one of ‘pil’ or ‘numpy’.

Example

>>> region = wsi.read_region((0, 0), level=0, size=(512, 512), read_as='numpy')
>>> print(region.shape)
(512, 512, 3)
release() None

Release internal data (CPU/GPU/memory) and clear heavy references in the WSI instance. Call this method after you’re done processing to avoid memory/GPU leaks.
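To guarantee that release() runs even when processing raises, the call can be wrapped in a small context manager. The sketch below is one way to do this under the assumption that the object only needs a release() method. The releasing helper and the DummyWSI stand-in are hypothetical and exist only so the example runs without TRIDENT.

```python
from contextlib import contextmanager


@contextmanager
def releasing(wsi):
    """Yield wsi and always call wsi.release() afterwards."""
    try:
        yield wsi
    finally:
        wsi.release()


class DummyWSI:
    """Stand-in for a WSI object so the sketch runs without TRIDENT."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


dummy = DummyWSI()
with releasing(dummy) as wsi:
    pass  # read regions, extract features, ...
print(dummy.released)
```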

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> pred, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(pred.shape)  # Downscaled (H, W) array of class predictions
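The returned downscale factor is what relates the prediction array back to the full-resolution slide. As a hedged illustration, the sketch below converts per-class pixel counts in a downscaled mask into approximate level-0 areas. It assumes the factor is the ratio between level-0 resolution and mask resolution (so each mask pixel covers roughly downscale x downscale level-0 pixels), and class_areas_level0 is a hypothetical helper operating on nested lists rather than a NumPy array.

```python
from collections import Counter


def class_areas_level0(pred, downscale):
    """Approximate per-class area in level-0 pixels from a downscaled mask."""
    counts = Counter(value for row in pred for value in row)
    # Each mask pixel is assumed to cover downscale x downscale level-0 pixels.
    return {cls: n * downscale ** 2 for cls, n in counts.items()}


pred = [
    [0, 0, 1],
    [0, 2, 1],
]
areas = class_areas_level0(pred, downscale=8.0)
print(areas)
```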
segment_tissue(segmentation_model: SegmentationModel, target_mag: int = 10, holes_are_tissue: bool = True, job_dir: str | None = None, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None) str | GeoDataFrame

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. When job_dir is given, the segmented regions are saved as thumbnails and GeoJSON contours; otherwise the contours are returned directly.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"
visualize_coords(coords_path: str, save_patch_viz: str) str

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png
class trident.OpenSlideWSI(slide_path: str, **kwargs: Any)

Bases: WSI

Initialize an OpenSlideWSI instance.

Parameters:
  • slide_path (str) – Path to the WSI file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important one is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Refer to the WSI constructor for all parameters.

Example

>>> wsi = OpenSlideWSI(slide_path="path/to/wsi.svs", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=OpenSlideWSI, mpp=0.25, mag=40>
__init__(slide_path: str, **kwargs: Any) None

Initialize an OpenSlideWSI instance.

Parameters:
  • slide_path (str) – Path to the WSI file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important one is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Refer to the WSI constructor for all parameters.

Example

>>> wsi = OpenSlideWSI(slide_path="path/to/wsi.svs", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=OpenSlideWSI, mpp=0.25, mag=40>
create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)
dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format=“jpg”. Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str

extract_patch_features(patch_encoder: Module, coords_path: str, save_features: str, device: str = 'cuda:0', saveas: str = 'h5', batch_limit: int = 512, verbose: bool = False) str

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5
extract_slide_features(patch_features_path: str, slide_encoder: Module, save_features: str, device: str = 'cuda') str

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.
extract_tissue_coords(target_mag: int, patch_size: int, save_coords: str, overlap: int = 0, min_tissue_proportion: float = 0.0) str

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5
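
The interaction between target_mag, patch_size, and overlap can be illustrated with a small standalone sketch. It is not TRIDENT's implementation: the function name grid_coords, the src_mag argument, and the choice to drop partial patches at the slide border are assumptions made for illustration. It assumes a patch of patch_size pixels at target_mag covers patch_size * src_mag / target_mag pixels at level 0, and that overlap is expressed in pixels at target_mag.

```python
from typing import List, Tuple


def grid_coords(
    width: int,
    height: int,
    patch_size: int,
    overlap: int,
    src_mag: float,
    target_mag: float,
) -> List[Tuple[int, int]]:
    """Return level-0 (x, y) top-left corners of a regular patch grid."""
    if overlap >= patch_size:
        raise ValueError("overlap must be smaller than patch_size")
    # Scale factor between the target magnification and level 0.
    scale = src_mag / target_mag
    size_l0 = round(patch_size * scale)
    stride_l0 = round((patch_size - overlap) * scale)
    coords = []
    # Assumption: patches that would cross the slide border are dropped.
    for y in range(0, height - size_l0 + 1, stride_l0):
        for x in range(0, width - size_l0 + 1, stride_l0):
            coords.append((x, y))
    return coords


coords = grid_coords(2048, 1024, patch_size=256, overlap=32, src_mag=40, target_mag=20)
print(len(coords), coords[:3])
```

With a 40x slide and 20x patches, each 256-pixel patch spans 512 level-0 pixels and the stride is 448 level-0 pixels, so a larger overlap yields more patches for the same tissue area.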
get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1
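
One plausible way to pair a pyramid level with a residual downsample factor is sketched below. This is an assumption about the general technique, not a description of TRIDENT's actual selection rule: the function name, the level_downsamples argument, and the snapping behavior are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def best_level_and_custom_downsample(
    level_downsamples: Sequence[float],
    downsample: float,
    tolerance: float = 0.01,
) -> Tuple[int, float]:
    """Pick the coarsest level not exceeding `downsample`, plus the residual factor."""
    best: Optional[int] = None
    for level, level_ds in enumerate(level_downsamples):
        # A level is usable if it is not coarser than the request (within tolerance).
        if level_ds <= downsample * (1 + tolerance):
            if best is None or level_ds > level_downsamples[best]:
                best = level
    if best is None:
        raise ValueError(f"No suitable level for downsample {downsample}")
    custom = downsample / level_downsamples[best]
    # Snap to 1.0 when the residual is within tolerance of an exact match.
    if abs(custom - 1.0) <= tolerance:
        custom = 1.0
    return best, custom


levels = [1.0, 4.0, 16.0]
print(best_level_and_custom_downsample(levels, 2.5))
print(best_level_and_custom_downsample(levels, 4.0))
```

Reading from the coarsest usable level and resizing by the residual factor keeps the amount of pixel data read from disk small while still matching the requested resolution.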
get_dimensions() Tuple[int, int]

Return the dimensions (width, height) of the WSI.

Returns:

(width, height) in pixels.

Return type:

tuple[int, int]

get_thumbnail(size: tuple[int, int]) Image

Generate a thumbnail of the WSI.

Parameters:

size (tuple[int, int]) – Desired (width, height) of the thumbnail.

Returns:

RGB thumbnail as a PIL Image.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a specific region from the whole-slide image (WSI).

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates of the top-left corner of the region to extract.

  • level (int) – Pyramid level to read from.

  • size (Tuple[int, int]) – (width, height) of the region to extract.

  • read_as ({'pil', 'numpy'}, optional) – Output format for the region:
    • ‘pil’: returns a PIL Image (default).
    • ‘numpy’: returns a NumPy array of shape (H, W, 3).

Returns:

Extracted image region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

Raises:

ValueError – If read_as is not one of ‘pil’ or ‘numpy’.

Example

>>> region = wsi.read_region((0, 0), level=0, size=(512, 512), read_as='numpy')
>>> print(region.shape)
(512, 512, 3)
release() None

Release internal data (CPU/GPU/memory) and clear heavy references in the WSI instance. Call this method after you’re done processing to avoid memory/GPU leaks.
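
When a slide object is used outside a with block, a try/finally guard is one way to make sure release() runs even if processing fails. The sketch below uses a hypothetical stand-in class (FakeWSI) and a hypothetical process helper so that it runs without a real slide; only the release() method name comes from this reference.

```python
class FakeWSI:
    """Hypothetical stand-in exposing only a release() method."""

    def __init__(self) -> None:
        self.released = False

    def release(self) -> None:
        self.released = True


def process(wsi) -> str:
    try:
        # ... segmentation, patching, or feature extraction would go here ...
        return "done"
    finally:
        # Runs on success and on error, so resources are always freed.
        wsi.release()


wsi = FakeWSI()
status = process(wsi)
print(status, wsi.released)
```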

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for semantic segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> predictions, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(predictions.shape, downscale)
segment_tissue(segmentation_model: SegmentationModel, target_mag: int = 10, holes_are_tissue: bool = True, job_dir: str | None = None, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None) str | GeoDataFrame

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. The segmented regions are saved as thumbnails and GeoJSON contours.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"
visualize_coords(coords_path: str, save_patch_viz: str) str

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png
class trident.OpenSlideWSIPatcher(*args, **kwargs)

Bases: WSIPatcher

Initialize patcher, compute number of (masked) rows, columns.

Parameters:
  • wsi (WSI) – WSI to patch.

  • patch_size (int) – Patch width/height in pixel on the slide after rescaling.

  • src_pixel_size (float, optional) – Pixel size in um/px of the slide before rescaling. Defaults to None. Deprecated, this argument will be removed in the next major version and will default to wsi.mpp.

  • dst_pixel_size (float, optional) – Pixel size in um/px of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • src_mag (int, optional) – Level0 magnification of the slide before rescaling. Defaults to None. Deprecated, this argument will be removed in the next major version and will default to wsi.mag.

  • dst_mag (int, optional) – Target magnification of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (gpd.GeoDataFrame, optional) – GeoPandas dataframe of Polygons. Defaults to None.

  • coords_only (bool, optional) – Whether to extract only the coordinates instead of coordinates + tile. Defaults to False.

  • custom_coords (array-like, optional) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Minimum proportion of the patch under tissue to be kept. This argument is ignored if mask is None; passing threshold=0 will be faster. Defaults to 0.0.

  • pil (bool, optional) – Whether to get patches as PIL.Image (numpy array by default). Defaults to False.

  • scan_order ({"row-major", "col-major"}, optional) – Scan order used when generating the default coordinate grid (only when custom_coords is None):
    • “row-major”: iterate row by row (Y -> X). Typically best for disk locality on tiled WSI formats.
    • “col-major”: iterate column by column (X -> Y). Legacy behavior.
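
The difference between the two scan orders can be shown on a tiny grid. This is a minimal sketch of the iteration order only; the grid helper is hypothetical and does not reproduce how the patcher builds its coordinates internally.

```python
from typing import List, Tuple


def grid(n_cols: int, n_rows: int, scan_order: str = "row-major") -> List[Tuple[int, int]]:
    """Return (col, row) pairs in the requested scan order."""
    if scan_order == "row-major":
        # Outer loop over rows (Y), inner loop over columns (X).
        return [(c, r) for r in range(n_rows) for c in range(n_cols)]
    if scan_order == "col-major":
        # Outer loop over columns (X), inner loop over rows (Y).
        return [(c, r) for c in range(n_cols) for r in range(n_rows)]
    raise ValueError(f"Unknown scan_order: {scan_order!r}")


print(grid(3, 2, "row-major"))
print(grid(3, 2, "col-major"))
```

In row-major order, consecutive patches are horizontal neighbors, which tends to match how tiled formats lay out data on disk.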

classmethod from_legacy_coords(wsi, patch_size, patch_level, custom_downsample, coords, coords_only=False, pil=False) WSIPatcher

Create a WSIPatcher from legacy coordinate parameters generated with CLAM or Fishing-Rod. Legacy coordinates are described by patch_level and custom_downsample rather than the newer patch_size and dst_mag/dst_pixel_size parameters.

Parameters:
  • wsi (WSI) – WSI to patch.

  • patch_size (int) – The target patch size at the desired magnification.

  • patch_level (int) – The patch level used when reading the slide.

  • custom_downsample (int) – Any additional downsampling applied to the patches.

  • coords (np.array) – An array of patch coordinates.

  • coords_only (bool, optional) – Whether to extract only coordinates. Defaults to False.

  • pil (bool, optional) – Whether to get patches as PIL.Image. Defaults to False.

Returns:

WSIPatcher created from the given legacy coordinates.

Return type:

WSIPatcher

classmethod from_legacy_coords_file(wsi, coords_path, coords_only=False, pil=False) WSIPatcher

Create a WSIPatcher from a legacy coordinates file generated with CLAM or Fishing-Rod.

Parameters:
  • wsi (WSI) – WSI to patch.

  • coords_path (str) – Path to legacy coordinates stored as .h5.

  • coords_only (bool, optional) – Whether the legacy coordinates file only contains coordinates or if it also contains images. Defaults to False.

  • pil (bool, optional) – PIL argument passed to the WSIPatcher constructor. Defaults to False.

Returns:

WSIPatcher created from the given legacy coordinates.

Return type:

WSIPatcher

get_cols_rows() Tuple[int, int]

Get the number of columns and rows in the associated WSI.

Returns:

(nb_columns, nb_rows).

Return type:

Tuple[int, int]

get_tile(col: int, row: int) Tuple[ndarray, int, int]

Get tile at position (column, row).

Parameters:
  • col (int) – Column.

  • row (int) – Row.

Returns:

(tile, pixel x of top-left corner (before rescaling), pixel y of top-left corner (before rescaling)).

Return type:

Tuple[np.ndarray, int, int]

get_tile_xy(x: int, y: int) Tuple[ndarray, int, int]
visualize() Image

Overlay patch coordinates computed by the WSIPatcher onto a scaled thumbnail of the WSI. It creates a visualization of the patcher coordinates and returns it as an image.

Returns:

Patch visualization.

Return type:

Image.Image

Example

>>> img = wsi_patcher.visualize()
>>> img.save('test_vis.jpg')
class trident.Processor(job_dir: str, wsi_source: str, wsi_ext: List[str] | None = None, wsi_cache: str | None = None, clear_cache: bool = False, skip_errors: bool = False, custom_mpp_keys: List[str] | None = None, custom_list_of_wsis: str | None = None, max_workers: int | None = None, reader_type: Literal['openslide', 'image', 'cucim', 'sdpc', 'omezarr', 'czi'] | None = None, search_nested: bool = False, selected_wsi_paths: List[str] | None = None)

Bases: object

The Processor class handles all preprocessing steps starting from whole-slide images (WSIs).

Available methods:
  • run_segmentation_job: Performs tissue segmentation on all slides managed by the processor.

  • run_patching_job: Extracts patch coordinates from the segmented tissue regions of slides.

  • run_patch_feature_extraction_job: Extracts patch-level features using a specified patch encoder.
    • Deprecated alias: run_feature_extraction_job

  • run_slide_feature_extraction_job: Extracts slide-level features using a specified slide encoder.

Parameters:
  • job_dir (str) – The directory where the results of processing, including segmentations, patches, and extracted features, will be saved. This should be an existing directory with sufficient storage.

  • wsi_source (str) – The directory containing the WSIs to be processed. This can either be a local directory or a network-mounted drive. All slides in this directory matching the specified file extensions will be considered for processing.

  • wsi_ext (List[str], optional) – A list of accepted WSI file extensions, such as [‘.ndpi’, ‘.svs’]. This allows for filtering slides based on their format. If set to None, a default list of common extensions will be used. Defaults to None.

  • wsi_cache (str, optional) – [DEPRECATED as of v0.2.0] An optional directory for caching WSIs locally. If specified, slides will be copied from the source directory to this local directory before processing, improving performance when the source is a network drive. Defaults to None.

  • clear_cache (bool, optional) – [DEPRECATED as of v0.2.0] A flag indicating whether slides in the cache should be deleted after processing. This helps manage storage space. Defaults to False.

  • skip_errors (bool, optional) – A flag specifying whether to continue processing if an error occurs on a slide. If set to False, the process will stop on the first error. Defaults to False.

  • custom_mpp_keys (List[str], optional) – A list of custom keys in the slide metadata for retrieving the microns per pixel (MPP) value. If not provided, standard keys will be used. Defaults to None.

  • custom_list_of_wsis (str, optional) – Path to a csv file with a custom list of WSIs to process in a field called ‘wsi’ (including extensions). If provided, only these slides will be considered for processing. Defaults to None, which means all slides matching the wsi_ext extensions will be processed. Note: If custom_list_of_wsis is provided, any names that do not match the available slides will be ignored, and a warning will be printed.

  • max_workers (int, optional) – Maximum number of workers for data loading. If None, the default behavior will be used. Defaults to None.

  • reader_type (WSIReaderType, optional) – Force the image reader engine to use. Options are [“openslide”, “image”, “cucim”, “sdpc”, “omezarr”, “czi”]. Defaults to None (auto-determine the right engine based on image extension).

  • search_nested (bool, optional) – If True, the processor will recursively search for WSIs within all subdirectories of wsi_source. All matching files (based on wsi_ext) found at any depth within the directory tree will be included. Each slide will be identified by its relative path to wsi_source, but only the filename (excluding directory structure) will be used for downstream outputs (e.g., segmentation filenames). If False, only files directly inside wsi_source will be considered. Defaults to False.

  • selected_wsi_paths (List[str], optional) – Optional explicit list of slide paths to process. When provided, slide discovery is skipped and only these slides are loaded. Defaults to None.
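
Because search_nested identifies slides by relative path but names downstream outputs by filename only, it can be useful to check a dataset for duplicate filenames before a run. The sketch below is a standalone pre-check, not TRIDENT's discovery code: find_slides and basename_collisions are hypothetical helpers, and whether TRIDENT disambiguates such collisions itself is not stated in this section.

```python
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def find_slides(wsi_source: str, wsi_ext: List[str], search_nested: bool) -> List[str]:
    """List slide paths relative to wsi_source, optionally recursing into subfolders."""
    root = Path(wsi_source)
    candidates = root.rglob("*") if search_nested else root.glob("*")
    exts = {e.lower() for e in wsi_ext}
    return sorted(
        p.relative_to(root).as_posix()
        for p in candidates
        if p.is_file() and p.suffix.lower() in exts
    )


def basename_collisions(rel_paths: List[str]) -> Dict[str, List[str]]:
    """Group relative paths that share the same filename."""
    groups = defaultdict(list)
    for rel in rel_paths:
        groups[Path(rel).name].append(rel)
    return {name: paths for name, paths in groups.items() if len(paths) > 1}


with tempfile.TemporaryDirectory() as tmp:
    for rel in ["a.svs", "site1/b.svs", "site2/b.svs", "notes.txt"]:
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    flat = find_slides(tmp, [".svs"], search_nested=False)
    nested = find_slides(tmp, [".svs"], search_nested=True)
    collisions = basename_collisions(nested)

print(flat, nested, collisions)
```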

Returns:

This method initializes the class instance and sets up the environment for processing.

Return type:

None

Example

Initialize the Processor for a directory of WSIs:

>>> processor = Processor(
...     job_dir="results/",
...     wsi_source="data/slides/",
...     wsi_ext=[".svs", ".ndpi"],
... )
>>> print(f"Processor initialized for {len(processor.wsis)} slides.")
Raises:

AssertionError – If wsi_ext is not a list or if any extension does not start with a period.
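
A CSV for custom_list_of_wsis needs a ‘wsi’ column with filenames including extensions. The sketch below writes such a file with the standard library and mirrors the documented wsi_ext constraints as a pre-check. The helper names (write_wsi_list, check_extensions) and the slide names are hypothetical; the checks restate the documented rules and are not copied from TRIDENT's source.

```python
import csv
import tempfile
from pathlib import Path
from typing import List


def check_extensions(wsi_ext: List[str]) -> None:
    """Mirror the documented constraints: a list of extensions starting with a period."""
    assert isinstance(wsi_ext, list), "wsi_ext must be a list"
    for ext in wsi_ext:
        assert ext.startswith("."), f"extension {ext!r} must start with a period"


def write_wsi_list(csv_path: str, slide_names: List[str]) -> None:
    """Write a one-column CSV whose header is 'wsi'."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["wsi"])
        for name in slide_names:
            writer.writerow([name])


with tempfile.TemporaryDirectory() as tmp:
    csv_path = str(Path(tmp) / "subset.csv")
    write_wsi_list(csv_path, ["slide_001.svs", "slide_002.ndpi"])
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))

check_extensions([".svs", ".ndpi"])
print([row["wsi"] for row in rows])
```

The resulting path would then be passed as custom_list_of_wsis when constructing the Processor.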

__init__(job_dir: str, wsi_source: str, wsi_ext: List[str] | None = None, wsi_cache: str | None = None, clear_cache: bool = False, skip_errors: bool = False, custom_mpp_keys: List[str] | None = None, custom_list_of_wsis: str | None = None, max_workers: int | None = None, reader_type: Literal['openslide', 'image', 'cucim', 'sdpc', 'omezarr', 'czi'] | None = None, search_nested: bool = False, selected_wsi_paths: List[str] | None = None) None

release() None

Release all resources tied to the WSIs held by this Processor instance. Frees memory, closes file handles, and clears GPU memory. Should be called after processing is complete to avoid memory leaks.

run_feature_extraction_job(coords_dir: str, patch_encoder: torch.nn.Module, device: str, saveas: str = 'h5', batch_limit: int = 512, saveto: str | None = None) str
run_patch_feature_extraction_job(coords_dir: str, patch_encoder: torch.nn.Module, device: str, saveas: str = 'h5', batch_limit: int = 512, saveto: str | None = None) str

The run_patch_feature_extraction_job function computes features from the patches generated during the patching step. These features are extracted using a deep learning model and saved in a specified format. This step is often used in workflows that involve downstream analysis, such as classification or clustering. run_feature_extraction_job is a deprecated alias with the same signature.

Parameters:
  • coords_dir (str) – Path to the directory containing patch coordinates, which are used to locate patches for feature extraction.

  • patch_encoder (torch.nn.Module) – A pre-trained PyTorch model used to compute features from the extracted patches.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • saveas (str, optional) – The format in which extracted features are saved. Can be ‘h5’ or ‘pt’. Defaults to ‘h5’.

  • batch_limit (int, optional) – The maximum number of patches processed in a single batch. Defaults to 512.

  • saveto (str, optional) – Directory where the extracted features will be saved. If not provided, a directory name will be generated automatically. Defaults to None.

Returns:

The absolute path to where the features are saved.

Return type:

str

Example

Extract features from patches using a pre-trained encoder:

>>> from trident.patch_encoder_models.load import encoder_factory
>>> encoder = encoder_factory("uni_v1")
>>> processor.run_patch_feature_extraction_job(
...     coords_dir="output/patch_coords/",
...     patch_encoder=encoder,
...     device="cuda:0"
... )
run_patching_job(target_magnification: int, patch_size: int, overlap: int = 0, saveto: str | None = None, visualize: bool = True, min_tissue_proportion: float = 0.0, dump_patches: bool = False, dump_patches_max: int = 0, dump_patches_format: str = 'png', dump_patches_jpeg_quality: int = 90) str

The run_patching_job function extracts patches from the segmented tissue regions of slides. These patches are saved as coordinates in an h5 file for each slide.

Parameters:
  • target_magnification (int) – The magnification level for extracting patches. Higher magnifications result in smaller but more detailed patches.

  • patch_size (int) – The size of each patch in pixels. This refers to the dimensions of the patch at the target magnification.

  • overlap (int, optional) – The amount of overlap between adjacent patches, specified in pixels. Defaults to 0.

  • saveto (str, optional) – The directory where patch data and visualizations will be saved (relative to job_dir). If not provided, a directory name will be generated automatically. Defaults to None.

  • visualize (bool, optional) – Whether to generate and save visualizations of the patches. Defaults to True.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

  • dump_patches (bool, optional) – If True, also writes patch images to disk under <saveto>/patch_images/<slide_name>/ for debugging. Defaults to False.

  • dump_patches_max (int, optional) – Maximum number of patch images to write per slide (0 = no limit). Defaults to 0.

  • dump_patches_format (str, optional) – Image format for dumped patches: png or jpg. Defaults to png.

  • dump_patches_jpeg_quality (int, optional) – JPEG quality (1-100) when dump_patches_format is jpg. Defaults to 90.

Returns:

Absolute path to directory containing patch coordinates.

Return type:

str

Example

Extract patches with a size of 256x256 pixels at 20x magnification:

>>> processor.run_patching_job(
...     target_magnification=20,
...     patch_size=256,
...     overlap=32,
...     saveto="output/patches/"
... )
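
The effect of min_tissue_proportion can be illustrated on a toy binary mask. This is a simplified sketch, not TRIDENT's implementation, which works on polygon contours rather than a pixel grid. The helper names are hypothetical, and the rule used here (keep a patch when its tissue fraction is positive and at least the threshold) is an assumption.

```python
from typing import List, Sequence, Tuple


def tissue_proportion(mask: Sequence[Sequence[int]], x: int, y: int, size: int) -> float:
    """Fraction of the size x size window at (x, y) covered by tissue (mask == 1)."""
    covered = sum(mask[r][c] for r in range(y, y + size) for c in range(x, x + size))
    return covered / (size * size)


def keep_patches(
    mask: Sequence[Sequence[int]],
    coords: Sequence[Tuple[int, int]],
    size: int,
    min_tissue_proportion: float = 0.0,
) -> List[Tuple[int, int]]:
    kept = []
    for x, y in coords:
        proportion = tissue_proportion(mask, x, y, size)
        # Assumption: patches with no tissue at all are always dropped.
        if proportion > 0 and proportion >= min_tissue_proportion:
            kept.append((x, y))
    return kept


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
coords = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(keep_patches(mask, coords, size=2))
print(keep_patches(mask, coords, size=2, min_tissue_proportion=0.5))
```

Raising the threshold removes patches that only touch the tissue border, which reduces the number of mostly-background patches passed to the encoder.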
run_segmentation_job(segmentation_model: torch.nn.Module, seg_mag: int = 10, holes_are_tissue: bool = False, batch_size: int = 16, artifact_remover_model: torch.nn.Module = None, device: str = 'cuda:0') str

The run_segmentation_job function performs tissue segmentation on all slides managed by the processor. It uses a machine learning model to identify tissue regions and saves the resulting segmentations to the output directory. This function is essential for workflows that require detailed tissue delineation.

Parameters:
  • segmentation_model (torch.nn.Module) – A pre-trained PyTorch model that performs the tissue segmentation. This model should be compatible with the expected input data format of WSIs.

  • seg_mag (int, optional) – The magnification level at which segmentation is performed. For example, a value of 10 indicates 10x magnification. Defaults to 10.

  • holes_are_tissue (bool, optional) – Specifies whether to treat holes within tissue regions as part of the tissue. Defaults to False.

  • batch_size (int, optional) – The batch size for segmentation. Defaults to 16.

  • artifact_remover_model (torch.nn.Module, optional) – A pre-trained PyTorch model that can remove artifacts from an existing segmentation. Defaults to None.

  • device (str, optional) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU). Defaults to ‘cuda:0’.

Returns:

Absolute path to the directory where the contours are saved.

Return type:

str

Example

Run a segmentation job with a pre-trained model:

>>> from trident.segmentation_models.load import segmentation_model_factory
>>> model = segmentation_model_factory("grandqc", confidence_thresh=0.5)
>>> processor.run_segmentation_job(segmentation_model=model, seg_mag=20)
run_slide_feature_extraction_job(slide_encoder: torch.nn.Module, coords_dir: str, device: str = 'cuda', batch_limit: int = 512, saveas: str = 'h5', saveto: str | None = None) None

Extract slide-level features from whole-slide images (WSIs) using a specified slide encoder.

This function generates embeddings for WSIs by first ensuring that the patch-level features required by the slide encoder are available. If patch features are missing, they are extracted with a patch encoder that is inferred automatically. The extracted slide features are saved in the specified format and directory.

Parameters:
  • slide_encoder (torch.nn.Module) – The slide encoder model used for generating slide-level features from patch-level features.

  • coords_dir (str) – Directory containing coordinates and features required for processing WSIs.

  • device (str, optional) – Device to use for computations (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

  • batch_limit (int, optional) – Maximum number of features processed in a batch during patch feature extraction. Defaults to 512.

  • saveas (str, optional) – File format to save slide features (e.g., ‘h5’). Defaults to ‘h5’.

  • saveto (str | None, optional) – Directory to save extracted slide features. If None, the directory is auto-generated based on coords_dir and slide_encoder. Defaults to None.

Returns:

The absolute path to where the slide embeddings are saved.

Return type:

str

Workflow:
  1. Verify the compatibility of the slide encoder and patch features.

  2. Check if patch-level features are already extracted for all WSIs. If not, extract them.

  3. Save the configuration for slide feature extraction to maintain reproducibility.

  4. Process each WSI:
    • Skip if patch features required for the WSI are missing.

    • Extract slide features, ensuring proper synchronization in multiprocessing setups.

  5. Log the progress and errors during processing.

Notes

  • Patch features are expected in a specific directory structure under coords_dir.

  • Slide features are saved in the format specified by saveas.

  • Errors can be optionally skipped based on the self.skip_errors attribute.

Raises:

Exception – Propagates exceptions unless self.skip_errors is set to True.
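
The skip_errors behavior described above follows a common per-item error-handling pattern, sketched here in isolation. The function run_over_slides, the fake_encode callback, and the report structure are hypothetical; TRIDENT's own logging and state tracking are not reproduced.

```python
from typing import Callable, Dict, Iterable, List


def run_over_slides(
    slides: Iterable[str],
    process_one: Callable[[str], None],
    skip_errors: bool = False,
) -> Dict[str, List[str]]:
    """Process each slide; either stop at the first error or record it and continue."""
    report: Dict[str, List[str]] = {"done": [], "failed": []}
    for slide in slides:
        try:
            process_one(slide)
            report["done"].append(slide)
        except Exception as exc:
            if not skip_errors:
                # Propagate the first failure to the caller.
                raise
            report["failed"].append(f"{slide}: {exc}")
    return report


def fake_encode(slide: str) -> None:
    # Hypothetical stand-in for slide feature extraction.
    if slide == "bad.svs":
        raise RuntimeError("missing patch features")


report = run_over_slides(["a.svs", "bad.svs", "c.svs"], fake_encode, skip_errors=True)
print(report)
```

With skip_errors=False the same call would raise the RuntimeError as soon as the failing slide is reached, leaving later slides unprocessed.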

save_config(saveto: str, local_attrs: Dict[str, Any] | None = None, ignore: List[str] = ['valid_slides']) None

The save_config function saves the current configuration of the Processor instance to a JSON file. This configuration includes attributes of the instance as well as optional additional parameters provided via the local_attrs argument.

The function filters out attributes specified in the ignore list and ensures that only JSON-serializable attributes are included. This makes it ideal for saving configurations in a structured format that can later be reloaded or inspected for reproducibility.

Parameters:
  • saveto (str) – The path to the file where the configuration will be saved. This should include the file extension (e.g., “config.json”).

  • local_attrs (dict, optional) – A dictionary of additional attributes to include in the configuration. This can be used to add method-specific parameters or runtime settings. Defaults to None.

  • ignore (list, optional) – A list of attribute names to exclude from the configuration. This is useful for omitting large or non-serializable objects. Defaults to [‘valid_slides’].

Returns:

The function saves the configuration to the specified file and does not return any value.

Return type:

None

Example

Save the current processor configuration to a file:

>>> import json
>>> processor.save_config(saveto="output/config.json")
>>> # Check the saved configuration
>>> with open("output/config.json", "r") as f:
...     config = json.load(f)
...     print(config)
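
The filtering described for save_config (drop ignored names, keep only JSON-serializable values) can be sketched as follows. This is an illustration of the idea under stated assumptions, not the library's code: the function serializable_config is hypothetical, and applying the ignore list to local_attrs as well as instance attributes is a choice made for this sketch.

```python
import json
from typing import Any, Dict, List, Optional


def serializable_config(
    attrs: Dict[str, Any],
    local_attrs: Optional[Dict[str, Any]] = None,
    ignore: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Merge attributes, drop ignored keys, and keep only JSON-serializable values."""
    ignore = ["valid_slides"] if ignore is None else ignore
    merged = {**attrs, **(local_attrs or {})}
    config = {}
    for key, value in merged.items():
        if key in ignore:
            continue
        try:
            # Probe serializability; json.dumps raises on unsupported objects.
            json.dumps(value)
        except (TypeError, ValueError):
            continue
        config[key] = value
    return config


attrs = {
    "job_dir": "results/",
    "skip_errors": True,
    "valid_slides": ["a.svs"],
    "loader": object(),
}
config = serializable_config(attrs, local_attrs={"patch_size": 256})
print(json.dumps(config, indent=2))
```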
class trident.SDPCWSI(slide_path, **kwargs)

Bases: WSI

Initialize an SDPCWSI instance.

Parameters:
  • slide_path (str) – Path to the WSI file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important key is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Please refer to the WSI constructor for all parameters.

Example

>>> wsi = SDPCWSI(slide_path="path/to/wsi.sdpc", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=SDPCWSI, mpp=0.25, mag=40>
__init__(slide_path, **kwargs) None

Initialize an SDPCWSI instance.

Parameters:
  • slide_path (str) – Path to the WSI file.

  • **kwargs (dict) – Keyword arguments forwarded to the base WSI class. The most important key is lazy_init (bool, default=True), which controls whether loading of the WSI and its metadata is deferred.

Please refer to the WSI constructor for all parameters.

Example

>>> wsi = SDPCWSI(slide_path="path/to/wsi.sdpc", lazy_init=False)
>>> print(wsi)
<width=100000, height=80000, backend=SDPCWSI, mpp=0.25, mag=40>
create_patcher(patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: ndarray | None = None, threshold: float = 0.15, pil: bool = False) WSIPatcher

Create a patcher object for extracting patches from the WSI.

Parameters:
  • patch_size (int) – Size of each patch in pixels.

  • src_pixel_size (float, optional) – Source pixel size. Defaults to None.

  • dst_pixel_size (float, optional) – Destination pixel size. Defaults to None.

  • src_mag (int, optional) – Source magnification. Defaults to None.

  • dst_mag (int, optional) – Destination magnification. Defaults to None.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (Optional[gpd.GeoDataFrame]) – Mask for patching. Defaults to None.

  • coords_only (bool, optional) – Whether to only return coordinates. Defaults to False.

  • custom_coords (Optional[np.ndarray]) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Threshold for tissue detection. Defaults to 0.15.

  • pil (bool, optional) – Whether to use PIL for image reading. Defaults to False.

Returns:

An object for extracting patches.

Return type:

WSIPatcher

Example

>>> patcher = wsi.create_patcher(patch_size=512, src_pixel_size=0.25, dst_pixel_size=0.5)
>>> for patch in patcher:
...     process(patch)
dump_patches(coords_path: str, save_patches_dir: str, max_patches: int = 0, image_format: str = 'png', jpeg_quality: int = 90) str

Dump patch images to disk for debugging/inspection.

This reads a Trident coords H5 file (or legacy coords if needed), iterates the corresponding patches, and writes them under save_patches_dir/<slide_name>/.

Parameters:
  • coords_path (str) – Path to a coords .h5 file produced by TRIDENT.

  • save_patches_dir (str) – Output directory to store patch images.

  • max_patches (int, optional) – If > 0, cap the number of patches written. Defaults to 0 (no cap).

  • image_format ({"png", "jpg"}, optional) – Image format to write. Defaults to “png”.

  • jpeg_quality (int, optional) – JPEG quality (1-100). Only used when image_format=”jpg”. Defaults to 90.

Returns:

Directory where patches were written.

Return type:

str

extract_patch_features(patch_encoder: Module, coords_path: str, save_features: str, device: str = 'cuda:0', saveas: str = 'h5', batch_limit: int = 512, verbose: bool = False) str

Extract feature embeddings from the WSI using a specified patch encoder.

Parameters:
  • patch_encoder (torch.nn.Module) – The model used for feature extraction.

  • coords_path (str) – Path to the file containing patch coordinates.

  • save_features (str) – Directory path to save the extracted features.

  • device (str, optional) – Device to run feature extraction on (e.g., ‘cuda:0’). Defaults to ‘cuda:0’.

  • saveas (str, optional) – Format to save the features (‘h5’ or ‘pt’). Defaults to ‘h5’.

  • batch_limit (int, optional) – Maximum batch size for feature extraction. Defaults to 512.

  • verbose (bool, optional) – Whether to print patch embedding progress. Defaults to False.

Returns:

The absolute file path to the saved feature file in the specified format.

Return type:

str

Example

>>> features_path = wsi.extract_patch_features(patch_encoder, "output_coords/sample_name_patches.h5", "output_features")
>>> print(features_path)
output_features/sample_name.h5
extract_slide_features(patch_features_path: str, slide_encoder: Module, save_features: str, device: str = 'cuda') str

Extract slide-level features by encoding patch-level features using a pretrained slide encoder.

This function processes patch-level features extracted from a whole-slide image (WSI) and generates a single feature vector representing the entire slide. The extracted features are saved to a specified directory in HDF5 format.

Parameters:
  • patch_features_path (str) – Path to the HDF5 file containing patch-level features and coordinates.

  • slide_encoder (torch.nn.Module) – Pretrained slide encoder model for generating slide-level features.

  • save_features (str) – Directory where the extracted slide features will be saved.

  • device (str, optional) – Device to run computations on (e.g., ‘cuda’, ‘cpu’). Defaults to ‘cuda’.

Returns:

The absolute path to the slide-level features.

Return type:

str

Workflow:
  1. Load the pretrained slide encoder model and set it to evaluation mode.

  2. Load patch-level features and corresponding coordinates from the provided HDF5 file.

  3. Convert patch-level features into a tensor and move it to the specified device.

  4. Generate slide-level features using the slide encoder, with automatic mixed precision if supported.

  5. Save the slide-level features and associated metadata (e.g., coordinates) in an HDF5 file.

  6. Return the path to the saved slide features.

Raises:
  • FileNotFoundError – If the patch_features_path does not exist.

  • RuntimeError – If there is an issue with the slide encoder or tensor operations.

Example

>>> slide_features_path = wsi.extract_slide_features(
...     patch_features_path='path/to/patch_features.h5',
...     slide_encoder=pretrained_model,
...     save_features='output/slide_features',
...     device='cuda'
... )
>>> print(slide_features_path)  # Path to the saved slide-level features.
extract_tissue_coords(target_mag: int, patch_size: int, save_coords: str, overlap: int = 0, min_tissue_proportion: float = 0.0) str

Extract patch coordinates from tissue regions in the WSI. It generates coordinates of patches at the specified magnification and saves the results in an HDF5 file.

Parameters:
  • target_mag (int) – Target magnification level for the patches.

  • patch_size (int) – Size of each patch at the target magnification.

  • save_coords (str) – Directory path to save the extracted coordinates.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • min_tissue_proportion (float, optional) – Minimum proportion of the patch under tissue to be kept. Defaults to 0.

Returns:

The absolute file path to the saved HDF5 file containing the patch coordinates.

Return type:

str

Example

>>> coords_path = wsi.extract_tissue_coords(20, 256, "output_coords", overlap=32)
>>> print(coords_path)
output_coords/patches/sample_name_patches.h5
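
To make the relationship between target_mag, patch_size, and overlap concrete, here is a simplified standalone sketch of a regular grid expressed in level-0 pixels. It is an illustration under stated assumptions rather than TRIDENT's code: the function name grid_coords_level0 is hypothetical, the slide's native magnification is passed in as src_mag, no tissue mask is applied, and only patches that fit entirely inside the slide are kept.

```python
from typing import List, Tuple


def grid_coords_level0(
    width: int,
    height: int,
    src_mag: int,
    target_mag: int,
    patch_size: int,
    overlap: int = 0,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Return the level-0 patch size and the top-left corners of a regular grid."""
    scale = src_mag / target_mag                      # e.g. 40x -> 20x gives 2.0
    patch_size_src = round(patch_size * scale)        # patch footprint at level 0
    step_src = round((patch_size - overlap) * scale)  # stride at level 0
    coords = [
        (x, y)
        for y in range(0, height - patch_size_src + 1, step_src)
        for x in range(0, width - patch_size_src + 1, step_src)
    ]
    return patch_size_src, coords


patch_size_src, coords = grid_coords_level0(
    width=2048, height=1024, src_mag=40, target_mag=20, patch_size=256
)
print(patch_size_src, len(coords))  # 512 8
```

A 256-pixel patch at 20x therefore covers 512 pixels of a 40x slide, and a non-zero overlap shrinks the stride without changing the patch footprint.
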
get_best_level_and_custom_downsample(downsample: float, tolerance: float = 0.01) Tuple[int, float]

Determine the best level and custom downsample factor to approximate a desired downsample value.

Parameters:
  • downsample (float) – The desired downsample factor.

  • tolerance (float, optional) – Tolerance for rounding differences. Defaults to 0.01.

Returns:

The closest resolution level and the custom downsample factor.

Return type:

Tuple[int, float]

Raises:

ValueError – If no suitable resolution level is found for the specified downsample factor.

Example

>>> level, custom_downsample = wsi.get_best_level_and_custom_downsample(2.5)
>>> print(level, custom_downsample)
2 1.1
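
The idea of pairing a pyramid level with a residual downsample can be sketched as follows. This is a minimal illustration, not TRIDENT's implementation: the function name, the explicit level_downsamples argument, and the rule "pick the coarsest level that is still finer than the target" are all assumptions.

```python
from typing import Sequence, Tuple


def best_level_and_custom_downsample(
    level_downsamples: Sequence[float],
    downsample: float,
    tolerance: float = 0.01,
) -> Tuple[int, float]:
    """Pick a pyramid level and the extra downsample needed to reach the target."""
    # A level whose native downsample matches the target needs no extra rescaling.
    for level, level_ds in enumerate(level_downsamples):
        if abs(level_ds - downsample) <= tolerance:
            return level, 1.0
    # Otherwise read from the coarsest level that is still finer than the target
    # and shrink the result by the remaining factor.
    finer = [(lvl, ds) for lvl, ds in enumerate(level_downsamples) if ds < downsample]
    if not finer:
        raise ValueError(f"No suitable level for downsample {downsample}")
    level, level_ds = max(finer, key=lambda item: item[1])
    return level, downsample / level_ds


print(best_level_and_custom_downsample([1.0, 4.0, 16.0], 8.0))  # (1, 2.0)
```

Reading from the nearest finer level and shrinking keeps the read small while avoiding upsampling from a coarser level.
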
get_dimensions() Tuple[int, int]

Return the dimensions (width, height) of the WSI.

Returns:

(width, height) in pixels.

Return type:

tuple[int, int]

get_thumbnail(size: tuple[int, int]) Image

Generate a thumbnail of the WSI.

Parameters:

size (tuple[int, int]) – Desired (width, height) of the thumbnail.

Returns:

RGB thumbnail as a PIL Image.

Return type:

PIL.Image.Image

read_region(location: Tuple[int, int], level: int, size: Tuple[int, int], read_as: Literal['pil', 'numpy'] = 'pil') Image | ndarray

Extract a specific region from the whole-slide image (WSI).

Parameters:
  • location (Tuple[int, int]) – (x, y) coordinates of the top-left corner of the region to extract.

  • level (int) – Pyramid level to read from.

  • size (Tuple[int, int]) – (width, height) of the region to extract.

  • read_as ({'pil', 'numpy'}, optional) – Output format for the region:

      - ‘pil’: returns a PIL Image (default)

      - ‘numpy’: returns a NumPy array (H, W, 3)

Returns:

Extracted image region in the specified format.

Return type:

Union[PIL.Image.Image, np.ndarray]

Raises:

ValueError – If read_as is not one of ‘pil’ or ‘numpy’.

Example

>>> region = wsi.read_region((0, 0), level=0, size=(512, 512), read_as='numpy')
>>> print(region.shape)
(512, 512, 3)
release() None

Release internal data (CPU/GPU/memory) and clear heavy references in the WSI instance. Call this method after you’re done processing to avoid memory/GPU leaks.

segment_semantic(segmentation_model: SegmentationModel, target_mag: int = 10, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None, collate_fn=None, inference_fn=None, return_contours=False) Tuple[ndarray, float] | Tuple[ndarray, float, GeoDataFrame]

Segment semantic regions in the WSI using a specified segmentation model.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

  • collate_fn (optional) – Custom collate function used in the dataloader. It must return a dictionary containing at least xcoords and ycoords (level-0 coordinates), and img if inference_fn is not provided.

  • inference_fn (optional) – Function used during inference. Called as inference_fn(model, batch, device) where batch is the batch returned by collate_fn (if provided) or (img, (xcoords, ycoords)) otherwise. Must return a tensor with shape (B, H, W) and dtype uint8.

  • return_contours (bool, optional) – Whether to return the contours of each class in a GeoDataFrame. Defaults to False.

Returns:

A downscaled H x W np.ndarray containing class predictions and its downscale factor. If return_contours is True, also returns the contours of each class in a GeoDataFrame.

Return type:

Union[Tuple[np.ndarray, float], Tuple[np.ndarray, float, gpd.GeoDataFrame]]

Example

>>> predictions, downscale = wsi.segment_semantic(segmentation_model, target_mag=10)
>>> print(predictions.shape)  # Downscaled H x W array of class predictions
segment_tissue(segmentation_model: SegmentationModel, target_mag: int = 10, holes_are_tissue: bool = True, job_dir: str | None = None, batch_size: int = 16, device: str = 'cuda:0', verbose=False, num_workers=None) str | GeoDataFrame

Segment tissue regions in the WSI using a specified segmentation model. It processes the WSI at a target magnification level, optionally treating holes in the mask as tissue. The segmented regions are saved as thumbnails and GeoJSON contours.

Parameters:
  • segmentation_model (SegmentationModel) – The model used for tissue segmentation.

  • target_mag (int, optional) – Target magnification level for segmentation. Defaults to 10.

  • holes_are_tissue (bool, optional) – Whether to treat holes in the mask as tissue. Defaults to True.

  • job_dir (Optional[str], optional) – Directory to save the segmentation results. If None, this method directly returns the contours as a GeoDataFrame without saving files. Defaults to None.

  • batch_size (int, optional) – Batch size for processing patches. Defaults to 16.

  • device (str) – The computation device to use (e.g., ‘cuda:0’ for GPU or ‘cpu’ for CPU).

  • verbose (bool, optional) – Whether to print segmentation progress. Defaults to False.

  • num_workers (Optional[int], optional) – Number of workers to use for the tile dataloader. If None, the number of workers is automatically inferred. Defaults to None.

Returns:

The absolute path to the GeoJSON if job_dir is not None; otherwise a GeoDataFrame.

Return type:

Union[str, gpd.GeoDataFrame]

Example

>>> wsi.segment_tissue(segmentation_model, target_mag=10, job_dir="output_dir")
>>> # Results saved in "output_dir"
visualize_coords(coords_path: str, save_patch_viz: str) str

Overlay patch coordinates onto a scaled thumbnail of the WSI.

Parameters:
  • coords_path (str) – Path to the file containing the patch coordinates.

  • save_patch_viz (str) – Directory path to save the visualization image.

Returns:

The file path to the saved visualization image.

Return type:

str

Example

>>> viz_path = wsi.visualize_coords("output_coords/sample_name_patches.h5", "output_viz")
>>> print(viz_path)
output_viz/sample_name.png
class trident.WSIPatcher(wsi: Any, patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: Any | None = None, threshold: float = 0.0, pil: bool = False, scan_order: Literal['row-major', 'col-major'] = 'row-major')

Bases: object

Iterator class for extracting patches from Whole Slide Images (WSIs).

This class provides an efficient way to iterate over patches from a WSI, with support for:

  • Automatic scaling between different magnifications/pixel sizes

  • Tissue mask filtering to exclude background regions

  • Overlap control between adjacent patches

  • Custom coordinate specification

  • Both coordinate-only and full patch extraction modes

The patcher automatically handles the complex calculations needed to extract patches at the desired magnification while respecting tissue boundaries and overlap requirements.

wsi

The WSI object to extract patches from.

Type:

WSI

patch_size_target

Target patch size in pixels after rescaling.

Type:

int

overlap

Overlap between patches in pixels.

Type:

int

width

Width of the WSI at the source magnification.

Type:

int

height

Height of the WSI at the source magnification.

Type:

int

mask

Tissue mask for filtering patches.

Type:

Optional[gpd.GeoDataFrame]

coords_only

Whether to return only coordinates or full patches.

Type:

bool

pil

Whether to return patches as PIL Images or numpy arrays.

Type:

bool

Example

Basic patch extraction:

>>> patcher = WSIPatcher(wsi, patch_size=512, dst_mag=20)
>>> for patch, coords in patcher:
...     process_patch(patch, coords)

Extract only coordinates:

>>> patcher = WSIPatcher(wsi, patch_size=512, dst_mag=20, coords_only=True)
>>> coordinates = list(patcher)

Use with tissue mask:

>>> patcher = WSIPatcher(wsi, patch_size=512, dst_mag=20, mask=tissue_mask)
>>> for patch, coords in patcher:
...     # Only tissue patches are returned
...     process_patch(patch, coords)

Initialize patcher, compute number of (masked) rows, columns.

Parameters:
  • wsi (WSI) – WSI to patch.

  • patch_size (int) – Patch width/height in pixels on the slide after rescaling.

  • src_pixel_size (float, optional) – Pixel size in um/px of the slide before rescaling. Defaults to None. Deprecated: this argument will be removed in the next major version, and the value will default to wsi.mpp.

  • dst_pixel_size (float, optional) – Pixel size in um/px of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • src_mag (int, optional) – Level-0 magnification of the slide before rescaling. Defaults to None. Deprecated: this argument will be removed in the next major version, and the value will default to wsi.mag.

  • dst_mag (int, optional) – Target magnification of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (gpd.GeoDataFrame, optional) – GeoPandas dataframe of Polygons. Defaults to None.

  • coords_only (bool, optional) – Whether to extract only the coordinates instead of coordinates + tile. Defaults to False.

  • custom_coords (array-like, optional) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Minimum proportion of the patch under tissue to be kept. This argument is ignored if mask is None; passing threshold=0 will be faster. Defaults to 0.0.

  • pil (bool, optional) – Whether to get patches as PIL.Image (numpy array by default). Defaults to False.

  • scan_order ({"row-major", "col-major"}, optional) – Scan order used when generating the default coordinate grid (only when custom_coords is None). Defaults to “row-major”.

      - “row-major”: iterate row by row (Y -> X). Typically best for disk locality on tiled WSI formats.

      - “col-major”: iterate column by column (X -> Y). Legacy behavior.
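
The threshold parameter can be pictured with a toy example that uses axis-aligned boxes in place of tissue polygons. This is a simplified sketch, not TRIDENT's implementation: the helper names are hypothetical, real masks are GeoDataFrame polygons rather than boxes, and the rule that a patch without any tissue is always dropped is an assumption.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two axis-aligned boxes (0 if disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def tissue_proportion(patch: Box, tissue: Box) -> float:
    """Fraction of the patch area that lies under the tissue region."""
    patch_area = (patch[2] - patch[0]) * (patch[3] - patch[1])
    return overlap_area(patch, tissue) / patch_area


def keep_patch(patch: Box, tissue: Box, threshold: float = 0.0) -> bool:
    """Keep a patch that touches tissue and meets the minimum proportion."""
    proportion = tissue_proportion(patch, tissue)
    return proportion > 0.0 and proportion >= threshold


tissue = (5.0, 0.0, 20.0, 10.0)
patch = (0.0, 0.0, 10.0, 10.0)
print(tissue_proportion(patch, tissue))  # 0.5
```

With half of the patch under tissue, the patch is kept at threshold=0.15 and dropped at threshold=0.6.
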

__init__(wsi: Any, patch_size: int, src_pixel_size: float | None = None, dst_pixel_size: float | None = None, src_mag: int | None = None, dst_mag: int | None = None, overlap: int = 0, mask: GeoDataFrame | None = None, coords_only: bool = False, custom_coords: Any | None = None, threshold: float = 0.0, pil: bool = False, scan_order: Literal['row-major', 'col-major'] = 'row-major')

Initialize patcher, compute number of (masked) rows, columns.

Parameters:
  • wsi (WSI) – WSI to patch.

  • patch_size (int) – Patch width/height in pixels on the slide after rescaling.

  • src_pixel_size (float, optional) – Pixel size in um/px of the slide before rescaling. Defaults to None. Deprecated: this argument will be removed in the next major version, and the value will default to wsi.mpp.

  • dst_pixel_size (float, optional) – Pixel size in um/px of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • src_mag (int, optional) – Level-0 magnification of the slide before rescaling. Defaults to None. Deprecated: this argument will be removed in the next major version, and the value will default to wsi.mag.

  • dst_mag (int, optional) – Target magnification of the slide after rescaling. Defaults to None. If both dst_mag and dst_pixel_size are not None, dst_pixel_size is used.

  • overlap (int, optional) – Overlap between patches in pixels. Defaults to 0.

  • mask (gpd.GeoDataFrame, optional) – GeoPandas dataframe of Polygons. Defaults to None.

  • coords_only (bool, optional) – Whether to extract only the coordinates instead of coordinates + tile. Defaults to False.

  • custom_coords (array-like, optional) – Custom coordinates to use. Defaults to None.

  • threshold (float, optional) – Minimum proportion of the patch under tissue to be kept. This argument is ignored if mask is None; passing threshold=0 will be faster. Defaults to 0.0.

  • pil (bool, optional) – Whether to get patches as PIL.Image (numpy array by default). Defaults to False.

  • scan_order ({"row-major", "col-major"}, optional) – Scan order used when generating the default coordinate grid (only when custom_coords is None). Defaults to “row-major”.

      - “row-major”: iterate row by row (Y -> X). Typically best for disk locality on tiled WSI formats.

      - “col-major”: iterate column by column (X -> Y). Legacy behavior.
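
The difference between the two scan orders is easiest to see on a tiny grid. The sketch below is standalone and illustrative only; grid_order is a hypothetical helper, not part of TRIDENT.

```python
from typing import List, Tuple


def grid_order(n_cols: int, n_rows: int, scan_order: str = "row-major") -> List[Tuple[int, int]]:
    """Enumerate (col, row) grid positions in the requested scan order."""
    if scan_order == "row-major":
        # Outer loop over rows (Y), inner loop over columns (X).
        return [(col, row) for row in range(n_rows) for col in range(n_cols)]
    if scan_order == "col-major":
        # Outer loop over columns (X), inner loop over rows (Y).
        return [(col, row) for col in range(n_cols) for row in range(n_rows)]
    raise ValueError(f"Unknown scan_order: {scan_order!r}")


print(grid_order(3, 2, "row-major"))  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
print(grid_order(3, 2, "col-major"))  # [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
```

Both orders visit the same positions. Only the sequence changes, which is why the choice mainly affects read locality rather than the resulting set of patches.
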

classmethod from_legacy_coords(wsi, patch_size, patch_level, custom_downsample, coords, coords_only=False, pil=False) WSIPatcher

Create a WSIPatcher from legacy coordinate parameters generated with CLAM or Fishing-Rod. The legacy format describes patches with custom_downsample and patch_level instead of the newer patch_size and dst_mag/dst_pixel_size parameters.

Parameters:
  • wsi (WSI) – WSI to patch.

  • patch_size (int) – The target patch size at the desired magnification.

  • patch_level (int) – The patch level used when reading the slide.

  • custom_downsample (int) – Any additional downsampling applied to the patches.

  • coords (np.array) – An array of patch coordinates.

  • coords_only (bool, optional) – Whether to extract only coordinates. Defaults to False.

  • pil (bool, optional) – Whether to get patches as PIL.Image. Defaults to False.

Returns:

WSIPatcher created from the given legacy coordinates.

Return type:

WSIPatcher

classmethod from_legacy_coords_file(wsi, coords_path, coords_only=False, pil=False) WSIPatcher

Create a WSIPatcher from a legacy coordinates file generated with CLAM or Fishing-Rod.

Parameters:
  • wsi (WSI) – WSI to patch.

  • coords_path (str) – Path to legacy coordinates stored as .h5.

  • coords_only (bool, optional) – Whether the legacy coordinates file only contains coordinates or if it also contains images. Defaults to False.

  • pil (bool, optional) – PIL argument passed to the WSIPatcher constructor. Defaults to False.

Returns:

WSIPatcher created from the given legacy coordinates.

Return type:

WSIPatcher

get_cols_rows() Tuple[int, int]

Get the number of columns and rows in the associated WSI.

Returns:

(nb_columns, nb_rows).

Return type:

Tuple[int, int]

get_tile(col: int, row: int) Tuple[ndarray, int, int]

Get tile at position (column, row).

Parameters:
  • col (int) – Column.

  • row (int) – Row.

Returns:

(tile, pixel x of top-left corner (before rescaling), pixel y of top-left corner (before rescaling)).

Return type:

Tuple[np.ndarray, int, int]

get_tile_xy(x: int, y: int) Tuple[ndarray, int, int]
visualize() Image

Overlay patch coordinates computed by the WSIPatcher onto a scaled thumbnail of the WSI. It creates a visualization of the patcher coordinates and returns it as an image.

Returns:

Patch visualization.

Return type:

Image.Image

Example

>>> img = wsi_patcher.visualize()
>>> img.save('test_vis.jpg')
class trident.WSIPatcherDataset(patcher, transform)

Bases: Dataset

Dataset built from a WSI patcher that reads tiles directly from a slide.

trident.deprecated(func)

This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.

trident.load_wsi(slide_path: str, reader_type: Literal['openslide', 'image', 'cucim', 'sdpc', 'omezarr', 'czi'] | None = None, lazy_init: bool = False, **kwargs) OpenSlideWSI | ImageWSI | CuCIMWSI | SDPCWSI | OMEZarrWSI | CZIWSI

Load a whole-slide image (WSI) using the appropriate backend.

By default, uses OpenSlideWSI for OpenSlide-supported file extensions, and ImageWSI for others. Users may override this behavior by explicitly specifying a reader using the reader_type argument.

Parameters:
  • slide_path (str) – Path to the whole-slide image.

  • reader_type ({'openslide', 'image', 'cucim', 'sdpc', 'omezarr', 'czi'}, optional) – Manually specify the WSI reader to use. If None (default), selection is automatic based on file extension.

  • lazy_init (bool, optional) – Whether to defer backend initialization. Defaults to False for API convenience: load_wsi(“slide.svs”) returns an initialized slide object by default.

  • **kwargs (dict) – Additional keyword arguments passed to the WSI reader constructor.

Returns:

An instance of the appropriate WSI reader.

Return type:

Union[OpenSlideWSI, ImageWSI, CuCIMWSI, SDPCWSI, OMEZarrWSI, CZIWSI]

Raises:

ValueError – If reader_type is ‘cucim’ but the cucim package is not installed, if reader_type is ‘sdpc’ but the sdpc package is not installed, or if an unknown reader type is specified.
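
The selection rule described for load_wsi (an explicit reader_type wins, otherwise the file extension decides) can be sketched without any WSI library. This is an illustration under assumptions: pick_reader is a hypothetical helper that returns a reader name instead of a reader object, and OPENSLIDE_EXTENSIONS is a small illustrative subset, not the list TRIDENT actually uses.

```python
from pathlib import Path
from typing import Optional

# Illustrative subset of OpenSlide-readable extensions (assumption, not TRIDENT's list).
OPENSLIDE_EXTENSIONS = {".svs", ".ndpi", ".mrxs", ".scn", ".tif", ".tiff"}
KNOWN_READERS = {"openslide", "image", "cucim", "sdpc", "omezarr", "czi"}


def pick_reader(slide_path: str, reader_type: Optional[str] = None) -> str:
    """Return the name of the reader that would handle slide_path."""
    if reader_type is not None:
        # An explicit choice overrides extension-based selection.
        if reader_type not in KNOWN_READERS:
            raise ValueError(f"Unknown reader type: {reader_type!r}")
        return reader_type
    suffix = Path(slide_path).suffix.lower()
    return "openslide" if suffix in OPENSLIDE_EXTENSIONS else "image"


print(pick_reader("./wsis/example.svs"))           # openslide
print(pick_reader("./wsis/example.png"))           # image
print(pick_reader("./wsis/example.svs", "cucim"))  # cucim
```
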

trident.visualize_heatmap(wsi: Any, scores: ndarray, coords: ndarray, patch_size_level0: int, vis_level: int | None = 2, cmap: str = 'coolwarm', normalize: bool = True, num_top_patches_to_save: int = -1, output_dir: str | None = 'output', vis_mag: int | None = None, overlay_only: bool = False, filename: str = 'heatmap.png') str

Generate a heatmap visualization overlaid on a whole slide image (WSI).

Parameters:
  • wsi (WSI) – Whole slide image object.

  • scores (np.ndarray) – Scores associated with each coordinate.

  • coords (np.ndarray) – Coordinates of patches at level 0.

  • patch_size_level0 (int) – Patch size at level 0.

  • vis_level (Optional[int]) – Visualization level.

  • cmap (str) – Colormap to use for the heatmap.

  • normalize (bool) – Whether to normalize the scores.

  • num_top_patches_to_save (int) – Number of high-score patches to save. If set to -1, do not save any. Defaults to -1.

  • output_dir (Optional[str]) – Directory to save heatmap and top-k patches.

  • vis_mag (Optional[int]) – Visualization magnification. This overwrites vis_level.

  • overlay_only (bool) – Whether to save the overlay only. If True, saves the overlay on top of a downscaled version of the WSI. Defaults to False.

  • filename (str) – File will be saved in output_dir/filename.

Returns:

Path to the saved heatmap image.

Return type:

str
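
Two preparation steps implied by the parameters above, normalizing scores and mapping level-0 coordinates to a coarser visualization level, can be sketched in plain Python. Both helpers are hypothetical and rest on assumptions: the documentation does not say which normalization visualize_heatmap applies, so min-max scaling is shown as one common choice, and the downsample factor of the visualization level is passed in explicitly.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores linearly to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def to_vis_level(
    coords: Sequence[Tuple[int, int]], patch_size_level0: int, downsample: float
) -> Tuple[List[Tuple[int, int]], int]:
    """Scale level-0 coordinates and patch size to a downsampled canvas."""
    scaled = [(int(x / downsample), int(y / downsample)) for x, y in coords]
    patch_size_vis = max(1, int(patch_size_level0 / downsample))
    return scaled, patch_size_vis


print(min_max_normalize([2.0, 4.0, 6.0]))                 # [0.0, 0.5, 1.0]
print(to_vis_level([(0, 0), (1024, 512)], 512, 16.0))     # ([(0, 0), (64, 32)], 32)
```

Each normalized score would then color a square of patch_size_vis pixels at its scaled coordinate on the thumbnail.
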

Segmentation Models

Semantic segmentation models for tissue vs. background detection and filtering.

class trident.segmentation_models.GrandQCArtifactSegmenter(**build_kwargs)

GrandQCArtifactSegmenter initialization.

__init__(**build_kwargs)

GrandQCArtifactSegmenter initialization.

forward(image: Tensor) Tensor

Custom forward pass.

class trident.segmentation_models.GrandQCSegmenter(**build_kwargs)

GrandQCSegmenter initialization.

__init__(**build_kwargs)

GrandQCSegmenter initialization.

forward(image: Tensor) Tensor

Custom forward pass.

class trident.segmentation_models.HESTSegmenter(**build_kwargs: Dict[str, Any])

HESTSegmenter initialization.

__init__(**build_kwargs: Dict[str, Any])

HESTSegmenter initialization.

forward(image: Tensor) Tensor

Can be overridden if the model requires a special forward pass.

class trident.segmentation_models.OtsuSegmenter(**build_kwargs)

Classical image-processing tissue segmenter based on two-pass Otsu thresholding.

Initialize Segmentation model wrapper.

Parameters:
  • freeze (bool, optional) – If True, the model’s parameters are frozen (i.e., not trainable) and the model is set to evaluation mode. Defaults to True.

  • confidence_thresh (float, optional) – Threshold for prediction confidence. Predictions below this threshold may be filtered out or ignored. Default is 0.5. Set to 0.4 to keep more tissue.

  • **build_kwargs (dict) – Additional keyword arguments passed to the internal _build method.

model

The constructed model.

Type:

torch.nn.Module

eval_transforms

Transformations to apply to input data during inference.

Type:

Callable

forward(image: Tensor) Tensor

Can be overridden if the model requires a special forward pass.

trident.segmentation_models.apply_otsu_thresholding(tile: ndarray) ndarray

Generate a binary mask by using Otsu thresholding.

Taken from https://github.com/TIO-IKIM/CellViT
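
For readers unfamiliar with the technique, Otsu's method chooses the gray level that maximizes the between-class variance of the two resulting pixel groups. The sketch below is a generic pure-Python version operating on a flat list of 8-bit gray values. It is not the CellViT-derived code used by TRIDENT, and otsu_threshold is a hypothetical name.

```python
from typing import List, Sequence


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t maximizing between-class variance (classes: <= t, > t)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue  # no pixels at or below t yet
        w_high = total - w_low
        if w_high == 0:
            break  # every pixel is at or below t
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between_var = w_low * w_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


pixels = [10] * 50 + [200] * 50      # toy bimodal "image"
t = otsu_threshold(pixels)
mask: List[bool] = [v > t for v in pixels]
print(t, sum(mask))  # 10 50
```

The threshold lands between the two modes, so the mask separates the dark and bright groups. Which group counts as tissue depends on the image and is left open here.
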

trident.segmentation_models.mask_rgb(rgb: ndarray, mask: ndarray) ndarray

Mask an RGB image.

Taken from https://github.com/TIO-IKIM/CellViT

trident.segmentation_models.segmentation_model_factory(model_name: str, confidence_thresh: float = 0.5, freeze: bool = True, **build_kwargs) SegmentationModel

Factory function to build a segmentation model by name.

Patch Encoders

Factory for loading patch-level encoder models.

Patch Encoder     Dim           Args                                                                Link
----------------  ------------  ------------------------------------------------------------------  -------------------------------------------
UNI               1024          --patch_encoder uni_v1 --patch_size 256 --mag 20                    MahmoodLab/UNI
UNI2-h            1536          --patch_encoder uni_v2 --patch_size 256 --mag 20                    MahmoodLab/UNI2-h
CONCH             512           --patch_encoder conch_v1 --patch_size 512 --mag 20                  MahmoodLab/CONCH
CONCHv1.5         768           --patch_encoder conch_v15 --patch_size 512 --mag 20                 MahmoodLab/conchv1_5
Virchow           2560          --patch_encoder virchow --patch_size 224 --mag 20                   paige-ai/Virchow
Virchow2          2560          --patch_encoder virchow2 --patch_size 224 --mag 20                  paige-ai/Virchow2
Phikon            768           --patch_encoder phikon --patch_size 224 --mag 20                    owkin/phikon
Phikon-v2         1024          --patch_encoder phikon_v2 --patch_size 224 --mag 20                 owkin/phikon-v2
KEEP              768           --patch_encoder keep --patch_size 256 --mag 20                      Astaxanthin/KEEP
Prov-Gigapath     1536          --patch_encoder gigapath --patch_size 256 --mag 20                  prov-gigapath
H-Optimus-0       1536          --patch_encoder hoptimus0 --patch_size 224 --mag 20                 bioptimus/H-optimus-0
H-Optimus-1       1536          --patch_encoder hoptimus1 --patch_size 224 --mag 20                 bioptimus/H-optimus-1
H0-mini           768/1536      --patch_encoder h0-mini --patch_size 224 --mag 20                   bioptimus/H0-mini
MUSK              1024          --patch_encoder musk --patch_size 384 --mag 20                      xiangjx/musk
Midnight-12k      3072          --patch_encoder midnight12k --patch_size 224 --mag 20               kaiko-ai/midnight
OpenMidnight      1536          --patch_encoder openmidnight --patch_size 224 --mag 20              SophontAI/OpenMidnight
GPFM              1024          --patch_encoder gpfm --patch_size 224 --mag 20                      majiabo/GPFM
GenBio-PathFM     4608          --patch_encoder genbio-pathfm --patch_size 224 --mag 20             genbio-ai/genbio-pathfm
Gemma 4           768/1152      --patch_encoder {gemma4-e4b, gemma4-26b} --patch_size 224 --mag 20  google/gemma-4-E4B / google/gemma-4-26B-A4B
Kaiko             384/768/1024  --patch_encoder kaiko-vit* --patch_size 256 --mag 20                Kaiko Collection
Lunit             384           --patch_encoder lunit-vits8 --patch_size 224 --mag 20               1aurent/lunit
Hibou             1024          --patch_encoder hibou_l --patch_size 224 --mag 20                   histai/hibou-L
CTransPath-CHIEF  768           --patch_encoder ctranspath --patch_size 256 --mag 10
ResNet50          1024          --patch_encoder resnet50 --patch_size 256 --mag 20

class trident.patch_encoder_models.CTransPathInferenceEncoder(**build_kwargs)

CTransPath initialization.

__init__(**build_kwargs)

CTransPath initialization.

class trident.patch_encoder_models.Conchv15InferenceEncoder(**build_kwargs)

CONCHv1.5 initialization.

__init__(**build_kwargs)

CONCHv1.5 initialization.

class trident.patch_encoder_models.Conchv1InferenceEncoder(**build_kwargs)

CONCH initialization.

__init__(**build_kwargs)

CONCH initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.CustomInferenceEncoder(enc_name: str, model: Module, transforms: Callable, precision: dtype)

Initialize a CustomInferenceEncoder from user-defined components.

This class is used when the model, transforms, and precision are pre-instantiated externally and should be injected directly into the encoder wrapper.

Parameters:
  • enc_name (str) – A unique name or identifier for the encoder (used for registry or logging).

  • model (torch.nn.Module) – A PyTorch model instance to use for inference.

  • transforms (Callable) – A callable (e.g., torchvision or timm transform) to preprocess input images for evaluation.

  • precision (torch.dtype) – The precision to use for inference (e.g., torch.float32, torch.float16).

__init__(enc_name: str, model: Module, transforms: Callable, precision: dtype)

Initialize a CustomInferenceEncoder from user-defined components.

This class is used when the model, transforms, and precision are pre-instantiated externally and should be injected directly into the encoder wrapper.

Parameters:
  • enc_name (str) – A unique name or identifier for the encoder (used for registry or logging).

  • model (torch.nn.Module) – A PyTorch model instance to use for inference.

  • transforms (Callable) – A callable (e.g., torchvision or timm transform) to preprocess input images for evaluation.

  • precision (torch.dtype) – The precision to use for inference (e.g., torch.float32, torch.float16).

class trident.patch_encoder_models.GPFMInferenceEncoder(**build_kwargs)

GPFM initialization.

__init__(**build_kwargs)

GPFM initialization.

class trident.patch_encoder_models.Gemma426BInferenceEncoder(**build_kwargs)

Gemma 4 26B-A4B vision tower (hidden=1152).

Initialize BasePatchEncoder.

Parameters:
  • weights_path (Optional[str]) – Optional path to local model weights. If None, the model is loaded from the model registry or downloaded from Hugging Face Hub.

  • **build_kwargs (dict) – Additional keyword arguments passed to the _build() method to customize model creation.

Attributes:
  • enc_name (Optional[str]) – Name of the encoder architecture (set during _build()).

  • weights_path (Optional[str]) – Path to local model weights (if provided).

  • model (nn.Module) – The instantiated encoder model.

  • eval_transforms (Callable) – Evaluation-time preprocessing transforms.

  • precision (torch.dtype) – Precision used for inference.

HF_REPO = 'google/gemma-4-26B-A4B'
VARIANT = '26b'

class trident.patch_encoder_models.Gemma4E4BInferenceEncoder(**build_kwargs)

Gemma 4 E4B vision tower (hidden=768).

Initialize BasePatchEncoder.

Parameters:
  • weights_path (Optional[str]) – Optional path to local model weights. If None, the model is loaded from the model registry or downloaded from Hugging Face Hub.

  • **build_kwargs (dict) – Additional keyword arguments passed to the _build() method to customize model creation.

Attributes:
  • enc_name (Optional[str]) – Name of the encoder architecture (set during _build()).

  • weights_path (Optional[str]) – Path to local model weights (if provided).

  • model (nn.Module) – The instantiated encoder model.

  • eval_transforms (Callable) – Evaluation-time preprocessing transforms.

  • precision (torch.dtype) – Precision used for inference.

HF_REPO = 'google/gemma-4-E4B'
VARIANT = 'e4b'

class trident.patch_encoder_models.GenBioPathFMInferenceEncoder(**build_kwargs)

GenBio-PathFM initialization.

__init__(**build_kwargs)

GenBio-PathFM initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.GigaPathInferenceEncoder(**build_kwargs)

GigaPath initialization.

__init__(**build_kwargs)

GigaPath initialization.

class trident.patch_encoder_models.H0MiniInferenceEncoder(**build_kwargs)

H0-mini initialization.

__init__(**build_kwargs)

H0-mini initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.HOptimus0InferenceEncoder(**build_kwargs)

H-Optimus0 initialization.

__init__(**build_kwargs)

H-Optimus0 initialization.

class trident.patch_encoder_models.HOptimus1InferenceEncoder(**build_kwargs)

H-Optimus1 initialization.

__init__(**build_kwargs)

H-Optimus1 initialization.

class trident.patch_encoder_models.HibouLInferenceEncoder(**build_kwargs)

Hibou initialization.

__init__(**build_kwargs)

Hibou initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

forward_features(x)

class trident.patch_encoder_models.KaikoB16InferenceEncoder(**build_kwargs)

Kaiko Base 16 initialization.

HF_HUB_ID = 'vit_base_patch16_224'
IMG_SIZE = 224
MODEL_NAME = 'vitb16'

__init__(**build_kwargs)

Kaiko Base 16 initialization.

class trident.patch_encoder_models.KaikoB8InferenceEncoder(**build_kwargs)

Kaiko Base 8 initialization.

HF_HUB_ID = 'vit_base_patch8_224'
IMG_SIZE = 224
MODEL_NAME = 'vitb8'

__init__(**build_kwargs)

Kaiko Base 8 initialization.

class trident.patch_encoder_models.KaikoL14InferenceEncoder(**build_kwargs)

Kaiko Large 14 initialization.

HF_HUB_ID = 'vit_large_patch14_reg4_dinov2'
IMG_SIZE = 518
MODEL_NAME = 'vitl14'

__init__(**build_kwargs)

Kaiko Large 14 initialization.

class trident.patch_encoder_models.KaikoS16InferenceEncoder(**build_kwargs)

Kaiko Small 16 initialization.

HF_HUB_ID = 'vit_small_patch16_224'
IMG_SIZE = 224
MODEL_NAME = 'vits16'

__init__(**build_kwargs)

Kaiko Small 16 initialization.

class trident.patch_encoder_models.KaikoS8InferenceEncoder(**build_kwargs)

Kaiko Small 8 initialization.

HF_HUB_ID = 'vit_small_patch8_224'
IMG_SIZE = 224
MODEL_NAME = 'vits8'

__init__(**build_kwargs)

Kaiko Small 8 initialization.

class trident.patch_encoder_models.KeepInferenceEncoder(**build_kwargs)

KEEP initialization.

__init__(**build_kwargs)

KEEP initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.LunitS8InferenceEncoder(**build_kwargs)

Lunit initialization.

__init__(**build_kwargs)

Lunit initialization.

class trident.patch_encoder_models.Midnight12kInferenceEncoder(**build_kwargs)

Midnight 12-k initialization by Kaiko.

__init__(**build_kwargs)

Midnight 12-k initialization by Kaiko.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.MuskInferenceEncoder(**build_kwargs)

MUSK initialization.

__init__(**build_kwargs)

MUSK initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.OpenMidnightInferenceEncoder(**build_kwargs)

OpenMidnight initialization.

__init__(**build_kwargs)

OpenMidnight initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.PhikonInferenceEncoder(**build_kwargs)

Phikon initialization.

__init__(**build_kwargs)

Phikon initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

forward_features(x)

class trident.patch_encoder_models.Phikonv2InferenceEncoder(**build_kwargs)

Phikonv2 initialization.

__init__(**build_kwargs)

Phikonv2 initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.ResNet50InferenceEncoder(**build_kwargs)

ResNet50-ImageNet initialization.

__init__(**build_kwargs)

ResNet50-ImageNet initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

forward_features(x)

class trident.patch_encoder_models.UNIInferenceEncoder(**build_kwargs)

UNI initialization.

__init__(**build_kwargs)

UNI initialization.

class trident.patch_encoder_models.UNIv2InferenceEncoder(**build_kwargs)

UNIv2 initialization.

__init__(**build_kwargs)

UNIv2 initialization.

class trident.patch_encoder_models.Virchow2InferenceEncoder(**build_kwargs)

Virchow 2 initialization.

__init__(**build_kwargs)

Virchow 2 initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

class trident.patch_encoder_models.VirchowInferenceEncoder(**build_kwargs)

Virchow initialization.

__init__(**build_kwargs)

Virchow initialization.

forward(x)

Can be overridden if the model requires a special forward pass.

trident.patch_encoder_models.encoder_factory(model_name: str, **kwargs) → Module

Instantiate a patch encoder model by name.

This factory function returns a pre-configured encoder model class based on the provided model_name. Each encoder is designed for extracting representations from image patches using specific backbones or pretraining strategies.

Parameters:
  • model_name (str) – Name of the encoder to instantiate. Must be one of the following:

      • "conch_v1"
      • "conch_v15"
      • "uni_v1"
      • "uni_v2"
      • "ctranspath"
      • "phikon"
      • "phikon_v2"
      • "resnet50"
      • "keep"
      • "gigapath"
      • "virchow"
      • "virchow2"
      • "hoptimus0"
      • "hoptimus1"
      • "h0-mini"
      • "musk"
      • "openmidnight"
      • "gpfm"
      • "hibou_l"
      • "kaiko-vitb8"
      • "kaiko-vitb16"
      • "kaiko-vits8"
      • "kaiko-vits16"
      • "kaiko-vitl14"
      • "lunit-vits8"
      • "genbio-pathfm"
      • "gemma4-e4b"
      • "gemma4-26b"

  • **kwargs (dict) – Optional keyword arguments passed directly to the encoder constructor. These may include:

      • weights_path – Path to a local checkpoint (optional)
      • normalize – Whether to normalize output embeddings (default: False)
      • with_proj – Whether to apply the projection head (default: True)
      • any other model-specific configuration parameters

Returns:

An instance of the specified encoder model.

Return type:

torch.nn.Module

Raises:

ValueError – If model_name is not among the recognized encoder names.

Example

>>> from trident.patch_encoder_models import encoder_factory
>>>
>>> # Load CONCHv1.5 with its default weights
>>> encoder = encoder_factory("conch_v15")
>>>
>>> # Load UNIv2 from a local checkpoint
>>> encoder = encoder_factory("uni_v2", weights_path="custom_weights.pth")
>>>
>>> # Load CTransPath (a hybrid CNN and Swin Transformer encoder)
>>> encoder = encoder_factory("ctranspath")
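The lookup-and-raise behaviour described above (a name selects a class, keyword arguments are forwarded to it, and an unknown name raises ValueError) can be sketched as a small registry. This is a minimal illustration, not TRIDENT's implementation: ToyEncoderA, ToyEncoderB, _REGISTRY, and toy_encoder_factory are all hypothetical names.

```python
class ToyEncoderA:
    """Stand-in for a real encoder class (hypothetical)."""

    def __init__(self, **build_kwargs):
        # Keep the keyword arguments so callers can inspect what was forwarded.
        self.build_kwargs = build_kwargs


class ToyEncoderB(ToyEncoderA):
    """Second stand-in encoder class (hypothetical)."""


# Hypothetical registry mapping encoder names to classes.
_REGISTRY = {
    "toy_a": ToyEncoderA,
    "toy_b": ToyEncoderB,
}


def toy_encoder_factory(model_name, **kwargs):
    """Instantiate the class registered under model_name, forwarding kwargs."""
    try:
        encoder_cls = _REGISTRY[model_name]
    except KeyError:
        valid = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"Unknown encoder name {model_name!r}; expected one of: {valid}"
        ) from None
    return encoder_cls(**kwargs)


encoder = toy_encoder_factory("toy_a", weights_path="custom_weights.pth")
```

Raising ValueError with the list of valid names makes a typo in model_name easy to diagnose at the call site.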

Slide Encoders

Factory for slide-level encoder models.

Slide Encoder  Patch Encoder  Args                                                 Link
Threads        conch_v15      --slide_encoder threads --patch_size 512 --mag 20    (Coming Soon!)
Titan          conch_v15      --slide_encoder titan --patch_size 512 --mag 20      MahmoodLab/TITAN
PRISM          virchow        --slide_encoder prism --patch_size 224 --mag 20      paige-ai/Prism
CHIEF          ctranspath     --slide_encoder chief --patch_size 256 --mag 10      CHIEF
GigaPath       gigapath       --slide_encoder gigapath --patch_size 256 --mag 20   prov-gigapath
Madeleine      conch_v1       --slide_encoder madeleine --patch_size 256 --mag 10  MahmoodLab/madeleine
Feather        conch_v15      --slide_encoder feather --patch_size 512 --mag 20    MahmoodLab/feather
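Each slide encoder in the table is paired with one patch encoder, patch size, and magnification. When driving runs from a script, it may help to keep those pairings in one place. The sketch below copies the values from the table; SLIDE_ENCODER_SETTINGS, required_patch_encoder, and slide_encoder_args are hypothetical names, not part of TRIDENT.

```python
# Pairings copied from the table:
# slide encoder -> (patch encoder, patch size in pixels, magnification).
SLIDE_ENCODER_SETTINGS = {
    "threads": ("conch_v15", 512, 20),
    "titan": ("conch_v15", 512, 20),
    "prism": ("virchow", 224, 20),
    "chief": ("ctranspath", 256, 10),
    "gigapath": ("gigapath", 256, 20),
    "madeleine": ("conch_v1", 256, 10),
    "feather": ("conch_v15", 512, 20),
}


def required_patch_encoder(slide_encoder):
    """Return the patch encoder the table lists for a slide encoder."""
    return SLIDE_ENCODER_SETTINGS[slide_encoder][0]


def slide_encoder_args(slide_encoder):
    """Return the CLI flags the table lists for a slide encoder."""
    _, patch_size, mag = SLIDE_ENCODER_SETTINGS[slide_encoder]
    return [
        "--slide_encoder", slide_encoder,
        "--patch_size", str(patch_size),
        "--mag", str(mag),
    ]


titan_flags = " ".join(slide_encoder_args("titan"))
```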

class trident.slide_encoder_models.ABMILSlideEncoder(**build_kwargs: Dict[str, Any])

ABMIL initialization.

__init__(**build_kwargs: Dict[str, Any])

ABMIL initialization.

forward(batch, device='cuda', return_raw_attention=False)

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.CHIEFSlideEncoder(**build_kwargs)

CHIEF initialization.

__init__(**build_kwargs)

CHIEF initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.FeatherSlideEncoder(**build_kwargs)

Feather initialization.

__init__(**build_kwargs)

Feather initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.FeatherUni2SlideEncoder(**build_kwargs)

FeatherUni2SlideEncoder initialization.

__init__(**build_kwargs)

FeatherUni2SlideEncoder initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.GigaPathSlideEncoder(**build_kwargs)

GigaPath initialization.

__init__(**build_kwargs)

GigaPath initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.MadeleineSlideEncoder(**build_kwargs)

Madeleine initialization.

__init__(**build_kwargs)

Madeleine initialization.

forward(x, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.MeanSlideEncoder(**build_kwargs)

Mean pooling initialization.

__init__(**build_kwargs)

Mean pooling initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.
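Mean pooling, as the name suggests, collapses the patch embeddings of a slide into a single vector by averaging each feature dimension. The pure-Python sketch below illustrates only that operation. It is not TRIDENT's implementation, which presumably works on tensors, and mean_pool is a hypothetical name.

```python
def mean_pool(patch_embeddings):
    """Average equal-length patch embeddings into one slide embedding."""
    if not patch_embeddings:
        raise ValueError("need at least one patch embedding")
    dim = len(patch_embeddings[0])
    if any(len(vec) != dim for vec in patch_embeddings):
        raise ValueError("all patch embeddings must have the same length")
    n_patches = len(patch_embeddings)
    # Average each feature dimension across all patches.
    return [
        sum(vec[i] for vec in patch_embeddings) / n_patches
        for i in range(dim)
    ]


# Two patches with 2-d embeddings; each dimension is averaged independently.
slide_embedding = mean_pool([[1.0, 2.0], [3.0, 6.0]])  # -> [2.0, 4.0]
```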

class trident.slide_encoder_models.PRISMSlideEncoder(**build_kwargs)

PRISM initialization.

__init__(**build_kwargs)

PRISM initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.ThreadsSlideEncoder(**build_kwargs)

Threads initialization.

__init__(**build_kwargs)

Threads initialization.

forward(batch, device='cuda', return_raw_attention=False)

Can be overridden if the model requires a special forward pass.

class trident.slide_encoder_models.TitanSlideEncoder(**build_kwargs)

Titan initialization.

__init__(**build_kwargs)

Titan initialization.

forward(batch, device='cuda')

Can be overridden if the model requires a special forward pass.

trident.slide_encoder_models.encoder_factory(model_name: str, pretrained: bool = True, freeze: bool = True, **kwargs: Dict[str, Any]) → Module

Build a slide encoder model.

Parameters:
  • model_name (str) – Name of the model to build.

  • pretrained (bool) – Whether to load pretrained weights.

  • freeze (bool) – Whether to freeze the weights of the model.

  • **kwargs (dict) – Additional arguments to pass to the model constructor.

Returns:

The slide encoder model.

Return type:

torch.nn.Module
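The freeze flag is described only as freezing the weights of the model. In PyTorch, freezing usually means disabling gradient tracking on every parameter and switching the module to evaluation mode. Whether encoder_factory does exactly this is an assumption. The helper freeze_module below is hypothetical and relies only on the parameters() and eval() methods that torch.nn.Module provides.

```python
def freeze_module(module):
    """Disable gradients for all parameters and switch to eval mode.

    Works on any object exposing parameters() and eval(), such as a
    torch.nn.Module. Hypothetical helper, not part of TRIDENT.
    """
    for param in module.parameters():
        # Frozen parameters are skipped by autograd and by optimizers
        # that filter on requires_grad.
        param.requires_grad = False
    # eval() turns off training-only behaviour such as dropout.
    module.eval()
    return module
```

A frozen encoder is typically used as a fixed feature extractor; pass freeze=False to the factory if the slide encoder is meant to be fine-tuned.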