albumentations.augmentations.crops.base


Base crop transform classes and shared crop schemas.

CropSizeError class

CropSizeError()

Raised when requested crop dimensions are incompatible with image size, required padding, or generated crop coordinate constraints. Used by crop transforms to fail early before generating invalid crop coordinates.
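The fail-early pattern this exception supports can be sketched in isolation. The `validate_crop` helper and the locally defined exception class below are illustrative stand-ins, not the library's actual validation code:

```python
class CropSizeError(Exception):
    """Stand-in for albumentations' CropSizeError, defined here for illustration."""


def validate_crop(image_shape, crop_height, crop_width, pad_if_needed=False):
    """Raise CropSizeError before any crop coordinates are generated."""
    image_height, image_width = image_shape[:2]
    if not pad_if_needed and (crop_height > image_height or crop_width > image_width):
        raise CropSizeError(
            f"Crop size ({crop_height}, {crop_width}) exceeds image size "
            f"({image_height}, {image_width}) and padding is disabled."
        )


validate_crop((100, 100, 3), 80, 80)  # fine: crop fits inside the image
try:
    validate_crop((100, 100, 3), 120, 80)  # too tall -> raises CropSizeError
except CropSizeError as e:
    print(type(e).__name__)
```

Raising before coordinate generation keeps the error message in terms of the user's requested sizes rather than derived, possibly negative, crop coordinates.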

BaseCrop class

BaseCrop()

Abstract base class for crop-only transforms. It provides the foundation for all cropping transformations, handling images, masks, bounding boxes, keypoints, and volumes while keeping their spatial relationships intact.

Subclasses must implement `get_params_dependent_on_data` to determine crop coordinates based on transform-specific logic. This method must return a dictionary containing at least a 'crop_coords' key with a tuple value (x_min, y_min, x_max, y_max). All targets are then cropped consistently from those coordinates.

Args:
    p (float): Probability of applying the transform. Default: 1.0.

Targets:
    image, mask, bboxes, keypoints, volume, mask3d

Image types:
    uint8, float32

Note:
    This class is not meant to be used directly. Instead, use or create derived transforms that implement the specific cropping behavior required.

Examples:
    >>> import numpy as np
    >>> import albumentations as A
    >>> from albumentations.augmentations.crops.transforms import BaseCrop
    >>>
    >>> # Example of a custom crop transform that inherits from BaseCrop
    >>> class CustomCenterCrop(BaseCrop):
    ...     '''A simple custom center crop with configurable size'''
    ...     def __init__(self, crop_height, crop_width, p=1.0):
    ...         super().__init__(p=p)
    ...         self.crop_height = crop_height
    ...         self.crop_width = crop_width
    ...
    ...     def get_params_dependent_on_data(self, params, data):
    ...         '''Calculate crop coordinates based on the center of the image'''
    ...         image_height, image_width = params["shape"][:2]
    ...
    ...         # Calculate center crop coordinates
    ...         x_min = max(0, (image_width - self.crop_width) // 2)
    ...         y_min = max(0, (image_height - self.crop_height) // 2)
    ...         x_max = min(image_width, x_min + self.crop_width)
    ...         y_max = min(image_height, y_min + self.crop_height)
    ...
    ...         return {"crop_coords": (x_min, y_min, x_max, y_max)}
    >>>
    >>> # Prepare sample data
    >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
    >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
    >>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
    >>> bbox_labels = [1, 2]
    >>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
    >>> keypoint_labels = [0, 1]
    >>>
    >>> # Use the custom transform in a pipeline
    >>> transform = A.Compose(
    ...     [CustomCenterCrop(crop_height=80, crop_width=80)],
    ...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
    ...     keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels'])
    ... )
    >>>
    >>> # Apply the transform to data
    >>> result = transform(
    ...     image=image,
    ...     mask=mask,
    ...     bboxes=bboxes,
    ...     bbox_labels=bbox_labels,
    ...     keypoints=keypoints,
    ...     keypoint_labels=keypoint_labels
    ... )
    >>>
    >>> # Get the transformed data
    >>> transformed_image = result['image']                      # 80x80 crop
    >>> transformed_mask = result['mask']                        # 80x80 crop
    >>> transformed_bboxes = result['bboxes']                    # Boxes adjusted to the cropped area
    >>> transformed_bbox_labels = result['bbox_labels']          # Labels for boxes that remain after cropping
    >>> transformed_keypoints = result['keypoints']              # Keypoints adjusted to the cropped area
    >>> transformed_keypoint_labels = result['keypoint_labels']  # Labels for keypoints that remain after cropping
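The `crop_coords` contract can also be seen in isolation with plain NumPy: one `(x_min, y_min, x_max, y_max)` tuple drives both the image slice and the keypoint shift, which is what keeps targets spatially consistent. This is a simplified sketch, not the library's internal implementation:

```python
import numpy as np


def apply_crop(image, keypoints, crop_coords):
    """Crop the image and shift keypoints by the same (x_min, y_min) origin."""
    x_min, y_min, x_max, y_max = crop_coords
    cropped = image[y_min:y_max, x_min:x_max]
    # Keypoints are (x, y); subtract the crop origin and drop points
    # that fall outside the cropped region.
    shifted = keypoints - np.array([x_min, y_min])
    inside = (
        (shifted[:, 0] >= 0) & (shifted[:, 0] < x_max - x_min)
        & (shifted[:, 1] >= 0) & (shifted[:, 1] < y_max - y_min)
    )
    return cropped, shifted[inside]


image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
keypoints = np.array([[20.0, 30.0], [95.0, 95.0]])
cropped, kps = apply_crop(image, keypoints, (10, 10, 90, 90))
print(cropped.shape)  # (80, 80, 3)
print(kps)            # only the first point survives, shifted to (10, 20)
```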

BaseCropAndPad class

BaseCropAndPad(
    pad_if_needed: bool,
    border_mode: Literal[cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101],
    fill: tuple[float, ...] | float,
    fill_mask: tuple[float, ...] | float,
    pad_position: Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'],
    p: float
)

Abstract base class for transforms that combine cropping and padding (e.g. cropping to a fixed size). It extends BaseCrop with pad_if_needed, border_mode, fill, fill_mask, and pad_position, and is the foundation for transforms that may need to both crop parts of the input and add padding, such as when converting inputs to a specific target size. The class handles applying these operations to different data types (images, masks, bounding boxes, keypoints) while maintaining their spatial relationships.

Subclasses must implement `get_params_dependent_on_data` to determine crop coordinates and padding parameters based on transform-specific logic.

Args:
    pad_if_needed (bool): Whether to pad the input if the crop size exceeds input dimensions.
    border_mode (int): OpenCV border mode used for padding.
    fill (tuple[float, ...] | float): Value used to fill the padded area when border_mode is cv2.BORDER_CONSTANT. For multi-channel images, this can be a tuple with one value per channel.
    fill_mask (tuple[float, ...] | float): Value used to fill the padded area in masks.
    pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of the padding when pad_if_needed is True.
    p (float): Probability of applying the transform. Default: 1.0.

Targets:
    image, mask, bboxes, keypoints, volume, mask3d

Image types:
    uint8, float32

Note:
    This class is not meant to be used directly. Instead, use or create derived transforms that implement the specific cropping and padding behavior required.

Examples:
    >>> import numpy as np
    >>> import cv2
    >>> import albumentations as A
    >>> from albumentations.augmentations.crops.transforms import BaseCropAndPad
    >>>
    >>> # Example of a custom transform that inherits from BaseCropAndPad
    >>> # This transform crops to a fixed size, padding if needed to maintain dimensions
    >>> class CustomFixedSizeCrop(BaseCropAndPad):
    ...     '''A custom fixed-size crop that pads if needed to maintain output size'''
    ...     def __init__(
    ...         self,
    ...         height=224,
    ...         width=224,
    ...         offset_x=0,  # Offset for crop position
    ...         offset_y=0,  # Offset for crop position
    ...         pad_if_needed=True,
    ...         border_mode=cv2.BORDER_CONSTANT,
    ...         fill=0,
    ...         fill_mask=0,
    ...         pad_position="center",
    ...         p=1.0,
    ...     ):
    ...         super().__init__(
    ...             pad_if_needed=pad_if_needed,
    ...             border_mode=border_mode,
    ...             fill=fill,
    ...             fill_mask=fill_mask,
    ...             pad_position=pad_position,
    ...             p=p,
    ...         )
    ...         self.height = height
    ...         self.width = width
    ...         self.offset_x = offset_x
    ...         self.offset_y = offset_y
    ...
    ...     def get_params_dependent_on_data(self, params, data):
    ...         '''Calculate crop coordinates and padding if needed'''
    ...         image_shape = params["shape"][:2]
    ...         image_height, image_width = image_shape
    ...
    ...         # Calculate crop coordinates with offsets
    ...         x_min = self.offset_x
    ...         y_min = self.offset_y
    ...         x_max = min(x_min + self.width, image_width)
    ...         y_max = min(y_min + self.height, image_height)
    ...
    ...         # Get padding params if needed
    ...         pad_params = self._get_pad_params(
    ...             image_shape,
    ...             (self.height, self.width)
    ...         ) if self.pad_if_needed else None
    ...
    ...         return {
    ...             "crop_coords": (x_min, y_min, x_max, y_max),
    ...             "pad_params": pad_params,
    ...         }
    >>>
    >>> # Prepare sample data
    >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
    >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
    >>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
    >>> bbox_labels = [1, 2]
    >>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
    >>> keypoint_labels = [0, 1]
    >>>
    >>> # Use the custom transform in a pipeline
    >>> # This will create a 224x224 crop with padding as needed
    >>> transform = A.Compose(
    ...     [CustomFixedSizeCrop(
    ...         height=224,
    ...         width=224,
    ...         offset_x=20,
    ...         offset_y=10,
    ...         fill=127,  # Gray color for padding
    ...         fill_mask=0
    ...     )],
    ...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
    ...     keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels'])
    ... )
    >>>
    >>> # Apply the transform to data
    >>> result = transform(
    ...     image=image,
    ...     mask=mask,
    ...     bboxes=bboxes,
    ...     bbox_labels=bbox_labels,
    ...     keypoints=keypoints,
    ...     keypoint_labels=keypoint_labels
    ... )
    >>>
    >>> # Get the transformed data
    >>> transformed_image = result['image']                      # 224x224 with padding
    >>> transformed_mask = result['mask']                        # 224x224 with padding
    >>> transformed_bboxes = result['bboxes']                    # Boxes adjusted to the cropped and padded area
    >>> transformed_bbox_labels = result['bbox_labels']          # Box labels after cropping
    >>> transformed_keypoints = result['keypoints']              # Keypoints adjusted to the cropped and padded area
    >>> transformed_keypoint_labels = result['keypoint_labels']  # Keypoint labels after cropping

Parameters

pad_if_needed (bool): Whether to pad the input if the crop size exceeds input dimensions.

border_mode (Literal[cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101]): OpenCV border mode used for padding.

fill (tuple[float, ...] | float): Value used to fill the padded area when border_mode is cv2.BORDER_CONSTANT. For multi-channel images, this can be a tuple with one value per channel.

fill_mask (tuple[float, ...] | float): Value used to fill the padded area in masks.

pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of the padding when pad_if_needed is True.

p (float): Probability of applying the transform. Default: 1.0.
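The padding step itself is easy to see in isolation. The sketch below shows only the unambiguous 'center' case, splitting the required padding evenly on each axis with `np.pad` (which mimics cv2.BORDER_CONSTANT with a scalar fill); it is a simplified illustration, not the library's implementation, and the other pad_position values and border modes are omitted:

```python
import numpy as np


def pad_to_size_center(image, target_h, target_w, fill=0):
    """Pad an H x W x C image to (target_h, target_w), splitting padding evenly."""
    h, w = image.shape[:2]
    pad_h = max(0, target_h - h)
    pad_w = max(0, target_w - w)
    # 'center': half the padding before the content, the remainder after.
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    return np.pad(
        image,
        ((top, bottom), (left, right), (0, 0)),
        mode="constant",
        constant_values=fill,
    )


image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
padded = pad_to_size_center(image, 224, 224, fill=127)
print(padded.shape)  # (224, 224, 3)
```

The other pad_position values change only how `pad_h` and `pad_w` are split between the before/after sides; the bounding box and keypoint shift then reuses the same (top, left) offsets.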

BaseRandomSizedCropInitSchema class

BaseRandomSizedCropInitSchema()

Shared validation schema for random-sized crop transforms that sample a source crop window before resizing it to the final user-requested output dimensions. It keeps the common size validation in one place for both random-sized crop variants.
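The kind of size validation such a schema centralizes can be sketched with a plain function; the actual class is a Pydantic model whose exact fields are not shown on this page, so `validate_size` below is a hypothetical illustration of the checks, not the schema itself:

```python
def validate_size(size):
    """Validate a (height, width) output size the way a shared schema might."""
    if len(size) != 2:
        raise ValueError(f"size must be a (height, width) pair, got {size!r}")
    height, width = size
    if height <= 0 or width <= 0:
        raise ValueError(f"size dimensions must be positive, got {size!r}")
    return int(height), int(width)


print(validate_size((256, 512)))  # (256, 512)
```

Centralizing these checks means both random-sized crop variants reject malformed sizes identically instead of duplicating (and drifting) the validation logic.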