albumentations.augmentations.crops.special
Specialized crop transforms.
Members
CropNonEmptyMaskIfExists class
CropNonEmptyMaskIfExists(
height: int,
width: int,
ignore_values: list[int] | None,
ignore_channels: list[int] | None,
p: float = 1.0
)
Crop a region containing non-empty (non-zero) mask pixels; if the mask is empty or not provided, fall back to a random crop.

This is particularly useful for segmentation tasks, where you want to focus on the regions of interest defined by the mask.

Args:
    height (int): Vertical size of the crop in pixels. Must be > 0.
    width (int): Horizontal size of the crop in pixels. Must be > 0.
    ignore_values (list of int, optional): Values to ignore in the mask; `0` values are always ignored. For example, if the background value is 5, set `ignore_values=[5]` to ignore it. Default: None.
    ignore_channels (list of int, optional): Channels to ignore in the mask. For example, if the background is the first channel, set `ignore_channels=[0]` to ignore it. Default: None.
    p (float): Probability of applying the transform. Default: 1.0.

Targets:
    image, mask, bboxes, keypoints, volume, mask3d

Image types:
    uint8, float32

Supported bboxes:
    hbb, obb

Note:
    - If a mask is provided, the transform tries to crop an area containing non-zero (or non-ignored) pixels.
    - If no suitable area is found in the mask, or no mask is provided, it performs a random crop.
    - The crop size (height, width) must not exceed the original image dimensions.
    - Bounding boxes and keypoints are cropped along with the image and mask.

Raises:
    ValueError: If the specified crop size is larger than the input image dimensions.

Examples:
    >>> import numpy as np
    >>> import albumentations as A
    >>>
    >>> # Minimal example: image and mask only
    >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
    >>> mask = np.zeros((100, 100), dtype=np.uint8)
    >>> mask[25:75, 25:75] = 1  # Create a non-empty region in the mask
    >>> transform = A.Compose([
    ...     A.CropNonEmptyMaskIfExists(height=50, width=50, p=1.0),
    ... ])
    >>> transformed = transform(image=image, mask=mask)
    >>> transformed_image = transformed['image']
    >>> transformed_mask = transformed['mask']
    >>> # The resulting crop will likely include part of the non-zero region in the mask
    >>>
    >>> # Full example: image, mask, bounding boxes, and keypoints
    >>> bboxes = np.array([
    ...     [20, 20, 60, 60],  # Box overlapping the non-empty region
    ...     [30, 30, 70, 70],  # Box fully inside the non-empty region
    ... ], dtype=np.float32)
    >>> bbox_labels = ['cat', 'dog']
    >>>
    >>> keypoints = np.array([
    ...     [40, 40],  # Inside the non-empty region
    ...     [60, 60],  # Inside the non-empty region
    ...     [90, 90],  # Outside the non-empty region
    ... ], dtype=np.float32)
    >>> keypoint_labels = ['eye', 'nose', 'ear']
    >>>
    >>> # Define a transform that crops around the non-empty mask region
    >>> transform = A.Compose([
    ...     A.CropNonEmptyMaskIfExists(
    ...         height=50,
    ...         width=50,
    ...         ignore_values=None,
    ...         ignore_channels=None,
    ...         p=1.0
    ...     ),
    ... ], bbox_params=A.BboxParams(
    ...     format='pascal_voc',
    ...     label_fields=['bbox_labels']
    ... ), keypoint_params=A.KeypointParams(
    ...     format='xy',
    ...     label_fields=['keypoint_labels']
    ... ))
    >>>
    >>> # Apply the transform
    >>> transformed = transform(
    ...     image=image,
    ...     mask=mask,
    ...     bboxes=bboxes,
    ...     bbox_labels=bbox_labels,
    ...     keypoints=keypoints,
    ...     keypoint_labels=keypoint_labels
    ... )
    >>>
    >>> # Get the transformed data
    >>> transformed_image = transformed['image']  # 50x50 crop overlapping the mask region
    >>> transformed_mask = transformed['mask']  # 50x50 mask showing part of the non-empty region
    >>> transformed_bboxes = transformed['bboxes']  # Bounding boxes adjusted to new coordinates
    >>> transformed_bbox_labels = transformed['bbox_labels']  # Labels preserved for visible boxes
    >>> transformed_keypoints = transformed['keypoints']  # Keypoints adjusted to new coordinates
    >>> transformed_keypoint_labels = transformed['keypoint_labels']  # Labels for visible keypoints
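The behaviour described in the Note (crop around labeled pixels, otherwise crop at random) can be illustrated without the library. The following is a minimal sketch in plain Python, not the albumentations implementation: `pick_crop_window` is a hypothetical helper, the mask is a nested list rather than a NumPy array, and the window placement rule (anchor on one random non-zero pixel, then clamp) is an assumption chosen for illustration.

```python
import random


def pick_crop_window(mask, crop_h, crop_w, rng):
    """Return (x_min, y_min, x_max, y_max) for a crop_h x crop_w window.

    `mask` is a 2D list of ints. Hypothetical helper for illustration only.
    """
    mask_h, mask_w = len(mask), len(mask[0])
    if crop_h > mask_h or crop_w > mask_w:
        raise ValueError("crop size exceeds mask dimensions")

    # Coordinates of every non-zero (labeled) pixel.
    non_zero = [
        (y, x)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value != 0
    ]

    if non_zero:
        # Anchor on a random labeled pixel and place the window around it.
        y, x = rng.choice(non_zero)
        x_min = x - rng.randint(0, crop_w - 1)
        y_min = y - rng.randint(0, crop_h - 1)
        # Clamp so the window stays inside the mask; the anchor pixel
        # remains inside the window after clamping.
        x_min = min(max(x_min, 0), mask_w - crop_w)
        y_min = min(max(y_min, 0), mask_h - crop_h)
    else:
        # Empty mask: fall back to a uniformly random window.
        x_min = rng.randint(0, mask_w - crop_w)
        y_min = rng.randint(0, mask_h - crop_h)

    return x_min, y_min, x_min + crop_w, y_min + crop_h


# Same layout as the docstring example: a 50x50 labeled square in a 100x100 mask.
mask = [[0] * 100 for _ in range(100)]
for row_idx in range(25, 75):
    for col_idx in range(25, 75):
        mask[row_idx][col_idx] = 1

rng = random.Random(0)
window = pick_crop_window(mask, 50, 50, rng)
empty_window = pick_crop_window([[0] * 100 for _ in range(100)], 50, 50, rng)
```

Anchoring on a single labeled pixel guarantees that the window contains at least one non-zero value, but it does not centre the crop on the labeled region.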
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| height | int | - | - |
| width | int | - | - |
| ignore_values | list[int] \| None | - | - |
| ignore_channels | list[int] \| None | - | - |
| p | float | 1.0 | - |
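To make the meaning of `ignore_values` and `ignore_channels` concrete, here is a small dependency-free sketch of how a mask could be reduced to "pixels that still count as non-empty". The function `non_empty_pixels` is hypothetical and only mirrors the parameter descriptions above; the library's own preprocessing may differ in detail.

```python
def non_empty_pixels(mask, ignore_values=None, ignore_channels=None):
    """Return a 2D list of booleans marking pixels that still count as non-empty.

    `mask` is H x W (ints) or H x W x C (a list of channel values per pixel).
    Hypothetical helper for illustration only.
    """
    ignored = set(ignore_values or [])
    skipped = set(ignore_channels or [])
    result = []
    for row in mask:
        out_row = []
        for pixel in row:
            if isinstance(pixel, (list, tuple)):
                # Multi-channel mask: drop the ignored channels first.
                values = [v for c, v in enumerate(pixel) if c not in skipped]
            else:
                # Single-channel mask: ignore_channels does not apply.
                values = [pixel]
            # Zero is always treated as empty; ignored values are too.
            out_row.append(any(v != 0 and v not in ignored for v in values))
        result.append(out_row)
    return result


# Background labeled 5, object labeled 2.
flat = non_empty_pixels([[5, 5, 2], [5, 0, 2]], ignore_values=[5])
print(flat)  # [[False, False, True], [False, False, True]]

# Two-channel mask where channel 0 is background.
multi = non_empty_pixels([[[1, 0], [1, 3]]], ignore_channels=[0])
print(multi)  # [[False, True]]
```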
RandomCropNearBBox class
RandomCropNearBBox(
max_part_shift: tuple[float, float] = (0, 0.3),
cropping_bbox_key: str = "cropping_bbox",
p: float = 1.0
)
Crop around a reference bounding box (`cropping_bbox_key`) with a random shift (`max_part_shift`). Use when you have a region of interest to augment.

Args:
    max_part_shift (tuple[float, float]): Range (min, max) for shift in the `height` and `width` dimensions, relative to the `cropping_bbox` dimensions. Default: (0, 0.3).
    cropping_bbox_key (str): Additional target key for the cropping box. Default: `cropping_bbox`.
    p (float): Probability of applying the transform. Default: 1.0.

Targets:
    image, mask, bboxes, keypoints, volume, mask3d

Image types:
    uint8, float32

Supported bboxes:
    hbb, obb

Examples:
    >>> from albumentations import BboxParams, Compose, RandomCropNearBBox
    >>> # `image` is an (H, W, C) array and `bboxes` are in pascal_voc format
    >>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_bbox_key='test_bbox')],
    ...               bbox_params=BboxParams("pascal_voc"))
    >>> result = aug(image=image, bboxes=bboxes, test_bbox=[0, 5, 10, 20])
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_part_shift | tuple[float, float] | (0, 0.3) | - |
| cropping_bbox_key | str | "cropping_bbox" | - |
| p | float | 1.0 | - |
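The idea of "crop around a reference box with a random shift" can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: `crop_near_bbox` is a hypothetical helper, and it assumes that the two entries of `max_part_shift` are maximum shift fractions for the height and the width respectively, which the docstring above does not spell out.

```python
import random


def crop_near_bbox(bbox, image_h, image_w, max_part_shift, rng):
    """Jitter each edge of `bbox` and clip the result to the image.

    `bbox` is (x_min, y_min, x_max, y_max) in pixels.
    ASSUMPTION: max_part_shift = (height_fraction, width_fraction), with both
    fractions below 0.5 so the jittered box cannot collapse.
    """
    x_min, y_min, x_max, y_max = bbox
    # Largest allowed shift per axis, in pixels.
    h_shift = round((y_max - y_min) * max_part_shift[0])
    w_shift = round((x_max - x_min) * max_part_shift[1])

    # Move every edge independently, inward or outward.
    x_min -= rng.randint(-w_shift, w_shift)
    x_max += rng.randint(-w_shift, w_shift)
    y_min -= rng.randint(-h_shift, h_shift)
    y_max += rng.randint(-h_shift, h_shift)

    # Keep the crop inside the image.
    x_min = min(max(x_min, 0), image_w)
    x_max = min(max(x_max, 0), image_w)
    y_min = min(max(y_min, 0), image_h)
    y_max = min(max(y_max, 0), image_h)
    return x_min, y_min, x_max, y_max


rng = random.Random(42)
crop = crop_near_bbox((20, 30, 60, 80), image_h=100, image_w=100,
                      max_part_shift=(0.1, 0.3), rng=rng)
```

Because the shift is proportional to the box size, a large reference box yields larger absolute jitter than a small one.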
RandomCropFromBorders class
RandomCropFromBorders(
crop_left: float = 0.1,
crop_right: float = 0.1,
crop_top: float = 0.1,
crop_bottom: float = 0.1,
p: float = 1.0
)
Randomly remove a strip from each border (crop_left/right/top/bottom). The input is not resized, so the output is smaller. Good for trimming variable borders or simulating a slight zoom.

This transform randomly crops parts of the input (image, mask, bounding boxes, or keypoints) from each of its borders. The amount of cropping is specified as a fraction of the input's dimensions, independently for each side.

Args:
    crop_left (float): The maximum fraction of width to crop from the left side. Must be in the range [0.0, 1.0]. Default: 0.1
    crop_right (float): The maximum fraction of width to crop from the right side. Must be in the range [0.0, 1.0]. Default: 0.1
    crop_top (float): The maximum fraction of height to crop from the top. Must be in the range [0.0, 1.0]. Default: 0.1
    crop_bottom (float): The maximum fraction of height to crop from the bottom. Must be in the range [0.0, 1.0]. Default: 0.1
    p (float): Probability of applying the transform. Default: 1.0

Targets:
    image, mask, bboxes, keypoints, volume, mask3d

Image types:
    uint8, float32

Supported bboxes:
    hbb, obb

Note:
    - The actual amount of cropping for each side is randomly chosen between 0 and the specified maximum for each application of the transform.
    - The sum of crop_left and crop_right must not exceed 1.0, and the sum of crop_top and crop_bottom must not exceed 1.0. Otherwise, a ValueError is raised.
    - This transform does not resize the input after cropping, so the output dimensions are smaller than the input dimensions.
    - Bounding boxes that end up fully outside the cropped area are removed.
    - Keypoints that end up outside the cropped area are removed.

Examples:
    >>> import numpy as np
    >>> import albumentations as A
    >>>
    >>> # Prepare sample data
    >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
    >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
    >>> bboxes = np.array([[10, 10, 50, 50], [40, 40, 80, 80]], dtype=np.float32)
    >>> bbox_labels = [1, 2]
    >>> keypoints = np.array([[20, 30], [60, 70]], dtype=np.float32)
    >>> keypoint_labels = [0, 1]
    >>>
    >>> # Define transform with crop fractions for each border
    >>> transform = A.Compose([
    ...     A.RandomCropFromBorders(
    ...         crop_left=0.1,     # Max 10% crop from left
    ...         crop_right=0.2,    # Max 20% crop from right
    ...         crop_top=0.15,     # Max 15% crop from top
    ...         crop_bottom=0.05,  # Max 5% crop from bottom
    ...         p=1.0
    ...     ),
    ... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
    ...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
    >>>
    >>> # Apply transform
    >>> result = transform(
    ...     image=image,
    ...     mask=mask,
    ...     bboxes=bboxes,
    ...     bbox_labels=bbox_labels,
    ...     keypoints=keypoints,
    ...     keypoint_labels=keypoint_labels
    ... )
    >>>
    >>> # Access transformed data
    >>> transformed_image = result['image']  # Reduced size image with borders cropped
    >>> transformed_mask = result['mask']  # Reduced size mask with borders cropped
    >>> transformed_bboxes = result['bboxes']  # Bounding boxes adjusted to new dimensions
    >>> transformed_bbox_labels = result['bbox_labels']  # Bounding box labels after crop
    >>> transformed_keypoints = result['keypoints']  # Keypoints adjusted to new dimensions
    >>> transformed_keypoint_labels = result['keypoint_labels']  # Keypoint labels after crop
    >>>
    >>> # The resulting output shapes will be smaller, with dimensions reduced by
    >>> # the random crop amounts from each side (within the specified maximums)
    >>> print(f"Original image shape: {image.shape}")  # (100, 100, 3)
    >>> print(f"Transformed image shape: {transformed_image.shape}")  # e.g., (85, 75, 3)
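The constraints in the Note (each fraction in [0.0, 1.0], opposite sides summing to at most 1.0, random amount up to the maximum) can be sketched as a small sampling routine. This is an illustrative plain-Python version, not the library's implementation; `sample_border_crop` and its exact rounding are assumptions.

```python
import random


def sample_border_crop(height, width, crop_left, crop_right,
                       crop_top, crop_bottom, rng):
    """Return crop coordinates (x_min, y_min, x_max, y_max).

    Hypothetical helper: each border loses a random strip between 0 and the
    given maximum fraction of the corresponding dimension.
    """
    for name, value in (("crop_left", crop_left), ("crop_right", crop_right),
                        ("crop_top", crop_top), ("crop_bottom", crop_bottom)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
    if crop_left + crop_right > 1.0:
        raise ValueError("crop_left + crop_right must not exceed 1.0")
    if crop_top + crop_bottom > 1.0:
        raise ValueError("crop_top + crop_bottom must not exceed 1.0")

    # Left/top edges: anywhere from 0 up to the maximum strip width,
    # capped so at least one pixel column/row can remain.
    x_min = rng.randint(0, min(int(crop_left * width), width - 1))
    y_min = rng.randint(0, min(int(crop_top * height), height - 1))
    # Right/bottom edges: anywhere from the innermost allowed position up to
    # the image border, always leaving a crop at least one pixel wide.
    x_max = rng.randint(max(x_min + 1, width - int(crop_right * width)), width)
    y_max = rng.randint(max(y_min + 1, height - int(crop_bottom * height)), height)
    return x_min, y_min, x_max, y_max


rng = random.Random(7)
coords = sample_border_crop(100, 100, crop_left=0.1, crop_right=0.2,
                            crop_top=0.15, crop_bottom=0.05, rng=rng)
out_w = coords[2] - coords[0]  # between 70 and 100 for these fractions
out_h = coords[3] - coords[1]  # between 80 and 100 for these fractions
```

With the fractions from the docstring example, the output is at least 70 pixels wide and 80 pixels tall, which is consistent with the `(85, 75, 3)` shape shown there.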
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| crop_left | float | 0.1 | - |
| crop_right | float | 0.1 | - |
| crop_top | float | 0.1 | - |
| crop_bottom | float | 0.1 | - |
| p | float | 1.0 | - |
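The Note for this transform states that bounding boxes fully outside the crop and keypoints outside the crop are removed. The sketch below shows one way such an adjustment could work for pascal_voc boxes and xy keypoints once crop coordinates are known. `shift_annotations` is a hypothetical helper; in the library, which boxes survive also depends on `BboxParams` options such as `min_area` and `min_visibility`, and the label fields are filtered alongside.

```python
def shift_annotations(crop, bboxes, keypoints):
    """Translate boxes and keypoints into crop coordinates and drop outsiders.

    `crop` and each box are (x_min, y_min, x_max, y_max); keypoints are (x, y).
    Hypothetical helper for illustration only.
    """
    x_min, y_min, x_max, y_max = crop
    crop_w, crop_h = x_max - x_min, y_max - y_min

    kept_boxes = []
    for bx_min, by_min, bx_max, by_max in bboxes:
        # Translate into the crop's frame, then clip to the crop extent.
        nx_min = min(max(bx_min - x_min, 0), crop_w)
        ny_min = min(max(by_min - y_min, 0), crop_h)
        nx_max = min(max(bx_max - x_min, 0), crop_w)
        ny_max = min(max(by_max - y_min, 0), crop_h)
        # A box with no remaining area was fully outside the crop.
        if nx_max > nx_min and ny_max > ny_min:
            kept_boxes.append((nx_min, ny_min, nx_max, ny_max))

    kept_points = [
        (x - x_min, y - y_min)
        for x, y in keypoints
        if x_min <= x < x_max and y_min <= y < y_max
    ]
    return kept_boxes, kept_points


boxes, points = shift_annotations(
    crop=(30, 30, 80, 80),
    bboxes=[(10, 10, 50, 50), (40, 40, 80, 80), (0, 0, 20, 20)],
    keypoints=[(20, 30), (60, 70)],
)
print(boxes)   # [(0, 0, 20, 20), (10, 10, 50, 50)]
print(points)  # [(30, 40)]
```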