albumentations.augmentations.geometric.resize
Transforms that rescale images and their targets (masks, bounding boxes, keypoints): fixed resize, longest-side and smallest-side rescaling, random scaling, and YOLO-style letterboxing (equivalent to LongestMaxSize + PadIfNeeded).
Members
- class LetterBox
- class LongestMaxSize
- class RandomScale
- class Resize
- class SmallestMaxSize
class LetterBox
LetterBox(
size: tuple[int, int],
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
fill: tuple[float, ...] | float = 114,
fill_mask: tuple[float, ...] | float = 0,
position: 'center' | 'top_left' | 'top_right' | 'bottom_left' | 'bottom_right' | 'random' = 'center',
p: float = 1.0
)
Scale image to fit a target canvas preserving aspect ratio, then pad to the exact canvas size: YOLO letterbox, equivalent to LongestMaxSize + PadIfNeeded. The image is downscaled or upscaled so its longest side fits the target, then constant-color padding fills the remaining area. All targets (masks, bboxes, keypoints) are adjusted accordingly.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| size | tuple[int, int] | - | Target `(height, width)` of the output canvas. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation method used when resizing the image. Default: `cv2.INTER_LINEAR`. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation method used when resizing masks. Default: `cv2.INTER_NEAREST`. |
| fill | tuple[float, ...] \| float | 114 | Constant pixel value for image padding. Default: `114`. |
| fill_mask | tuple[float, ...] \| float | 0 | Constant pixel value for mask padding. Default: `0`. |
| position | str | center | Where to place the resized image on the canvas. One of: 'center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'. Default: `"center"`. |
| p | float | 1.0 | Probability of applying the transform. Default: `1.0`. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (480, 640), dtype=np.uint8)
>>> bboxes = np.array([[100, 80, 300, 200]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[200, 150]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
... A.LetterBox(size=(640, 640), fill=114, fill_mask=0, p=1.0)
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels,
... )
>>> result['image'].shape
(640, 640, 3)
Notes
- The output size is always exactly `(height, width)`.
- Images smaller than the target are upscaled; images larger are downscaled.
- Bounding boxes and keypoints are adjusted for both the resize and padding steps.
- `fill=114` is the YOLO convention for letterbox padding.
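The letterbox geometry can be sketched in plain Python. This is a hypothetical helper for illustration, not the library implementation; the library may round pixel sizes slightly differently:

```python
def letterbox_geometry(h, w, target_h, target_w):
    """Return (new_h, new_w, pad_top, pad_left) for a center-positioned letterbox."""
    # Fit the longest side into the target canvas while keeping the aspect ratio.
    scale = min(target_h / h, target_w / w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Remaining canvas area is split evenly (center position).
    pad_top = (target_h - new_h) // 2
    pad_left = (target_w - new_w) // 2
    return new_h, new_w, pad_top, pad_left

# A 480x640 image on a 640x640 canvas keeps its size and gets 80 px of top padding.
print(letterbox_geometry(480, 640, 640, 640))  # (480, 640, 80, 0)
```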
class LongestMaxSize
LongestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that its longest side equals max_size, or so its sides meet the max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | int \| Sequence[int] \| None | None | Maximum size of the longest side after the transformation. When a list or tuple is given, the max size is randomly selected from the provided values. Default: None. |
| max_size_hw | tuple[int \| None, int \| None] \| None | None | Maximum (height, width) constraints. (height, width): both dimensions must fit within these bounds; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation method. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
Examples
>>> import albumentations as A
>>> import cv2
>>> # Using max_size
>>> transform1 = A.LongestMaxSize(max_size=1024, area_for_downscale="image")
>>> # Input image (1500, 800) -> Output (1024, 546)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024), area_for_downscale="image_mask")
>>> # Input (1500, 800) -> Output (800, 427)
>>> # Input (800, 1500) -> Output (546, 1024)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))
>>> # Input (1500, 800) -> Output (800, 427)
>>>
>>> # Common use case with padding
>>> transform4 = A.Compose([
... A.LongestMaxSize(max_size=1024, area_for_downscale="image"),
... A.PadIfNeeded(min_height=1024, min_width=1024),
... ])
Notes
- This transform scales images based on their longest side:
  * If the longest side is **smaller** than max_size, the image is **upscaled** (scale > 1.0).
  * If the longest side is **equal** to max_size, the image is **not resized** (scale = 1.0).
  * If the longest side is **larger** than max_size, the image is **downscaled** (scale < 1.0).
- This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
- For non-square images, both sides are scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA is used for downscaling, providing better quality.
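The output shape for the max_size case follows directly from the longest side. A minimal sketch (hypothetical helper, not the library code; the library may round at the pixel level differently):

```python
def longest_max_size_shape(h, w, max_size):
    """Shape after scaling so the longest side equals max_size (aspect ratio kept)."""
    scale = max_size / max(h, w)
    return round(h * scale), round(w * scale)

# Matches the (1500, 800) -> (1024, 546) example above.
print(longest_max_size_shape(1500, 800, 1024))  # (1024, 546)
```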
class RandomScale
RandomScale(
scale_limit: tuple[float, float] | float = (-0.1, 0.1),
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 0.5
)
Resize the input by a random scale factor sampled from scale_limit. The output size differs from the input size; all targets (masks, bboxes, keypoints) are scaled together. Useful for scale augmentation without cropping.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| scale_limit | tuple[float, float] \| float | (-0.1, 0.1) | Scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that scale_limit is biased by 1: for a tuple (low, high), sampling is done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 0.5 | probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize scaling effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Apply RandomScale transform with comprehensive parameters
>>> transform = A.Compose([
... A.RandomScale(
... scale_limit=(-0.3, 0.5), # Scale between 0.7x and 1.5x
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... area_for_downscale="image", # Use INTER_AREA for image downscaling
... p=1.0 # Always apply
... )
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> scaled_image = result['image'] # Dimensions will be between 70-150 pixels
>>> scaled_mask = result['mask'] # Mask scaled proportionally to image
>>> scaled_bboxes = result['bboxes'] # Bounding boxes adjusted to new dimensions
>>> scaled_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> scaled_keypoints = result['keypoints'] # Keypoints adjusted to new dimensions
>>> scaled_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # The image dimensions will vary based on the randomly sampled scale factor
>>> # With scale_limit=(-0.3, 0.5), dimensions could be anywhere from 70% to 150% of original
Notes
- The output image size differs from the input image size.
- A single scale factor is sampled and applied to both height and width, so the aspect ratio is preserved.
- Bounding box coordinates are scaled accordingly.
- Keypoint coordinates are scaled accordingly.
- When area_for_downscale is set, INTER_AREA interpolation is used automatically for downscaling (scale < 1.0), which provides better quality for size reduction.
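The "biased by 1" convention for scale_limit can be sketched as follows (a hypothetical illustration of the sampling rule described in the parameter table, not the library's internal code):

```python
import random

def sample_scale(scale_limit, rng=random):
    """Sample the multiplicative scale factor implied by scale_limit.

    A single float s expands to the range (-s, s); either way the drawn
    value is biased by 1, so (-0.1, 0.1) yields factors in [0.9, 1.1].
    """
    if isinstance(scale_limit, (int, float)):
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    return 1.0 + rng.uniform(low, high)

# With scale_limit=(-0.3, 0.5) the factor always lands in [0.7, 1.5].
factor = sample_scale((-0.3, 0.5))
assert 0.7 <= factor <= 1.5
```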
class Resize
Resize(
height: int,
width: int,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Resize the input to the given height and width. All targets (image, mask, bboxes, keypoints) are resized accordingly.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| height | int | - | desired height of the output. |
| width | int | - | desired width of the output. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize resize effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Resize all data to 224x224 (common input size for many CNNs)
>>> transform = A.Compose([
... A.Resize(
... height=224,
... width=224,
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... area_for_downscale="image", # Use INTER_AREA when downscaling images
... p=1.0
... )
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> resized_image = result['image'] # Shape will be (224, 224, 3)
>>> resized_mask = result['mask'] # Shape will be (224, 224)
>>> resized_bboxes = result['bboxes'] # Bounding boxes scaled to new dimensions
>>> resized_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> resized_keypoints = result['keypoints'] # Keypoints scaled to new dimensions
>>> resized_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # Note: When resizing from 100x100 to 224x224:
>>> # - The red square will be scaled from (25-75) to approximately (56-168)
>>> # - The keypoint at (50, 50) will move to approximately (112, 112)
>>> # - All spatial relationships are preserved but coordinates are scaled
class SmallestMaxSize
SmallestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that its smallest side equals max_size, or so its sides meet the max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | int \| Sequence[int] \| None | None | Maximum size of the smallest side of the image after the transformation. When a list is given, the max size is randomly selected from the values in the list. Default: None. |
| max_size_hw | tuple[int \| None, int \| None] \| None | None | Maximum (height, width) constraints. (height, width): both dimensions must be at least these values; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | Probability of applying the transform. Default: 1. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> # Using max_size
>>> transform1 = A.SmallestMaxSize(max_size=120, area_for_downscale="image")
>>> # Input image (100, 150) -> Output (120, 180)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200), area_for_downscale="image_mask")
>>> # Input (80, 160) -> Output (100, 200)
>>> # Input (160, 80) -> Output (400, 200)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))
>>> # Input (80, 160) -> Output (100, 200)
Notes
- This transform scales images based on their smallest side:
  * If the smallest side is **smaller** than max_size, the image is **upscaled** (scale > 1.0).
  * If the smallest side is **equal** to max_size, the image is **not resized** (scale = 1.0).
  * If the smallest side is **larger** than max_size, the image is **downscaled** (scale < 1.0).
- This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
- For non-square images, both sides are scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA is used for downscaling, providing better quality.
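The max_size case mirrors LongestMaxSize with min in place of max. A minimal sketch (hypothetical helper, not the library code; pixel-level rounding may differ):

```python
def smallest_max_size_shape(h, w, max_size):
    """Shape after scaling so the smallest side equals max_size (aspect ratio kept)."""
    scale = max_size / min(h, w)
    return round(h * scale), round(w * scale)

# Matches the (100, 150) -> (120, 180) example above.
print(smallest_max_size_shape(100, 150, 120))  # (120, 180)
```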