albumentations.augmentations.geometric.resize


LongestMaxSize class

LongestMaxSize(
    max_size: int | Sequence | None,
    max_size_hw: tuple[int | None, int | None] | None,
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
    area_for_downscale: 'image' | 'image_mask' | None,
    p: float = 1
)

Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters

max_size
    Type: int | Sequence | None
    Default: None
    Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided.

max_size_hw
    Type: tuple[int | None, int | None] | None
    Default: None
    Maximum (height, width) constraints. If specified, max_size must be None. Supports:
    - (height, width): Both dimensions must fit within these bounds
    - (height, None): Only height is constrained, width scales proportionally
    - (None, width): Only width is constrained, height scales proportionally

interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 1 (cv2.INTER_LINEAR)
    Interpolation method.

mask_interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 0 (cv2.INTER_NEAREST)
    Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

area_for_downscale
    Type: 'image' | 'image_mask' | None
    Default: None
    Controls automatic use of INTER_AREA interpolation for downscaling. Options:
    - None: No automatic interpolation selection, always use the specified interpolation method
    - "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
    - "image_mask": Use INTER_AREA when downscaling both images and masks

p
    Type: float
    Default: 1
    Probability of applying the transform.
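
The three area_for_downscale options can be read as a small decision rule. The sketch below is only an illustration of that rule as described in the table, not the library's implementation: pick_interpolation is a hypothetical helper, the flag constants are the integer values of the corresponding cv2 flags, and treating "scale < 1.0" as the downscaling test is an assumption.

```python
# Integer values of the OpenCV interpolation flags, so the sketch runs without cv2.
INTER_NEAREST = 0
INTER_LINEAR = 1
INTER_AREA = 3


def pick_interpolation(target, requested, scale, area_for_downscale=None):
    """Hypothetical helper: choose the interpolation flag for one target.

    target: "image" or "mask"
    requested: the flag passed as interpolation / mask_interpolation
    scale: resize factor; values below 1.0 count as downscaling (assumption)
    """
    is_downscale = scale < 1.0
    if not is_downscale or area_for_downscale is None:
        # Upscaling, or automatic selection disabled: keep the requested flag.
        return requested
    if target == "image":
        # Both "image" and "image_mask" switch images to INTER_AREA.
        return INTER_AREA
    if target == "mask" and area_for_downscale == "image_mask":
        return INTER_AREA
    # area_for_downscale="image" leaves masks on their own flag.
    return requested
```

With area_for_downscale="image", a mask keeps mask_interpolation even when the image is downscaled, which avoids blending label values.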

Examples

>>> import albumentations as A
>>> import cv2
>>> # Using max_size
>>> transform1 = A.LongestMaxSize(max_size=1024, area_for_downscale="image")
>>> # Input image (1500, 800) -> Output (1024, 546)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024), area_for_downscale="image_mask")
>>> # Input (1500, 800) -> Output (800, 427)
>>> # Input (800, 1500) -> Output (546, 1024)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))
>>> # Input (1500, 800) -> Output (800, 427)
>>>
>>> # Common use case with padding
>>> transform4 = A.Compose([
...     A.LongestMaxSize(max_size=1024, area_for_downscale="image"),
...     A.PadIfNeeded(min_height=1024, min_width=1024),
... ])
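
To make the padding use case concrete, the arithmetic below works out how many pixels would have to be added to reach 1024x1024 after the longest side has been scaled to 1024. It is a sketch only: center_padding is a hypothetical helper, and splitting the missing pixels evenly between the two sides is an assumption about the padding position, not a statement about how PadIfNeeded is implemented.

```python
def center_padding(size, min_size):
    """Hypothetical helper: split the missing pixels between both sides."""
    missing = max(min_size - size, 0)
    before = missing // 2
    after = missing - before
    return before, after


# A (1500, 800) input becomes (1024, 546) after LongestMaxSize(max_size=1024).
height, width = 1024, 546
pad_top, pad_bottom = center_padding(height, 1024)
pad_left, pad_right = center_padding(width, 1024)
padded_shape = (pad_top + height + pad_bottom, pad_left + width + pad_right)
```

Here the height needs no padding and the width is missing 478 pixels, so the padded result is square.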

Notes

- This transform scales images based on their longest side:
  * If the longest side is **smaller** than max_size: the image will be **upscaled** (scale > 1.0)
  * If the longest side is **equal** to max_size: the image will **not be resized** (scale = 1.0)
  * If the longest side is **larger** than max_size: the image will be **downscaled** (scale < 1.0)
- This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
- For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA will be used for downscaling, providing better quality.
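
The output shapes quoted in the examples can be reproduced with plain arithmetic. The function below is a minimal sketch under stated assumptions: longest_max_size_shape is a hypothetical name, and rounding to the nearest integer is assumed because it matches the documented outputs. It is not the library's code.

```python
def longest_max_size_shape(height, width, max_size=None, max_size_hw=None):
    """Hypothetical helper: output (height, width) for a longest-side rescale."""
    if (max_size is None) == (max_size_hw is None):
        raise ValueError("Specify exactly one of max_size or max_size_hw")
    if max_size is not None:
        # The longest side ends up equal to max_size.
        scale = max_size / max(height, width)
    else:
        max_h, max_w = max_size_hw
        ratios = []
        if max_h is not None:
            ratios.append(max_h / height)
        if max_w is not None:
            ratios.append(max_w / width)
        if not ratios:
            raise ValueError("max_size_hw needs at least one non-None value")
        # The smaller ratio keeps every constrained side within its bound.
        scale = min(ratios)
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(height * scale), round(width * scale)
```

For instance, longest_max_size_shape(1500, 800, max_size=1024) gives (1024, 546), and a scale below 1.0 corresponds to the downscaling case in the notes.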

RandomScale class

RandomScale(
    scale_limit: tuple[float, float] | float = (-0.1, 0.1),
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
    area_for_downscale: 'image' | 'image_mask' | None,
    p: float = 0.5
)

Randomly resize the input. Output image size is different from the input image size.

Parameters

scale_limit
    Type: tuple[float, float] | float
    Default: (-0.1, 0.1)
    Scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).

interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 1 (cv2.INTER_LINEAR)
    Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

mask_interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 0 (cv2.INTER_NEAREST)
    Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

area_for_downscale
    Type: 'image' | 'image_mask' | None
    Default: None
    Controls automatic use of INTER_AREA interpolation for downscaling. Options:
    - None: No automatic interpolation selection, always use the specified interpolation method
    - "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
    - "image_mask": Use INTER_AREA when downscaling both images and masks

p
    Type: float
    Default: 0.5
    Probability of applying the transform.
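
The "biased by 1" rule for scale_limit is easiest to see as code. The sketch below only mirrors the description above: scale_range and sample_scale are hypothetical helpers, and uniform sampling over the range is an assumption.

```python
import random


def scale_range(scale_limit):
    """Hypothetical helper: turn scale_limit into the (low, high) scale range."""
    if isinstance(scale_limit, (int, float)):
        # A single value s is read as the symmetric range (-s, s).
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    # Bias by 1 so that 0 means "keep the original size".
    return 1 + low, 1 + high


def sample_scale(scale_limit, rng=None):
    """Draw one scale factor; uniform sampling is an assumption of this sketch."""
    rng = rng or random.Random()
    low, high = scale_range(scale_limit)
    return rng.uniform(low, high)
```

With scale_limit=(-0.3, 0.5) the range is roughly 0.7 to 1.5, and with scale_limit=0.1 it is roughly 0.9 to 1.1.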

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize scaling effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1)  # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1)  # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1  # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]])  # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]])  # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Apply RandomScale transform with comprehensive parameters
>>> transform = A.Compose([
...     A.RandomScale(
...         scale_limit=(-0.3, 0.5),     # Scale between 0.7x and 1.5x
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA for image downscaling
...         p=1.0                         # Always apply
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> scaled_image = result['image']        # Dimensions will be between 70-150 pixels
>>> scaled_mask = result['mask']          # Mask scaled proportionally to image
>>> scaled_bboxes = result['bboxes']      # Bounding boxes adjusted to new dimensions
>>> scaled_bbox_labels = result['bbox_labels']  # Labels remain unchanged
>>> scaled_keypoints = result['keypoints']      # Keypoints adjusted to new dimensions
>>> scaled_keypoint_labels = result['keypoint_labels']  # Labels remain unchanged
>>>
>>> # The image dimensions will vary based on the randomly sampled scale factor
>>> # With scale_limit=(-0.3, 0.5), dimensions could be anywhere from 70% to 150% of original

Notes

- The output image size is different from the input image size.
- Scale factor is sampled independently per image side (width and height).
- Bounding box coordinates are scaled accordingly.
- Keypoint coordinates are scaled accordingly.
- When area_for_downscale is set, INTER_AREA interpolation will be used automatically for downscaling (scale < 1.0), which provides better quality for size reduction.

Resize class

Resize(
    height: int,
    width: int,
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
    area_for_downscale: 'image' | 'image_mask' | None,
    p: float = 1
)

Resize the input to the given height and width.

Parameters

height
    Type: int
    Default: -
    Desired height of the output.

width
    Type: int
    Default: -
    Desired width of the output.

interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 1 (cv2.INTER_LINEAR)
    Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

mask_interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 0 (cv2.INTER_NEAREST)
    Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

area_for_downscale
    Type: 'image' | 'image_mask' | None
    Default: None
    Controls automatic use of INTER_AREA interpolation for downscaling. Options:
    - None: No automatic interpolation selection, always use the specified interpolation method
    - "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
    - "image_mask": Use INTER_AREA when downscaling both images and masks

p
    Type: float
    Default: 1
    Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize resize effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1)  # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1)  # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1  # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]])  # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]])  # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Resize all data to 224x224 (common input size for many CNNs)
>>> transform = A.Compose([
...     A.Resize(
...         height=224,
...         width=224,
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA when downscaling images
...         p=1.0
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> resized_image = result['image']        # Shape will be (224, 224, 3)
>>> resized_mask = result['mask']          # Shape will be (224, 224)
>>> resized_bboxes = result['bboxes']      # Bounding boxes scaled to new dimensions
>>> resized_bbox_labels = result['bbox_labels']  # Labels remain unchanged
>>> resized_keypoints = result['keypoints']      # Keypoints scaled to new dimensions
>>> resized_keypoint_labels = result['keypoint_labels']  # Labels remain unchanged
>>>
>>> # Note: When resizing from 100x100 to 224x224:
>>> # - The red square will be scaled from (25-75) to approximately (56-168)
>>> # - The keypoint at (50, 50) will move to approximately (112, 112)
>>> # - All spatial relationships are preserved but coordinates are scaled
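
The coordinate figures in the closing comments follow from scaling x by the width ratio and y by the height ratio. The sketch below reproduces that arithmetic. resize_point and resize_bbox are hypothetical helpers written for illustration, and the values the library returns may differ slightly, as the word "approximately" above suggests.

```python
def resize_point(x, y, old_shape, new_shape):
    """Hypothetical helper: scale an (x, y) point from old to new (height, width)."""
    old_h, old_w = old_shape
    new_h, new_w = new_shape
    return x * new_w / old_w, y * new_h / old_h


def resize_bbox(bbox, old_shape, new_shape):
    """Scale a pascal_voc box (x_min, y_min, x_max, y_max) corner by corner."""
    x_min, y_min, x_max, y_max = bbox
    x1, y1 = resize_point(x_min, y_min, old_shape, new_shape)
    x2, y2 = resize_point(x_max, y_max, old_shape, new_shape)
    return x1, y1, x2, y2
```

Going from (100, 100) to (224, 224), the box (25, 25, 75, 75) maps to (56, 56, 168, 168) and the keypoint (50, 50) maps to (112, 112).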

SmallestMaxSize class

SmallestMaxSize(
    max_size: int | Sequence | None,
    max_size_hw: tuple[int | None, int | None] | None,
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
    area_for_downscale: 'image' | 'image_mask' | None,
    p: float = 1
)

Rescale an image so that the smallest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters

max_size
    Type: int | Sequence | None
    Default: None
    Maximum size of the smallest side of the image after the transformation. When using a list, the max size will be randomly selected from the values in the list.

max_size_hw
    Type: tuple[int | None, int | None] | None
    Default: None
    Maximum (height, width) constraints. If specified, max_size must be None. Supports:
    - (height, width): Both dimensions must be at least these values
    - (height, None): Only height is constrained, width scales proportionally
    - (None, width): Only width is constrained, height scales proportionally

interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 1 (cv2.INTER_LINEAR)
    Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

mask_interpolation
    Type: 0 | 6 | 1 | 2 | 3 | 4 | 5
    Default: 0 (cv2.INTER_NEAREST)
    Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.

area_for_downscale
    Type: 'image' | 'image_mask' | None
    Default: None
    Controls automatic use of INTER_AREA interpolation for downscaling. Options:
    - None: No automatic interpolation selection, always use the specified interpolation method
    - "image": Use INTER_AREA when downscaling images, retain specified interpolation for upscaling and masks
    - "image_mask": Use INTER_AREA when downscaling both images and masks

p
    Type: float
    Default: 1
    Probability of applying the transform.
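
When max_size is a list, one value is picked per call. A minimal sketch of that selection follows. pick_max_size is a hypothetical helper, and a uniform choice among the values is an assumption, since the description only says the value is randomly selected.

```python
import random


def pick_max_size(max_size, rng=None):
    """Hypothetical helper: resolve max_size to a single integer."""
    rng = rng or random.Random()
    if isinstance(max_size, int):
        # A plain int is used as-is.
        return max_size
    # A sequence yields one of its values (uniform choice is an assumption).
    return rng.choice(list(max_size))
```

Passing a list such as [256, 384, 512] therefore varies the target size between calls, while a single int always gives the same target.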

Examples

>>> import numpy as np
>>> import albumentations as A
>>> # Using max_size
>>> transform1 = A.SmallestMaxSize(max_size=120, area_for_downscale="image")
>>> # Input image (100, 150) -> Output (120, 180)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200), area_for_downscale="image_mask")
>>> # Input (80, 160) -> Output (100, 200)
>>> # Input (160, 80) -> Output (400, 200)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))
>>> # Input (80, 160) -> Output (100, 200)

Notes

- This transform scales images based on their smallest side:
  * If the smallest side is **smaller** than max_size: the image will be **upscaled** (scale > 1.0)
  * If the smallest side is **equal** to max_size: the image will **not be resized** (scale = 1.0)
  * If the smallest side is **larger** than max_size: the image will be **downscaled** (scale < 1.0)
- This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
- For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA will be used for downscaling, providing better quality.
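
The documented outputs can be checked with the same kind of arithmetic as for the longest side, except that the larger ratio is used so every constrained side reaches its target. This is a sketch under stated assumptions: smallest_max_size_shape is a hypothetical name, and rounding to the nearest integer is assumed because it matches the examples. It is not the library's code.

```python
def smallest_max_size_shape(height, width, max_size=None, max_size_hw=None):
    """Hypothetical helper: output (height, width) for a smallest-side rescale."""
    if (max_size is None) == (max_size_hw is None):
        raise ValueError("Specify exactly one of max_size or max_size_hw")
    if max_size is not None:
        # The smallest side ends up equal to max_size.
        scale = max_size / min(height, width)
    else:
        target_h, target_w = max_size_hw
        ratios = []
        if target_h is not None:
            ratios.append(target_h / height)
        if target_w is not None:
            ratios.append(target_w / width)
        if not ratios:
            raise ValueError("max_size_hw needs at least one non-None value")
        # The larger ratio makes every constrained side at least its target.
        scale = max(ratios)
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(height * scale), round(width * scale)
```

For a (160, 80) input with max_size_hw=(100, 200), the width ratio 2.5 dominates, so the result is (400, 200) and the height ends up well above its target of 100.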