albumentations.augmentations.geometric.resize
Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.
Members
- class LongestMaxSize
- class RandomScale
- class Resize
- class SmallestMaxSize
LongestMaxSize class
LongestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | One of: int, Sequence, None | - | Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided. Default: None. |
| max_size_hw | One of: tuple[int \| None, int \| None], None | - | Maximum (height, width) constraints. Supports: (height, width): both dimensions must fit within these bounds; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 1 | Interpolation method. Default: cv2.INTER_LINEAR. |
| mask_interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 0 | Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | One of: 'image', 'image_mask', None | - | Controls automatic use of INTER_AREA interpolation for downscaling. Options: None: no automatic interpolation selection, always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
Examples
>>> import albumentations as A
>>> import cv2
>>> # Using max_size
>>> transform1 = A.LongestMaxSize(max_size=1024, area_for_downscale="image")
>>> # Input image (1500, 800) -> Output (1024, 546)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024), area_for_downscale="image_mask")
>>> # Input (1500, 800) -> Output (800, 427)
>>> # Input (800, 1500) -> Output (546, 1024)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))
>>> # Input (1500, 800) -> Output (800, 427)
>>>
>>> # Common use case with padding
>>> transform4 = A.Compose([
...     A.LongestMaxSize(max_size=1024, area_for_downscale="image"),
...     A.PadIfNeeded(min_height=1024, min_width=1024),
... ])
Notes
- This transform scales images based on their longest side:
  * If the longest side is **smaller** than max_size: the image will be **upscaled** (scale > 1.0)
  * If the longest side is **equal** to max_size: the image will **not be resized** (scale = 1.0)
  * If the longest side is **larger** than max_size: the image will be **downscaled** (scale < 1.0)
- This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
- For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA will be used for downscaling, providing better quality.
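The scale rules in the notes above can be sketched in plain Python. This only illustrates the arithmetic and is not the library's implementation: the helper name `longest_max_size_shape` is hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the documented input/output examples.

```python
def longest_max_size_shape(height, width, max_size=None, max_size_hw=None):
    """Return the (height, width) expected after a LongestMaxSize-style rescale.

    Hypothetical helper for illustration; the rounding mode is an assumption.
    """
    if (max_size is None) == (max_size_hw is None):
        raise ValueError("Specify exactly one of max_size or max_size_hw")

    if max_size is not None:
        # The longest side ends up equal to max_size.
        scale = max_size / max(height, width)
    else:
        max_h, max_w = max_size_hw
        # Each given bound yields a candidate scale; taking the smallest
        # keeps every constrained side within its bound.
        candidates = []
        if max_h is not None:
            candidates.append(max_h / height)
        if max_w is not None:
            candidates.append(max_w / width)
        if not candidates:
            raise ValueError("max_size_hw needs at least one non-None value")
        scale = min(candidates)

    # One scale for both sides, so the aspect ratio is preserved.
    return round(height * scale), round(width * scale)


print(longest_max_size_shape(1500, 800, max_size=1024))          # (1024, 546)
print(longest_max_size_shape(800, 1500, max_size_hw=(800, 1024)))  # (546, 1024)
```

Both printed shapes agree with the examples given for `transform1` and `transform2` above.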
RandomScale class
RandomScale(
scale_limit: tuple[float, float] | float = (-0.1, 0.1),
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 0.5
)
Randomly resize the input. Output image size is different from the input image size.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| scale_limit | One of: tuple[float, float], float | (-0.1, 0.1) | Scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). |
| interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 1 | Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 0 | Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | One of: 'image', 'image_mask', None | - | Controls automatic use of INTER_AREA interpolation for downscaling. Options: None: no automatic interpolation selection, always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 0.5 | probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize scaling effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Apply RandomScale transform with comprehensive parameters
>>> transform = A.Compose([
...     A.RandomScale(
...         scale_limit=(-0.3, 0.5),  # Scale between 0.7x and 1.5x
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA for image downscaling
...         p=1.0  # Always apply
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> scaled_image = result['image'] # Dimensions will be between 70-150 pixels
>>> scaled_mask = result['mask'] # Mask scaled proportionally to image
>>> scaled_bboxes = result['bboxes'] # Bounding boxes adjusted to new dimensions
>>> scaled_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> scaled_keypoints = result['keypoints'] # Keypoints adjusted to new dimensions
>>> scaled_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # The image dimensions will vary based on the randomly sampled scale factor
>>> # With scale_limit=(-0.3, 0.5), dimensions could be anywhere from 70% to 150% of original
Notes
- The output image size is different from the input image size.
- Scale factor is sampled independently per image side (width and height).
- Bounding box coordinates are scaled accordingly.
- Keypoint coordinates are scaled accordingly.
- When area_for_downscale is set, INTER_AREA interpolation will be used automatically for downscaling (scale < 1.0), which provides better quality for size reduction.
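The way scale_limit is biased by 1 can be illustrated with a short plain-Python sketch. The helper names `scale_range` and `sample_scale` are hypothetical, and uniform sampling of a single factor is an assumption made for the illustration; the library's own sampling code may differ.

```python
import random


def scale_range(scale_limit):
    """Convert scale_limit into the (low, high) range of scale factors."""
    if isinstance(scale_limit, (int, float)):
        # A single value means the symmetric range (-scale_limit, scale_limit).
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    # The limits are biased by 1, so a limit of 0.0 means "keep the original size".
    return 1 + low, 1 + high


def sample_scale(scale_limit, rng=random):
    """Draw one scale factor from the biased range (uniform draw is an assumption)."""
    low, high = scale_range(scale_limit)
    return rng.uniform(low, high)


low, high = scale_range((-0.3, 0.5))           # roughly (0.7, 1.5)
scale = sample_scale((-0.3, 0.5), random.Random(0))
print(low, high, scale)
```

With scale_limit=(-0.3, 0.5), a 100-pixel side would therefore end up somewhere between about 70 and 150 pixels, as in the example above.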
Resize class
Resize(
height: int,
width: int,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Resize the input to the given height and width.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| height | int | - | desired height of the output. |
| width | int | - | desired width of the output. |
| interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 1 | Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 0 | Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | One of: 'image', 'image_mask', None | - | Controls automatic use of INTER_AREA interpolation for downscaling. Options: None: no automatic interpolation selection, always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
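The three area_for_downscale options in the parameter table can be read as a small decision rule. The sketch below is a plain-Python illustration of that rule only: `pick_interpolation` is a hypothetical helper, and whether a given resize counts as a downscale is passed in as a flag because the exact test the library uses is not stated here. The constants mirror the OpenCV flag values so the sketch runs without cv2.

```python
# OpenCV interpolation flag values, written out so cv2 is not required.
INTER_NEAREST = 0
INTER_LINEAR = 1
INTER_AREA = 3


def pick_interpolation(target, specified, is_downscale, area_for_downscale=None):
    """Choose the interpolation flag for one target ("image" or "mask").

    Hypothetical helper that mirrors the documented option descriptions.
    """
    if area_for_downscale not in (None, "image", "image_mask"):
        raise ValueError(f"Unknown area_for_downscale: {area_for_downscale!r}")
    if area_for_downscale is None or not is_downscale:
        # No automatic selection, or upscaling: keep the specified method.
        return specified
    if area_for_downscale == "image":
        # Only images switch to INTER_AREA; masks keep their own method.
        return INTER_AREA if target == "image" else specified
    # "image_mask": both images and masks switch to INTER_AREA.
    return INTER_AREA


print(pick_interpolation("image", INTER_LINEAR, True, "image"))       # 3
print(pick_interpolation("mask", INTER_NEAREST, True, "image"))       # 0
print(pick_interpolation("mask", INTER_NEAREST, True, "image_mask"))  # 3
```

Under this reading, the "image" setting keeps nearest-neighbor interpolation for masks when downscaling, while "image_mask" applies INTER_AREA to masks as well.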
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize resize effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Resize all data to 224x224 (common input size for many CNNs)
>>> transform = A.Compose([
...     A.Resize(
...         height=224,
...         width=224,
...         interpolation=cv2.INTER_LINEAR,
...         mask_interpolation=cv2.INTER_NEAREST,
...         area_for_downscale="image",  # Use INTER_AREA when downscaling images
...         p=1.0
...     )
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> resized_image = result['image'] # Shape will be (224, 224, 3)
>>> resized_mask = result['mask'] # Shape will be (224, 224)
>>> resized_bboxes = result['bboxes'] # Bounding boxes scaled to new dimensions
>>> resized_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> resized_keypoints = result['keypoints'] # Keypoints scaled to new dimensions
>>> resized_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # Note: When resizing from 100x100 to 224x224:
>>> # - The red square will be scaled from (25-75) to approximately (56-168)
>>> # - The keypoint at (50, 50) will move to approximately (112, 112)
>>> # - All spatial relationships are preserved but coordinates are scaled
SmallestMaxSize class
SmallestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that the smallest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | One of: int, Sequence, None | - | Maximum size of the smallest side of the image after the transformation. When using a list, the max size will be randomly selected from the values in the list. Default: None. |
| max_size_hw | One of: tuple[int \| None, int \| None], None | - | Maximum (height, width) constraints. Supports: (height, width): both dimensions must be at least these values; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 1 | Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | One of: 0, 1, 2, 3, 4, 5, 6 | 0 | Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | One of: 'image', 'image_mask', None | - | Controls automatic use of INTER_AREA interpolation for downscaling. Options: None: no automatic interpolation selection, always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | Probability of applying the transform. Default: 1. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> # Using max_size
>>> transform1 = A.SmallestMaxSize(max_size=120, area_for_downscale="image")
>>> # Input image (100, 150) -> Output (120, 180)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200), area_for_downscale="image_mask")
>>> # Input (80, 160) -> Output (100, 200)
>>> # Input (160, 80) -> Output (400, 200)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))
>>> # Input (80, 160) -> Output (100, 200)
Notes
- This transform scales images based on their smallest side:
  * If the smallest side is **smaller** than max_size: the image will be **upscaled** (scale > 1.0)
  * If the smallest side is **equal** to max_size: the image will **not be resized** (scale = 1.0)
  * If the smallest side is **larger** than max_size: the image will be **downscaled** (scale < 1.0)
- This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
- For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA will be used for downscaling, providing better quality.
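The same arithmetic for the smallest side can be sketched in plain Python. As before, this is an illustration rather than the library's code: `smallest_max_size_shape` is a hypothetical helper, and nearest-integer rounding is an assumption that happens to reproduce the documented examples.

```python
def smallest_max_size_shape(height, width, max_size=None, max_size_hw=None):
    """Return the (height, width) expected after a SmallestMaxSize-style rescale.

    Hypothetical helper for illustration; the rounding mode is an assumption.
    """
    if (max_size is None) == (max_size_hw is None):
        raise ValueError("Specify exactly one of max_size or max_size_hw")

    if max_size is not None:
        # The smallest side ends up equal to max_size.
        scale = max_size / min(height, width)
    else:
        target_h, target_w = max_size_hw
        # Each given value yields a candidate scale; taking the largest
        # makes every constrained side at least as big as its target.
        candidates = []
        if target_h is not None:
            candidates.append(target_h / height)
        if target_w is not None:
            candidates.append(target_w / width)
        if not candidates:
            raise ValueError("max_size_hw needs at least one non-None value")
        scale = max(candidates)

    # One scale for both sides, so the aspect ratio is preserved.
    return round(height * scale), round(width * scale)


print(smallest_max_size_shape(100, 150, max_size=120))          # (120, 180)
print(smallest_max_size_shape(160, 80, max_size_hw=(100, 200)))  # (400, 200)
```

The second call shows why the result may be larger than specified: the width reaches its 200-pixel target while the height overshoots the 100-pixel target to keep the aspect ratio.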