albumentations.augmentations.geometric.resize
Transforms that rescale images and their targets (masks, bounding boxes, keypoints): fixed resize, longest-side and smallest-side rescaling, random scaling, and YOLO-style letterboxing (equivalent to LongestMaxSize + PadIfNeeded).
Members
- class LetterBox
- class LongestMaxSize
- class RandomScale
- class Resize
- class SmallestMaxSize
class LetterBox
LetterBox(
size: tuple[int, int],
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
fill: tuple[float, ...] | float = 114,
fill_mask: tuple[float, ...] | float = 0,
position: 'center' | 'top_left' | 'top_right' | 'bottom_left' | 'bottom_right' | 'random' = 'center',
p: float = 1.0
)
Scale image to fit a target canvas preserving aspect ratio, then pad to the exact canvas size: YOLO letterbox, equivalent to LongestMaxSize + PadIfNeeded. The image is downscaled or upscaled so its longest side fits the target, then constant-color padding fills the remaining area. All targets (masks, bboxes, keypoints) are adjusted accordingly.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| size | tuple[int, int] | - | Target `(height, width)` of the output canvas. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation method used when resizing the image. Default: `cv2.INTER_LINEAR`. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation method used when resizing masks. Default: `cv2.INTER_NEAREST`. |
| fill | tuple[float, ...] \| float | 114 | Constant pixel value for image padding. Default: `114`. |
| fill_mask | tuple[float, ...] \| float | 0 | Constant pixel value for mask padding. Default: `0`. |
| position | str | center | Where to place the resized image on the canvas. One of: 'center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'. Default: `"center"`. |
| p | float | 1.0 | Probability of applying the transform. Default: `1.0`. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (480, 640), dtype=np.uint8)
>>> bboxes = np.array([[100, 80, 300, 200]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[200, 150]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
... A.LetterBox(size=(640, 640), fill=114, fill_mask=0, p=1.0)
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels,
... )
>>> result['image'].shape
(640, 640, 3)
Notes
- The output size is always exactly `(height, width)`.
- Images smaller than the target are upscaled; images larger are downscaled.
- Bounding boxes and keypoints are adjusted for both the resize and padding steps.
- `fill=114` is the YOLO convention for letterbox padding.
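The letterbox geometry can be sketched in plain Python. This is a hypothetical helper for illustration, not the library implementation; the library may round pixel sizes slightly differently:

```python
def letterbox_geometry(h, w, target_h, target_w):
    """Return (new_h, new_w, pad_top, pad_left) for a center-positioned letterbox."""
    # Fit the longest side into the target canvas while keeping the aspect ratio.
    scale = min(target_h / h, target_w / w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Remaining canvas area is split evenly (center position).
    pad_top = (target_h - new_h) // 2
    pad_left = (target_w - new_w) // 2
    return new_h, new_w, pad_top, pad_left

# A 480x640 image on a 640x640 canvas keeps its size and gets 80 px of top padding.
print(letterbox_geometry(480, 640, 640, 640))  # (480, 640, 80, 0)
```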
class LongestMaxSize
LongestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that its longest side equals max_size, or so its sides meet the max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | int \| Sequence[int] \| None | None | Maximum size of the longest side after the transformation. When a list or tuple is given, the max size is randomly selected from the provided values. Default: None. |
| max_size_hw | tuple[int \| None, int \| None] \| None | None | Maximum (height, width) constraints. (height, width): both dimensions must fit within these bounds; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation method. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
Examples
>>> import albumentations as A
>>> import cv2
>>> # Using max_size
>>> transform1 = A.LongestMaxSize(max_size=1024, area_for_downscale="image")
>>> # Input image (1500, 800) -> Output (1024, 546)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024), area_for_downscale="image_mask")
>>> # Input (1500, 800) -> Output (800, 427)
>>> # Input (800, 1500) -> Output (546, 1024)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))
>>> # Input (1500, 800) -> Output (800, 427)
>>>
>>> # Common use case with padding
>>> transform4 = A.Compose([
... A.LongestMaxSize(max_size=1024, area_for_downscale="image"),
... A.PadIfNeeded(min_height=1024, min_width=1024),
... ])
Notes
- This transform scales images based on their longest side:
  * If the longest side is **smaller** than max_size, the image is **upscaled** (scale > 1.0).
  * If the longest side is **equal** to max_size, the image is **not resized** (scale = 1.0).
  * If the longest side is **larger** than max_size, the image is **downscaled** (scale < 1.0).
- This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
- For non-square images, both sides are scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA is used for downscaling, providing better quality.
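The output shape for the max_size case follows directly from the longest side. A minimal sketch (hypothetical helper, not the library code; the library may round at the pixel level differently):

```python
def longest_max_size_shape(h, w, max_size):
    """Shape after scaling so the longest side equals max_size (aspect ratio kept)."""
    scale = max_size / max(h, w)
    return round(h * scale), round(w * scale)

# Matches the (1500, 800) -> (1024, 546) example above.
print(longest_max_size_shape(1500, 800, 1024))  # (1024, 546)
```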
class RandomScale
RandomScale(
scale_limit: tuple[float, float] | float = (-0.1, 0.1),
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 0.5
)
Resize the input by a random scale factor sampled from scale_limit. The output size differs from the input size; all targets (masks, bboxes, keypoints) are scaled together. Useful for scale augmentation without cropping.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| scale_limit | tuple[float, float] \| float | (-0.1, 0.1) | Scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that scale_limit is biased by 1: for a tuple (low, high), sampling is done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 0.5 | probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize scaling effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Apply RandomScale transform with comprehensive parameters
>>> transform = A.Compose([
... A.RandomScale(
... scale_limit=(-0.3, 0.5), # Scale between 0.7x and 1.5x
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... area_for_downscale="image", # Use INTER_AREA for image downscaling
... p=1.0 # Always apply
... )
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> scaled_image = result['image'] # Dimensions will be between 70-150 pixels
>>> scaled_mask = result['mask'] # Mask scaled proportionally to image
>>> scaled_bboxes = result['bboxes'] # Bounding boxes adjusted to new dimensions
>>> scaled_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> scaled_keypoints = result['keypoints'] # Keypoints adjusted to new dimensions
>>> scaled_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # The image dimensions will vary based on the randomly sampled scale factor
>>> # With scale_limit=(-0.3, 0.5), dimensions could be anywhere from 70% to 150% of original
Notes
- The output image size differs from the input image size.
- A single scale factor is sampled and applied to both height and width, so the aspect ratio is preserved.
- Bounding box coordinates are scaled accordingly.
- Keypoint coordinates are scaled accordingly.
- When area_for_downscale is set, INTER_AREA interpolation is used automatically for downscaling (scale < 1.0), which provides better quality for size reduction.
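The "biased by 1" convention for scale_limit can be sketched as follows (a hypothetical illustration of the sampling rule described in the parameter table, not the library's internal code):

```python
import random

def sample_scale(scale_limit, rng=random):
    """Sample the multiplicative scale factor implied by scale_limit.

    A single float s expands to the range (-s, s); either way the drawn
    value is biased by 1, so (-0.1, 0.1) yields factors in [0.9, 1.1].
    """
    if isinstance(scale_limit, (int, float)):
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    return 1.0 + rng.uniform(low, high)

# With scale_limit=(-0.3, 0.5) the factor always lands in [0.7, 1.5].
factor = sample_scale((-0.3, 0.5))
assert 0.7 <= factor <= 1.5
```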
class Resize
Resize(
height: int,
width: int,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Resize the input to the given height and width. All targets (image, mask, bboxes, keypoints) are resized accordingly.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| height | int | - | desired height of the output. |
| width | int | - | desired width of the output. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | probability of applying the transform. Default: 1. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create sample data for demonstration
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Add some shapes to visualize resize effects
>>> cv2.rectangle(image, (25, 25), (75, 75), (255, 0, 0), -1) # Red square
>>> cv2.circle(image, (50, 50), 10, (0, 255, 0), -1) # Green circle
>>>
>>> # Create a mask for segmentation
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1 # Mask covering the red square
>>>
>>> # Create bounding boxes and keypoints
>>> bboxes = np.array([[25, 25, 75, 75]]) # Box around the red square
>>> bbox_labels = [1]
>>> keypoints = np.array([[50, 50]]) # Center of circle
>>> keypoint_labels = [0]
>>>
>>> # Resize all data to 224x224 (common input size for many CNNs)
>>> transform = A.Compose([
... A.Resize(
... height=224,
... width=224,
... interpolation=cv2.INTER_LINEAR,
... mask_interpolation=cv2.INTER_NEAREST,
... area_for_downscale="image", # Use INTER_AREA when downscaling images
... p=1.0
... )
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> # Apply the transform to all targets
>>> result = transform(
... image=image,
... mask=mask,
... bboxes=bboxes,
... bbox_labels=bbox_labels,
... keypoints=keypoints,
... keypoint_labels=keypoint_labels
... )
>>>
>>> # Get the transformed results
>>> resized_image = result['image'] # Shape will be (224, 224, 3)
>>> resized_mask = result['mask'] # Shape will be (224, 224)
>>> resized_bboxes = result['bboxes'] # Bounding boxes scaled to new dimensions
>>> resized_bbox_labels = result['bbox_labels'] # Labels remain unchanged
>>> resized_keypoints = result['keypoints'] # Keypoints scaled to new dimensions
>>> resized_keypoint_labels = result['keypoint_labels'] # Labels remain unchanged
>>>
>>> # Note: When resizing from 100x100 to 224x224:
>>> # - The red square will be scaled from (25-75) to approximately (56-168)
>>> # - The keypoint at (50, 50) will move to approximately (112, 112)
>>> # - All spatial relationships are preserved but coordinates are scaled
class SmallestMaxSize
SmallestMaxSize(
max_size: int | Sequence | None,
max_size_hw: tuple[int | None, int | None] | None,
interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
mask_interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 0,
area_for_downscale: 'image' | 'image_mask' | None,
p: float = 1
)
Rescale an image so that its smallest side equals max_size, or so its sides meet the max_size_hw constraints, keeping the aspect ratio.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| max_size | int \| Sequence[int] \| None | None | Maximum size of the smallest side of the image after the transformation. When a list is given, the max size is randomly selected from the values in the list. Default: None. |
| max_size_hw | tuple[int \| None, int \| None] \| None | None | Maximum (height, width) constraints. (height, width): both dimensions must be at least these values; (height, None): only height is constrained, width scales proportionally; (None, width): only width is constrained, height scales proportionally. If specified, max_size must be None. Default: None. |
| interpolation | OpenCV interpolation flag | 1 | Interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
| mask_interpolation | OpenCV interpolation flag | 0 | Interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
| area_for_downscale | 'image' \| 'image_mask' \| None | None | Controls automatic use of INTER_AREA interpolation for downscaling. None: always use the specified interpolation method; "image": use INTER_AREA when downscaling images, retain the specified interpolation for upscaling and masks; "image_mask": use INTER_AREA when downscaling both images and masks. Default: None. |
| p | float | 1 | Probability of applying the transform. Default: 1. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> # Using max_size
>>> transform1 = A.SmallestMaxSize(max_size=120, area_for_downscale="image")
>>> # Input image (100, 150) -> Output (120, 180)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200), area_for_downscale="image_mask")
>>> # Input (80, 160) -> Output (100, 200)
>>> # Input (160, 80) -> Output (400, 200)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))
>>> # Input (80, 160) -> Output (100, 200)
Notes
- This transform scales images based on their smallest side:
  * If the smallest side is **smaller** than max_size, the image is **upscaled** (scale > 1.0).
  * If the smallest side is **equal** to max_size, the image is **not resized** (scale = 1.0).
  * If the smallest side is **larger** than max_size, the image is **downscaled** (scale < 1.0).
- This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
- For non-square images, both sides are scaled proportionally to maintain the aspect ratio.
- Bounding boxes and keypoints are scaled accordingly.
- When area_for_downscale is set, INTER_AREA is used for downscaling, providing better quality.
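The max_size case mirrors LongestMaxSize with min in place of max. A minimal sketch (hypothetical helper, not the library code; pixel-level rounding may differ):

```python
def smallest_max_size_shape(h, w, max_size):
    """Shape after scaling so the smallest side equals max_size (aspect ratio kept)."""
    scale = max_size / min(h, w)
    return round(h * scale), round(w * scale)

# Matches the (100, 150) -> (120, 180) example above.
print(smallest_max_size_shape(100, 150, 120))  # (120, 180)
```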