Rotation transforms (augmentations.geometric.functional)

class RandomRotate90

Randomly rotate the input by 90 degrees zero or more times.

Parameters:

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints

Image types: uint8, float32
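
A minimal usage sketch (array shapes here are illustrative, not required by the transform):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
mask = np.zeros((100, 100), dtype=np.uint8)

transform = A.RandomRotate90(p=1.0)
result = transform(image=image, mask=mask)
rotated_image = result["image"]  # rotated by 0, 90, 180, or 270 degrees
rotated_mask = result["mask"]    # rotated consistently with the image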

Source code in albumentations/augmentations/geometric/rotate.py
Python
class RandomRotate90(DualTransform):
    """Randomly rotate the input by 90 degrees zero or more times.

    Args:
        p: probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    def apply(self, img: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.rot90(img, factor)

    def get_params(self) -> dict[str, int]:
        # Random int in the range [0, 3]
        return {"factor": self.py_random.randint(0, 3)}

    def apply_to_bboxes(self, bboxes: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.bboxes_rot90(bboxes, factor)

    def apply_to_keypoints(self, keypoints: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.keypoints_rot90(keypoints, factor, params["shape"])

    def get_transform_init_args_names(self) -> tuple[()]:
        return ()

class Rotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None)

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT.

rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

crop_border (bool): Whether to crop the border after rotation. If True, the output image size might differ from the input. Default: False

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints

Image types: uint8, float32

Note

  • The rotation angle is randomly selected for each execution within the range specified by 'limit'.
  • When 'crop_border' is False, the output image will have the same size as the input, potentially introducing black triangles in the corners.
  • When 'crop_border' is True, the output image is cropped to remove black triangles, which may result in a smaller image.
  • Bounding boxes are rotated and may change size or shape.
  • Keypoints are rotated around the center of the image.

Mathematical Details:
  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
     [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
  5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.
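
A tiny numpy sketch of step 4, assuming an illustrative 100x100 image, a 30° angle, and the center taken at ((width - 1) / 2, (height - 1) / 2):

Python
import numpy as np

theta = np.deg2rad(30)                    # sampled angle (illustrative)
cx, cy = (100 - 1) / 2, (100 - 1) / 2     # center of a 100x100 image
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x, y = 80.0, 20.0                         # an arbitrary source point
x_new, y_new = R @ np.array([x - cx, y - cy]) + np.array([cx, cy])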

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Rotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees
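
A hedged sketch combining crop_border with bounding boxes (parameter values and box coordinates are made up; bounding boxes require bbox_params in A.Compose):

Python
import cv2
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 200, 3), dtype=np.uint8)

transform = A.Compose(
    [A.Rotate(limit=30, crop_border=True, border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
result = transform(image=image, bboxes=[(20, 20, 80, 60)], labels=[1])

# With crop_border=True the output may be smaller than the 100x200 input,
# and boxes are cropped to the remaining region.
print(result["image"].shape, result["bboxes"])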

Source code in albumentations/augmentations/geometric/rotate.py
Python
class Rotate(DualTransform):
    """Rotate the input by an angle selected randomly from the uniform distribution.

    Args:
        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,
            an angle is picked from (-limit, limit). Default: (-90, 90)
        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:
            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_LINEAR.
        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:
            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.
            Default: cv2.BORDER_REFLECT_101
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.
            Default: 'largest_box'
        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ
            from the input. Default: False
        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.
            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_NEAREST.
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    Note:
        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.
        - When 'crop_border' is False, the output image will have the same size as the input, potentially
          introducing black triangles in the corners.
        - When 'crop_border' is True, the output image is cropped to remove black triangles, which may result
          in a smaller image.
        - Bounding boxes are rotated and may change size or shape.
        - Keypoints are rotated around the center of the image.

    Mathematical Details:
        1. An angle θ is randomly sampled from the range specified by 'limit'.
        2. The image is rotated around its center by θ degrees.
        3. The rotation matrix R is:
           R = [cos(θ)  -sin(θ)]
               [sin(θ)   cos(θ)]
        4. Each point (x, y) in the image is transformed to (x', y') by:
           [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
           [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
           where (cx, cy) is the center of the image.
        5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.

    Example:
        >>> import numpy as np
        >>> import albumentations as A
        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
        >>> transform = A.Rotate(limit=45, p=1.0)
        >>> result = transform(image=image)
        >>> rotated_image = result['image']
        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees
    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    class InitSchema(RotateInitSchema):
        rotate_method: Literal["largest_box", "ellipse"]
        crop_border: bool

        fill: ColorType
        fill_mask: ColorType

        value: ColorType | None = Field(default=None, deprecated="Deprecated use fill instead")
        mask_value: ColorType | None = Field(default=None, deprecated="Deprecated use fill_mask instead")

        @model_validator(mode="after")
        def validate_value(self) -> Self:
            if self.value is not None:
                self.fill = self.value
            if self.mask_value is not None:
                self.fill_mask = self.mask_value
            return self

    def __init__(
        self,
        limit: ScaleFloatType = (-90, 90),
        interpolation: int = cv2.INTER_LINEAR,
        border_mode: int = cv2.BORDER_REFLECT_101,
        value: ColorType | None = None,
        mask_value: ColorType | None = None,
        rotate_method: Literal["largest_box", "ellipse"] = "largest_box",
        crop_border: bool = False,
        mask_interpolation: int = cv2.INTER_NEAREST,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(p=p, always_apply=always_apply)
        self.limit = cast(tuple[float, float], limit)
        self.interpolation = interpolation
        self.mask_interpolation = mask_interpolation
        self.border_mode = border_mode
        self.fill = fill
        self.fill_mask = fill_mask
        self.rotate_method = rotate_method
        self.crop_border = crop_border

    def apply(
        self,
        img: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        img_out = fgeometric.warp_affine(
            img,
            matrix,
            self.interpolation,
            self.fill,
            self.border_mode,
            params["shape"][:2],
        )
        if self.crop_border:
            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)
        return img_out

    def apply_to_mask(
        self,
        mask: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        img_out = fgeometric.warp_affine(
            mask,
            matrix,
            self.mask_interpolation,
            self.fill_mask,
            self.border_mode,
            params["shape"][:2],
        )
        if self.crop_border:
            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)
        return img_out

    def apply_to_bboxes(
        self,
        bboxes: np.ndarray,
        bbox_matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        image_shape = params["shape"][:2]
        bboxes_out = fgeometric.bboxes_affine(
            bboxes,
            bbox_matrix,
            self.rotate_method,
            image_shape,
            self.border_mode,
            image_shape,
        )
        if self.crop_border:
            return fcrops.crop_bboxes_by_coords(bboxes_out, (x_min, y_min, x_max, y_max), image_shape)
        return bboxes_out

    def apply_to_keypoints(
        self,
        keypoints: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        keypoints_out = fgeometric.keypoints_affine(
            keypoints,
            matrix,
            params["shape"][:2],
            scale={"x": 1, "y": 1},
            border_mode=self.border_mode,
        )
        if self.crop_border:
            return fcrops.crop_keypoints_by_coords(keypoints_out, (x_min, y_min, x_max, y_max))
        return keypoints_out

    @staticmethod
    def _rotated_rect_with_max_area(height: int, width: int, angle: float) -> dict[str, int]:
        """Given a rectangle of size wxh that has been rotated by 'angle' (in
        degrees), computes the width and height of the largest possible
        axis-aligned rectangle (maximal area) within the rotated rectangle.

        Reference:
            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders
        """
        angle = math.radians(angle)
        width_is_longer = width >= height
        side_long, side_short = (width, height) if width_is_longer else (height, width)

        # since the solutions for angle, -angle and 180-angle are all the same,
        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:
        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:
            # half constrained case: two crop corners touch the longer side,
            # the other two corners are on the mid-line parallel to the longer line
            x = 0.5 * side_short
            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
        else:
            # fully constrained case: crop touches all 4 sides
            cos_2a = cos_a * cos_a - sin_a * sin_a
            wr, hr = (width * cos_a - height * sin_a) / cos_2a, (height * cos_a - width * sin_a) / cos_2a

        return {
            "x_min": max(0, int(width / 2 - wr / 2)),
            "x_max": min(width, int(width / 2 + wr / 2)),
            "y_min": max(0, int(height / 2 - hr / 2)),
            "y_max": min(height, int(height / 2 + hr / 2)),
        }

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        angle = self.py_random.uniform(*self.limit)

        if self.crop_border:
            height, width = params["shape"][:2]
            out_params = self._rotated_rect_with_max_area(height, width, angle)
        else:
            out_params = {"x_min": -1, "x_max": -1, "y_min": -1, "y_max": -1}

        center = fgeometric.center(params["shape"][:2])
        bbox_center = fgeometric.center_bbox(params["shape"][:2])

        translate: fgeometric.XYInt = {"x": 0, "y": 0}
        shear: fgeometric.XYFloat = {"x": 0, "y": 0}
        scale: fgeometric.XYFloat = {"x": 1, "y": 1}
        rotate = angle

        matrix = fgeometric.create_affine_transformation_matrix(translate, shear, scale, rotate, center)
        bbox_matrix = fgeometric.create_affine_transformation_matrix(translate, shear, scale, rotate, bbox_center)
        out_params["matrix"] = matrix
        out_params["bbox_matrix"] = bbox_matrix

        return out_params

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "limit",
            "interpolation",
            "border_mode",
            "fill",
            "fill_mask",
            "rotate_method",
            "crop_border",
            "mask_interpolation",
        )

class RotateAndProject (x_angle_range=(-15, 15), y_angle_range=(-15, 15), z_angle_range=(-15, 15), focal_range=(0.5, 1.5), border_mode=0, fill=0, fill_mask=0, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None)

Applies 3D rotation to an image and projects it back to 2D plane using perspective projection.

This transform simulates viewing a 2D image from different 3D viewpoints by:
  1. Rotating the image around three axes (X, Y, Z) in 3D space
  2. Applying perspective projection to map the rotated image back to 2D
  3. Handling different center calculations for images/keypoints and bounding boxes

The transform preserves aspect ratios and handles all target types (images, masks, keypoints, and bounding boxes) consistently.

Parameters:

x_angle_range (tuple[float, float]): Range for rotation around the x-axis in degrees. Positive angles rotate the top edge away from the viewer. Default: (-15, 15)

y_angle_range (tuple[float, float]): Range for rotation around the y-axis in degrees. Positive angles rotate the right edge away from the viewer. Default: (-15, 15)

z_angle_range (tuple[float, float]): Range for rotation around the z-axis in degrees. Positive angles rotate clockwise in the image plane. Default: (-15, 15)

focal_range (tuple[float, float]): Range for the focal length of the perspective projection. Controls the strength of the perspective effect:
  - Values < 1.0: Strong perspective (wide-angle lens effect)
  - Value = 1.0: Normal perspective
  - Values > 1.0: Weak perspective (telephoto lens effect)
  Default: (0.5, 1.5)

border_mode (OpenCV flag): Padding mode for borders after rotation. Should be one of:
  - cv2.BORDER_CONSTANT: pads with a constant value
  - cv2.BORDER_REFLECT: reflects border pixels
  - cv2.BORDER_REFLECT_101: reflects border pixels without duplicating edge pixels
  - cv2.BORDER_REPLICATE: replicates border pixels
  Default: cv2.BORDER_CONSTANT

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0

fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0

interpolation (OpenCV flag): Interpolation method for the image transformation. Should be one of:
  - cv2.INTER_NEAREST: nearest-neighbor interpolation
  - cv2.INTER_LINEAR: bilinear interpolation
  - cv2.INTER_CUBIC: bicubic interpolation
  Default: cv2.INTER_LINEAR

mask_interpolation (OpenCV flag): Interpolation method for the mask transformation. Default: cv2.INTER_NEAREST

p (float): Probability of applying the transform. Default: 0.5

Targets

image, mask, keypoints, bboxes

Image types: uint8, float32

Note

  • The transform maintains original image size
  • Uses different center calculations: (width - 1) / 2 for images and keypoints vs. width / 2 for bounding boxes
  • Handles all coordinate transformations in homogeneous coordinates
  • Applies proper perspective transformation to bounding boxes by transforming corners

Examples:

Python
>>> import albumentations as A
>>> transform = A.RotateAndProject(
...     x_angle_range=(-30, 30),
...     y_angle_range=(-30, 30),
...     z_angle_range=(-15, 15),
...     focal_range=(0.7, 1.3),
...     p=1.0
... )
>>> result = transform(image=image, bboxes=bboxes, keypoints=keypoints)
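
The example above assumes image, bboxes, and keypoints already exist; a self-contained variant might look like this (shapes, coordinates, and the bbox/keypoint formats are illustrative, and bbox/keypoint targets need the corresponding params in A.Compose):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

transform = A.Compose(
    [
        A.RotateAndProject(
            x_angle_range=(-30, 30),
            y_angle_range=(-30, 30),
            z_angle_range=(-15, 15),
            focal_range=(0.7, 1.3),
            p=1.0,
        ),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
    keypoint_params=A.KeypointParams(format="xy"),
)

result = transform(
    image=image,
    bboxes=[(10, 10, 60, 60)],
    labels=[0],
    keypoints=[(50, 50)],
)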

Source code in albumentations/augmentations/geometric/rotate.py
Python
class RotateAndProject(Perspective):
    """Applies 3D rotation to an image and projects it back to 2D plane using perspective projection.

    This transform simulates viewing a 2D image from different 3D viewpoints by:
    1. Rotating the image around three axes (X, Y, Z) in 3D space
    2. Applying perspective projection to map the rotated image back to 2D
    3. Handling different center calculations for images/keypoints and bounding boxes

    The transform preserves aspect ratios and handles all target types (images, masks,
    keypoints, and bounding boxes) consistently.

    Args:
        x_angle_range (tuple[float, float]): Range for rotation around x-axis in degrees.
            Positive angles rotate the top edge away from viewer.
            Default: (-15, 15)
        y_angle_range (tuple[float, float]): Range for rotation around y-axis in degrees.
            Positive angles rotate the right edge away from viewer.
            Default: (-15, 15)
        z_angle_range (tuple[float, float]): Range for rotation around z-axis in degrees.
            Positive angles rotate clockwise in image plane.
            Default: (-15, 15)
        focal_range (tuple[float, float]): Range for focal length of perspective projection.
            Controls the strength of perspective effect:
            - Values < 1.0: Strong perspective (wide-angle lens effect)
            - Value = 1.0: Normal perspective
            - Values > 1.0: Weak perspective (telephoto lens effect)
            Default: (0.5, 1.5)
        border_mode (OpenCV flag): Padding mode for borders after rotation.
            Should be one of:
            - cv2.BORDER_CONSTANT: pads with constant value
            - cv2.BORDER_REFLECT: reflects border pixels
            - cv2.BORDER_REFLECT_101: reflects border pixels without duplicating edge pixels
            - cv2.BORDER_REPLICATE: replicates border pixels
            Default: cv2.BORDER_CONSTANT
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
            Default: 0
        fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT.
            Default: 0
        interpolation (OpenCV flag): Interpolation method for image transformation.
            Should be one of:
            - cv2.INTER_NEAREST: nearest-neighbor interpolation
            - cv2.INTER_LINEAR: bilinear interpolation
            - cv2.INTER_CUBIC: bicubic interpolation
            Default: cv2.INTER_LINEAR
        mask_interpolation (OpenCV flag): Interpolation method for mask transformation.
            Default: cv2.INTER_NEAREST
        p (float): Probability of applying the transform.
            Default: 0.5

    Targets:
        image, mask, keypoints, bboxes

    Image types:
        uint8, float32

    Note:
        - The transform maintains original image size
        - Uses different center calculations for images/keypoints (width-1)/2 vs bboxes width/2
        - Handles all coordinate transformations in homogeneous coordinates
        - Applies proper perspective transformation to bounding boxes by transforming corners

    Example:
        >>> import albumentations as A
        >>> transform = A.RotateAndProject(
        ...     x_angle_range=(-30, 30),
        ...     y_angle_range=(-30, 30),
        ...     z_angle_range=(-15, 15),
        ...     focal_range=(0.7, 1.3),
        ...     p=1.0
        ... )
        >>> result = transform(image=image, bboxes=bboxes, keypoints=keypoints)
    """

    class InitSchema(BaseTransformInitSchema):
        x_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        y_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        z_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        focal_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        mask_interpolation: InterpolationType
        interpolation: InterpolationType
        border_mode: int
        fill: ColorType
        fill_mask: ColorType

    def __init__(
        self,
        x_angle_range: tuple[float, float] = (-15, 15),
        y_angle_range: tuple[float, float] = (-15, 15),
        z_angle_range: tuple[float, float] = (-15, 15),
        focal_range: tuple[float, float] = (0.5, 1.5),
        border_mode: int = cv2.BORDER_CONSTANT,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        interpolation: int = cv2.INTER_LINEAR,
        mask_interpolation: int = cv2.INTER_NEAREST,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(
            scale=(0, 0),  # Unused but required by parent
            keep_size=True,
            border_mode=border_mode,
            fill=fill,
            fill_mask=fill_mask,
            interpolation=interpolation,
            mask_interpolation=mask_interpolation,
            p=p,
        )
        self.x_angle_range = x_angle_range
        self.y_angle_range = y_angle_range
        self.z_angle_range = z_angle_range
        self.focal_range = focal_range
        self.fill = fill
        self.fill_mask = fill_mask
        self.interpolation = interpolation
        self.mask_interpolation = mask_interpolation

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        image_shape = params["shape"][:2]

        height, width = image_shape
        # Sample parameters
        x_angle = np.deg2rad(self.py_random.uniform(*self.x_angle_range))
        y_angle = np.deg2rad(self.py_random.uniform(*self.y_angle_range))
        z_angle = np.deg2rad(self.py_random.uniform(*self.z_angle_range))
        focal_length = self.py_random.uniform(*self.focal_range)

        # Get projection matrix
        matrix = fgeometric.get_projection_matrix(
            image_shape,
            x_angle,
            y_angle,
            z_angle,
            focal_length,
            fgeometric.center(image_shape),
        )

        matrix_bbox = fgeometric.get_projection_matrix(
            image_shape,
            x_angle,
            y_angle,
            z_angle,
            focal_length,
            fgeometric.center_bbox(image_shape),
        )

        return {"matrix": matrix, "max_height": height, "max_width": width, "matrix_bbox": matrix_bbox}

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "x_angle_range",
            "y_angle_range",
            "z_angle_range",
            "focal_range",
            "border_mode",
            "fill",
            "fill_mask",
            "interpolation",
            "mask_interpolation",
        )

    def apply_to_bboxes(
        self,
        bboxes: np.ndarray,
        matrix_bbox: np.ndarray,
        max_height: int,
        max_width: int,
        **params: Any,
    ) -> np.ndarray:
        return fgeometric.perspective_bboxes(bboxes, params["shape"], matrix_bbox, max_width, max_height, True)

class RotateInitSchema

Source code in albumentations/augmentations/geometric/rotate.py
Python
class RotateInitSchema(BaseTransformInitSchema):
    limit: SymmetricRangeType

    interpolation: InterpolationType
    mask_interpolation: InterpolationType

    border_mode: BorderModeType

    fill: ColorType | None
    fill_mask: ColorType | None

class SafeRotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None)

Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.

Parameters:

limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT.

rotate_method (Literal["largest_box", "ellipse"]): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints

Image types: uint8, float32

Note

  • The rotation is performed around the center of the image.
  • After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
  • The output image will always have the same dimensions as the input image.
  • Bounding boxes and keypoints are transformed along with the image.

Mathematical Details:
  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
     s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
             height / (width * |sin(θ)| + height * |cos(θ)|))
  5. The combined transformation matrix T is:
     T = [s*cos(θ)  -s*sin(θ)  tx]
         [s*sin(θ)   s*cos(θ)  ty]
     where tx and ty are translation factors to keep the image centered.
  6. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [s*cos(θ)   s*sin(θ)] [x - cx]   [cx]
     [y'] = [-s*sin(θ)  s*cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
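
A quick numpy check of step 4, using an illustrative 100x200 (height x width) frame and a 45° angle:

Python
import numpy as np

height, width = 100, 200
theta = np.deg2rad(45)
cos_a, sin_a = abs(np.cos(theta)), abs(np.sin(theta))

s = min(width / (width * cos_a + height * sin_a),
        height / (width * sin_a + height * cos_a))
# s < 1 here: the rotated image is shrunk so it still fits inside the 100x200 frame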

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.SafeRotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
# scaled to fit within the original 100x100 frame
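
A quick check of the size guarantee (the limit is pinned to a single 45° value here purely for illustration):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
rotated = A.SafeRotate(limit=(45, 45), p=1.0)(image=image)["image"]
assert rotated.shape == image.shape  # SafeRotate always preserves the input dimensions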

Source code in albumentations/augmentations/geometric/rotate.py
Python
class SafeRotate(Affine):
    """Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

    This transformation ensures that the entire rotated image fits within the original frame by scaling it
    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the
    rotation and scaling process.

    Args:
        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,
            an angle is picked from (-limit, limit). Default: (-90, 90)
        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:
            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_LINEAR.
        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:
            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.
            Default: cv2.BORDER_REFLECT_101
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied
            for masks.
        rotate_method (Literal["largest_box", "ellipse"]): Method to rotate bounding boxes.
            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'
        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.
            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_NEAREST.
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    Note:
        - The rotation is performed around the center of the image.
        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
        - The output image will always have the same dimensions as the input image.
        - Bounding boxes and keypoints are transformed along with the image.

    Mathematical Details:
        1. An angle θ is randomly sampled from the range specified by 'limit'.
        2. The image is rotated around its center by θ degrees.
        3. The rotation matrix R is:
           R = [cos(θ)  -sin(θ)]
               [sin(θ)   cos(θ)]
        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
           s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
                   height / (width * |sin(θ)| + height * |cos(θ)|))
        5. The combined transformation matrix T is:
           T = [s*cos(θ)  -s*sin(θ)  tx]
               [s*sin(θ)   s*cos(θ)  ty]
           where tx and ty are translation factors to keep the image centered.
        6. Each point (x, y) in the image is transformed to (x', y') by:
           [x']   [s*cos(θ)   s*sin(θ)] [x - cx]   [cx]
           [y'] = [-s*sin(θ)  s*cos(θ)] [y - cy] + [cy]
           where (cx, cy) is the center of the image.

    Example:
        >>> import numpy as np
        >>> import albumentations as A
        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
        >>> transform = A.SafeRotate(limit=45, p=1.0)
        >>> result = transform(image=image)
        >>> rotated_image = result['image']
        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
        # scaled to fit within the original 100x100 frame
    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    class InitSchema(RotateInitSchema):
        rotate_method: Literal["largest_box", "ellipse"]

    def __init__(
        self,
        limit: ScaleFloatType = (-90, 90),
        interpolation: int = cv2.INTER_LINEAR,
        border_mode: int = cv2.BORDER_REFLECT_101,
        value: ColorType | None = None,
        mask_value: ColorType | None = None,
        rotate_method: Literal["largest_box", "ellipse"] = "largest_box",
        mask_interpolation: int = cv2.INTER_NEAREST,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(
            rotate=limit,
            interpolation=interpolation,
            border_mode=border_mode,
            fill=fill,
            fill_mask=fill_mask,
            rotate_method=rotate_method,
            fit_output=True,
            mask_interpolation=mask_interpolation,
            p=p,
        )
        self.limit = cast(tuple[float, float], limit)
        self.interpolation = interpolation
        self.border_mode = border_mode
        self.fill = fill
        self.fill_mask = fill_mask
        self.rotate_method = rotate_method
        self.mask_interpolation = mask_interpolation

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "limit",
            "interpolation",
            "border_mode",
            "fill",
            "fill_mask",
            "rotate_method",
            "mask_interpolation",
        )

    def _create_safe_rotate_matrix(
        self,
        angle: float,
        center: tuple[float, float],
        image_shape: tuple[int, int],
    ) -> tuple[np.ndarray, dict[str, float]]:
        height, width = image_shape[:2]
        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)

        # Calculate new image size
        abs_cos = abs(rotation_mat[0, 0])
        abs_sin = abs(rotation_mat[0, 1])
        new_w = int(height * abs_sin + width * abs_cos)
        new_h = int(height * abs_cos + width * abs_sin)

        # Adjust the rotation matrix to take into account the new size
        rotation_mat[0, 2] += new_w / 2 - center[0]
        rotation_mat[1, 2] += new_h / 2 - center[1]

        # Calculate scaling factors
        scale_x = width / new_w
        scale_y = height / new_h

        # Create scaling matrix
        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])

        # Combine rotation and scaling
        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])

        return matrix, {"x": scale_x, "y": scale_y}

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        image_shape = params["shape"][:2]
        angle = self.py_random.uniform(*self.limit)

        # Calculate centers for image and bbox
        image_center = fgeometric.center(image_shape)
        bbox_center = fgeometric.center_bbox(image_shape)

        # Create matrices for image and bbox
        matrix, scale = self._create_safe_rotate_matrix(angle, image_center, image_shape)
        bbox_matrix, _ = self._create_safe_rotate_matrix(angle, bbox_center, image_shape)

        return {
            "rotate": angle,
            "scale": scale,
            "matrix": matrix,
            "bbox_matrix": bbox_matrix,
            "output_shape": image_shape,
        }