Rotation transforms (augmentations.geometric.rotate)
class RandomRotate90
Randomly rotate the input by 90 degrees zero or more times.
Parameters:
Name | Type | Description |
---|---|---|
p | float | Probability of applying the transform. Default: 0.5. |
Targets
image, mask, bboxes, keypoints
Image types: uint8, float32
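As a rough illustration of what the sampled factor means, the sketch below rotates a small grid by 90 degrees `factor` times using only the standard library. The helper names `rot90_ccw` and `random_rotate90` are hypothetical, and the counter-clockwise direction is an assumption borrowed from `numpy.rot90`; it says nothing definitive about how `fgeometric.rot90` is implemented.

```python
import random


def rot90_ccw(grid):
    """Rotate a 2D list by 90 degrees counter-clockwise (hypothetical helper)."""
    # Transposing turns columns into rows; reversing the row order completes the turn.
    return [list(row) for row in zip(*grid)][::-1]


def random_rotate90(grid, rng):
    """Rotate `grid` by 90 degrees `factor` times, with factor drawn from [0, 3]."""
    factor = rng.randint(0, 3)  # inclusive on both ends, so four possible outcomes
    out = [list(row) for row in grid]
    for _ in range(factor):
        out = rot90_ccw(out)
    return out, factor


grid = [[1, 2],
        [3, 4]]
rotated, factor = random_rotate90(grid, random.Random(0))
```

Because the factor can be 0, an applied transform may still return the input unchanged; four successive quarter turns also reproduce the original grid.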
Source code in albumentations/augmentations/geometric/rotate.py
class RandomRotate90(DualTransform):
    """Randomly rotate the input by 90 degrees zero or more times.

    Args:
        p: probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    def apply(self, img: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.rot90(img, factor)

    def get_params(self) -> dict[str, int]:
        # Random int in the range [0, 3]
        return {"factor": self.py_random.randint(0, 3)}

    def apply_to_bboxes(self, bboxes: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.bboxes_rot90(bboxes, factor)

    def apply_to_keypoints(self, keypoints: np.ndarray, factor: int, **params: Any) -> np.ndarray:
        return fgeometric.keypoints_rot90(keypoints, factor, params["shape"])

    def get_transform_init_args_names(self) -> tuple[()]:
        return ()
class Rotate
(limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None)
Rotate the input by an angle selected randomly from the uniform distribution.
Parameters:
Name | Type | Description |
---|---|---|
limit | float or tuple[float, float] | Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90) |
interpolation | OpenCV flag | Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode | OpenCV flag | Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
fill | ColorType | Padding value if border_mode is cv2.BORDER_CONSTANT. |
fill_mask | ColorType | Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
rotate_method | str | Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box' |
crop_border | bool | Whether to crop border after rotation. If True, the output image size might differ from the input. Default: False |
mask_interpolation | OpenCV flag | Flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
p | float | Probability of applying the transform. Default: 0.5. |
Targets
image, mask, bboxes, keypoints
Image types: uint8, float32
Note
- The rotation angle is randomly selected for each execution within the range specified by 'limit'.
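- When 'crop_border' is False, the output image has the same size as the input; the corner regions that fall outside the source image are filled according to 'border_mode' (black triangles with cv2.BORDER_CONSTANT and the default fill of 0).
- When 'crop_border' is True, the output image is cropped to the largest axis-aligned rectangle that contains none of those filled corners, which may result in a smaller image.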
- Bounding boxes are rotated and may change size or shape.
- Keypoints are rotated around the center of the image.
Mathematical Details:
1. An angle θ is randomly sampled from the range specified by 'limit'.
2. The image is rotated around its center by θ degrees.
3. The rotation matrix R is:
   R = [cos(θ)  -sin(θ)]
       [sin(θ)   cos(θ)]
4. Each point (x, y) in the image is transformed to (x', y') by:
   [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
   [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
   where (cx, cy) is the center of the image.
5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.
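The point transform in step 4 can be checked numerically. The following is a minimal sketch using only the standard library; `rotate_point` is a hypothetical helper, and the center (50, 50) is chosen purely for illustration. Which way the result looks on screen depends on the direction of the y-axis and on the library's angle sign convention, which this sketch does not try to settle.

```python
import math


def rotate_point(x, y, angle_deg, center):
    """Rotate (x, y) about `center` using R = [[cos, -sin], [sin, cos]]."""
    theta = math.radians(angle_deg)
    cx, cy = center
    dx, dy = x - cx, y - cy  # shift so that the center becomes the origin
    x_new = math.cos(theta) * dx - math.sin(theta) * dy + cx
    y_new = math.sin(theta) * dx + math.cos(theta) * dy + cy
    return x_new, y_new


# Illustrative center for a 100x100 image (an assumption of this sketch).
center = (50.0, 50.0)
x_new, y_new = rotate_point(60.0, 50.0, 90.0, center)
```

Under this formula, a point 10 pixels along +x from the center ends up 10 pixels along +y from it after a 90 degree rotation, and the center itself never moves.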
Examples:
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Rotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees
Source code in albumentations/augmentations/geometric/rotate.py
class Rotate(DualTransform):
    """Rotate the input by an angle selected randomly from the uniform distribution.

    Args:
        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,
            an angle is picked from (-limit, limit). Default: (-90, 90)
        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:
            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_LINEAR.
        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:
            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.
            Default: cv2.BORDER_REFLECT_101
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.
            Default: 'largest_box'
        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ
            from the input. Default: False
        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for masks.
            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_NEAREST.
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    Note:
        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.
        - When 'crop_border' is False, the output image has the same size as the input; the corner regions
          that fall outside the source image are filled according to 'border_mode' (black triangles with
          cv2.BORDER_CONSTANT and the default fill of 0).
        - When 'crop_border' is True, the output image is cropped to the largest axis-aligned rectangle that
          contains none of those filled corners, which may result in a smaller image.
        - Bounding boxes are rotated and may change size or shape.
        - Keypoints are rotated around the center of the image.

    Mathematical Details:
        1. An angle θ is randomly sampled from the range specified by 'limit'.
        2. The image is rotated around its center by θ degrees.
        3. The rotation matrix R is:
           R = [cos(θ)  -sin(θ)]
               [sin(θ)   cos(θ)]
        4. Each point (x, y) in the image is transformed to (x', y') by:
           [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
           [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
           where (cx, cy) is the center of the image.
        5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.

    Example:
        >>> import numpy as np
        >>> import albumentations as A
        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
        >>> transform = A.Rotate(limit=45, p=1.0)
        >>> result = transform(image=image)
        >>> rotated_image = result['image']
        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees
    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    class InitSchema(RotateInitSchema):
        rotate_method: Literal["largest_box", "ellipse"]
        crop_border: bool

        fill: ColorType
        fill_mask: ColorType

        value: ColorType | None = Field(default=None, deprecated="Deprecated use fill instead")
        mask_value: ColorType | None = Field(default=None, deprecated="Deprecated use fill_mask instead")

        @model_validator(mode="after")
        def validate_value(self) -> Self:
            if self.value is not None:
                self.fill = self.value
            if self.mask_value is not None:
                self.fill_mask = self.mask_value
            return self

    def __init__(
        self,
        limit: ScaleFloatType = (-90, 90),
        interpolation: int = cv2.INTER_LINEAR,
        border_mode: int = cv2.BORDER_REFLECT_101,
        value: ColorType | None = None,
        mask_value: ColorType | None = None,
        rotate_method: Literal["largest_box", "ellipse"] = "largest_box",
        crop_border: bool = False,
        mask_interpolation: int = cv2.INTER_NEAREST,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(p=p, always_apply=always_apply)
        self.limit = cast(tuple[float, float], limit)
        self.interpolation = interpolation
        self.mask_interpolation = mask_interpolation
        self.border_mode = border_mode
        self.fill = fill
        self.fill_mask = fill_mask
        self.rotate_method = rotate_method
        self.crop_border = crop_border

    def apply(
        self,
        img: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        img_out = fgeometric.warp_affine(
            img,
            matrix,
            self.interpolation,
            self.fill,
            self.border_mode,
            params["shape"][:2],
        )
        if self.crop_border:
            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)
        return img_out

    def apply_to_mask(
        self,
        mask: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        img_out = fgeometric.warp_affine(
            mask,
            matrix,
            self.mask_interpolation,
            self.fill_mask,
            self.border_mode,
            params["shape"][:2],
        )
        if self.crop_border:
            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)
        return img_out

    def apply_to_bboxes(
        self,
        bboxes: np.ndarray,
        bbox_matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        image_shape = params["shape"][:2]
        bboxes_out = fgeometric.bboxes_affine(
            bboxes,
            bbox_matrix,
            self.rotate_method,
            image_shape,
            self.border_mode,
            image_shape,
        )
        if self.crop_border:
            return fcrops.crop_bboxes_by_coords(bboxes_out, (x_min, y_min, x_max, y_max), image_shape)
        return bboxes_out

    def apply_to_keypoints(
        self,
        keypoints: np.ndarray,
        matrix: np.ndarray,
        x_min: int,
        x_max: int,
        y_min: int,
        y_max: int,
        **params: Any,
    ) -> np.ndarray:
        keypoints_out = fgeometric.keypoints_affine(
            keypoints,
            matrix,
            params["shape"][:2],
            scale={"x": 1, "y": 1},
            border_mode=self.border_mode,
        )
        if self.crop_border:
            return fcrops.crop_keypoints_by_coords(keypoints_out, (x_min, y_min, x_max, y_max))
        return keypoints_out

    @staticmethod
    def _rotated_rect_with_max_area(height: int, width: int, angle: float) -> dict[str, int]:
        """Given a rectangle of size wxh that has been rotated by 'angle' (in
        degrees), computes the width and height of the largest possible
        axis-aligned rectangle (maximal area) within the rotated rectangle.

        Reference:
            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders
        """
        angle = math.radians(angle)
        width_is_longer = width >= height
        side_long, side_short = (width, height) if width_is_longer else (height, width)

        # since the solutions for angle, -angle and 180-angle are all the same,
        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:
        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:
            # half constrained case: two crop corners touch the longer side,
            # the other two corners are on the mid-line parallel to the longer line
            x = 0.5 * side_short
            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
        else:
            # fully constrained case: crop touches all 4 sides
            cos_2a = cos_a * cos_a - sin_a * sin_a
            wr, hr = (width * cos_a - height * sin_a) / cos_2a, (height * cos_a - width * sin_a) / cos_2a

        return {
            "x_min": max(0, int(width / 2 - wr / 2)),
            "x_max": min(width, int(width / 2 + wr / 2)),
            "y_min": max(0, int(height / 2 - hr / 2)),
            "y_max": min(height, int(height / 2 + hr / 2)),
        }

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        angle = self.py_random.uniform(*self.limit)

        if self.crop_border:
            height, width = params["shape"][:2]
            out_params = self._rotated_rect_with_max_area(height, width, angle)
        else:
            out_params = {"x_min": -1, "x_max": -1, "y_min": -1, "y_max": -1}

        center = fgeometric.center(params["shape"][:2])
        bbox_center = fgeometric.center_bbox(params["shape"][:2])

        translate: fgeometric.XYInt = {"x": 0, "y": 0}
        shear: fgeometric.XYFloat = {"x": 0, "y": 0}
        scale: fgeometric.XYFloat = {"x": 1, "y": 1}
        rotate = angle

        matrix = fgeometric.create_affine_transformation_matrix(translate, shear, scale, rotate, center)
        bbox_matrix = fgeometric.create_affine_transformation_matrix(translate, shear, scale, rotate, bbox_center)
        out_params["matrix"] = matrix
        out_params["bbox_matrix"] = bbox_matrix

        return out_params

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "limit",
            "interpolation",
            "border_mode",
            "fill",
            "fill_mask",
            "rotate_method",
            "crop_border",
            "mask_interpolation",
        )
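To get a feel for how much `crop_border=True` removes, the crop computation in `_rotated_rect_with_max_area` can be tried in isolation. The block below is a standalone copy of that arithmetic for experimentation; the function is renamed, and `SMALL_NUMBER = 1e-10` is an assumed tolerance, since the library's own constant is not shown in this listing.

```python
import math

SMALL_NUMBER = 1e-10  # assumed tolerance; the library defines its own constant


def rotated_rect_with_max_area(height, width, angle):
    """Standalone copy of the crop arithmetic, for experimentation only."""
    angle = math.radians(angle)
    width_is_longer = width >= height
    side_long, side_short = (width, height) if width_is_longer else (height, width)
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:
        # Half-constrained case: two crop corners touch the longer side.
        x = 0.5 * side_short
        wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
    else:
        # Fully constrained case: the crop touches all four sides.
        cos_2a = cos_a * cos_a - sin_a * sin_a
        wr = (width * cos_a - height * sin_a) / cos_2a
        hr = (height * cos_a - width * sin_a) / cos_2a
    return {
        "x_min": max(0, int(width / 2 - wr / 2)),
        "x_max": min(width, int(width / 2 + wr / 2)),
        "y_min": max(0, int(height / 2 - hr / 2)),
        "y_max": min(height, int(height / 2 + hr / 2)),
    }


no_rotation = rotated_rect_with_max_area(100, 100, 0)
diagonal = rotated_rect_with_max_area(100, 100, 45)
```

With no rotation the crop covers the whole 100x100 image, while at 45 degrees it shrinks to the square from 14 to 85 on each axis, roughly 71x71 pixels.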
class RotateAndProject
(x_angle_range=(-15, 15), y_angle_range=(-15, 15), z_angle_range=(-15, 15), focal_range=(0.5, 1.5), border_mode=0, fill=0, fill_mask=0, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None)
Applies 3D rotation to an image and projects it back to 2D plane using perspective projection.
This transform simulates viewing a 2D image from different 3D viewpoints by:
1. Rotating the image around three axes (X, Y, Z) in 3D space
2. Applying perspective projection to map the rotated image back to 2D
3. Handling different center calculations for images/keypoints and bounding boxes
The transform preserves aspect ratios and handles all target types (images, masks, keypoints, and bounding boxes) consistently.
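The effect of the focal length can be illustrated with a generic pinhole model. The sketch below is not the library's `get_projection_matrix`; it assumes a camera at distance `focal` in front of the image plane, normalized coordinates with edges at x = -1 and x = +1, and a rotation about the y-axis only. The helper names are hypothetical.

```python
import math


def project_after_y_rotation(x, angle_deg, focal):
    """Rotate the point (x, 0, 0) about the y-axis, then apply a pinhole projection.

    Assumed model (not taken from the library): the camera sits at distance
    `focal` in front of the image plane and looks along the z-axis.
    """
    theta = math.radians(angle_deg)
    x_rot = x * math.cos(theta)
    z_rot = x * math.sin(theta)  # depth gained (or lost) through the rotation
    # Points pushed away from the camera (z_rot > 0) shrink toward the center.
    return focal * x_rot / (focal + z_rot)


def edge_asymmetry(angle_deg, focal):
    """Ratio of the projected distances of the near edge (x=-1) and far edge (x=+1)."""
    far = abs(project_after_y_rotation(1.0, angle_deg, focal))
    near = abs(project_after_y_rotation(-1.0, angle_deg, focal))
    return near / far


strong = edge_asymmetry(15.0, focal=0.7)  # short focal length
weak = edge_asymmetry(15.0, focal=1.5)    # long focal length
```

In this model the same 15 degree tilt makes the near edge look noticeably larger than the far edge at a short focal length, and the two edges become closer in size as the focal length grows.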
Parameters:
Name | Type | Description |
---|---|---|
x_angle_range | tuple[float, float] | Range for rotation around x-axis in degrees. Positive angles rotate the top edge away from viewer. Default: (-15, 15) |
y_angle_range | tuple[float, float] | Range for rotation around y-axis in degrees. Positive angles rotate the right edge away from viewer. Default: (-15, 15) |
z_angle_range | tuple[float, float] | Range for rotation around z-axis in degrees. Positive angles rotate clockwise in image plane. Default: (-15, 15) |
focal_range | tuple[float, float] | Range for focal length of perspective projection. Controls the strength of the perspective effect: values < 1.0 give strong perspective (wide-angle lens effect), 1.0 gives normal perspective, and values > 1.0 give weak perspective (telephoto lens effect). Default: (0.5, 1.5) |
border_mode | OpenCV flag | Padding mode for borders after rotation. Should be one of: cv2.BORDER_CONSTANT (pads with a constant value), cv2.BORDER_REFLECT (reflects border pixels), cv2.BORDER_REFLECT_101 (reflects border pixels without duplicating edge pixels), cv2.BORDER_REPLICATE (replicates border pixels). Default: cv2.BORDER_CONSTANT |
fill | ColorType | Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0 |
fill_mask | ColorType | Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0 |
interpolation | OpenCV flag | Interpolation method for image transformation. Should be one of: cv2.INTER_NEAREST (nearest-neighbor interpolation), cv2.INTER_LINEAR (bilinear interpolation), cv2.INTER_CUBIC (bicubic interpolation). Default: cv2.INTER_LINEAR |
mask_interpolation | OpenCV flag | Interpolation method for mask transformation. Default: cv2.INTER_NEAREST |
p | float | Probability of applying the transform. Default: 0.5 |
Targets
image, mask, keypoints, bboxes
Image types: uint8, float32
Note
- The transform maintains original image size
- Uses different center calculations: images and keypoints are transformed about (width - 1) / 2, bounding boxes about width / 2
- Handles all coordinate transformations in homogeneous coordinates
- Applies proper perspective transformation to bounding boxes by transforming corners
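The two center conventions mentioned in the notes can be written out as follows. This is a sketch with hypothetical helper names; applying the same rule to the height and returning the result in (x, y) order are assumptions, since the note only spells out the width.

```python
def pixel_center(image_shape):
    """Center for images and keypoints: ((width - 1) / 2, (height - 1) / 2)."""
    height, width = image_shape[:2]
    return (width - 1) / 2, (height - 1) / 2


def bbox_center(image_shape):
    """Center for bounding boxes: (width / 2, height / 2)."""
    height, width = image_shape[:2]
    return width / 2, height / 2


shape = (100, 200)  # (height, width)
img_cx, img_cy = pixel_center(shape)
box_cx, box_cy = bbox_center(shape)
```

The half-pixel gap between the two is consistent with pixel indices running from 0 to width - 1 while box edges can span the full range from 0 to width.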
Examples:
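>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> bboxes = [(10, 10, 50, 50)]  # pascal_voc: (x_min, y_min, x_max, y_max)
>>> labels = [1]
>>> keypoints = [(30, 30)]  # (x, y)
>>> transform = A.Compose(
...     [A.RotateAndProject(
...         x_angle_range=(-30, 30),
...         y_angle_range=(-30, 30),
...         z_angle_range=(-15, 15),
...         focal_range=(0.7, 1.3),
...         p=1.0,
...     )],
...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']),
...     keypoint_params=A.KeypointParams(format='xy'),
... )
>>> result = transform(image=image, bboxes=bboxes, labels=labels, keypoints=keypoints)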
Source code in albumentations/augmentations/geometric/rotate.py
class RotateAndProject(Perspective):
    """Applies 3D rotation to an image and projects it back to 2D plane using perspective projection.

    This transform simulates viewing a 2D image from different 3D viewpoints by:
    1. Rotating the image around three axes (X, Y, Z) in 3D space
    2. Applying perspective projection to map the rotated image back to 2D
    3. Handling different center calculations for images/keypoints and bounding boxes

    The transform preserves aspect ratios and handles all target types (images, masks,
    keypoints, and bounding boxes) consistently.

    Args:
        x_angle_range (tuple[float, float]): Range for rotation around x-axis in degrees.
            Positive angles rotate the top edge away from viewer.
            Default: (-15, 15)
        y_angle_range (tuple[float, float]): Range for rotation around y-axis in degrees.
            Positive angles rotate the right edge away from viewer.
            Default: (-15, 15)
        z_angle_range (tuple[float, float]): Range for rotation around z-axis in degrees.
            Positive angles rotate clockwise in image plane.
            Default: (-15, 15)
        focal_range (tuple[float, float]): Range for focal length of perspective projection.
            Controls the strength of perspective effect:
            - Values < 1.0: Strong perspective (wide-angle lens effect)
            - Value = 1.0: Normal perspective
            - Values > 1.0: Weak perspective (telephoto lens effect)
            Default: (0.5, 1.5)
        border_mode (OpenCV flag): Padding mode for borders after rotation.
            Should be one of:
            - cv2.BORDER_CONSTANT: pads with constant value
            - cv2.BORDER_REFLECT: reflects border pixels
            - cv2.BORDER_REFLECT_101: reflects border pixels without duplicating edge pixels
            - cv2.BORDER_REPLICATE: replicates border pixels
            Default: cv2.BORDER_CONSTANT
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
            Default: 0
        fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT.
            Default: 0
        interpolation (OpenCV flag): Interpolation method for image transformation.
            Should be one of:
            - cv2.INTER_NEAREST: nearest-neighbor interpolation
            - cv2.INTER_LINEAR: bilinear interpolation
            - cv2.INTER_CUBIC: bicubic interpolation
            Default: cv2.INTER_LINEAR
        mask_interpolation (OpenCV flag): Interpolation method for mask transformation.
            Default: cv2.INTER_NEAREST
        p (float): Probability of applying the transform.
            Default: 0.5

    Targets:
        image, mask, keypoints, bboxes

    Image types:
        uint8, float32

    Note:
        - The transform maintains original image size
        - Uses different center calculations: images and keypoints are transformed about (width - 1) / 2,
          bounding boxes about width / 2
        - Handles all coordinate transformations in homogeneous coordinates
        - Applies proper perspective transformation to bounding boxes by transforming corners

    Example:
        >>> import numpy as np
        >>> import albumentations as A
        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
        >>> bboxes = [(10, 10, 50, 50)]  # pascal_voc: (x_min, y_min, x_max, y_max)
        >>> labels = [1]
        >>> keypoints = [(30, 30)]  # (x, y)
        >>> transform = A.Compose(
        ...     [A.RotateAndProject(
        ...         x_angle_range=(-30, 30),
        ...         y_angle_range=(-30, 30),
        ...         z_angle_range=(-15, 15),
        ...         focal_range=(0.7, 1.3),
        ...         p=1.0,
        ...     )],
        ...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']),
        ...     keypoint_params=A.KeypointParams(format='xy'),
        ... )
        >>> result = transform(image=image, bboxes=bboxes, labels=labels, keypoints=keypoints)
    """

    class InitSchema(BaseTransformInitSchema):
        x_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        y_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        z_angle_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        focal_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
        mask_interpolation: InterpolationType
        interpolation: InterpolationType
        border_mode: int
        fill: ColorType
        fill_mask: ColorType

    def __init__(
        self,
        x_angle_range: tuple[float, float] = (-15, 15),
        y_angle_range: tuple[float, float] = (-15, 15),
        z_angle_range: tuple[float, float] = (-15, 15),
        focal_range: tuple[float, float] = (0.5, 1.5),
        border_mode: int = cv2.BORDER_CONSTANT,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        interpolation: int = cv2.INTER_LINEAR,
        mask_interpolation: int = cv2.INTER_NEAREST,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(
            scale=(0, 0),  # Unused but required by parent
            keep_size=True,
            border_mode=border_mode,
            fill=fill,
            fill_mask=fill_mask,
            interpolation=interpolation,
            mask_interpolation=mask_interpolation,
            p=p,
        )
        self.x_angle_range = x_angle_range
        self.y_angle_range = y_angle_range
        self.z_angle_range = z_angle_range
        self.focal_range = focal_range
        self.fill = fill
        self.fill_mask = fill_mask
        self.interpolation = interpolation
        self.mask_interpolation = mask_interpolation

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        image_shape = params["shape"][:2]
        height, width = image_shape

        # Sample parameters
        x_angle = np.deg2rad(self.py_random.uniform(*self.x_angle_range))
        y_angle = np.deg2rad(self.py_random.uniform(*self.y_angle_range))
        z_angle = np.deg2rad(self.py_random.uniform(*self.z_angle_range))
        focal_length = self.py_random.uniform(*self.focal_range)

        # Get projection matrix
        matrix = fgeometric.get_projection_matrix(
            image_shape,
            x_angle,
            y_angle,
            z_angle,
            focal_length,
            fgeometric.center(image_shape),
        )
        matrix_bbox = fgeometric.get_projection_matrix(
            image_shape,
            x_angle,
            y_angle,
            z_angle,
            focal_length,
            fgeometric.center_bbox(image_shape),
        )

        return {"matrix": matrix, "max_height": height, "max_width": width, "matrix_bbox": matrix_bbox}

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "x_angle_range",
            "y_angle_range",
            "z_angle_range",
            "focal_range",
            "border_mode",
            "fill",
            "fill_mask",
            "interpolation",
            "mask_interpolation",
        )

    def apply_to_bboxes(
        self,
        bboxes: np.ndarray,
        matrix_bbox: np.ndarray,
        max_height: int,
        max_width: int,
        **params: Any,
    ) -> np.ndarray:
        return fgeometric.perspective_bboxes(bboxes, params["shape"], matrix_bbox, max_width, max_height, True)
class RotateInitSchema
class SafeRotate
(limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None)
Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.
This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.
Parameters:
Name | Type | Description |
---|---|---|
limit | float or tuple[float, float] | Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90) |
interpolation | OpenCV flag | Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR. |
border_mode | OpenCV flag | Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101 |
fill | ColorType | Padding value if border_mode is cv2.BORDER_CONSTANT. |
fill_mask | ColorType | Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks. |
rotate_method | Literal["largest_box", "ellipse"] | Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box' |
mask_interpolation | OpenCV flag | Flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST. |
p | float | Probability of applying the transform. Default: 0.5. |
Targets
image, mask, bboxes, keypoints
Image types: uint8, float32
Note
- The rotation is performed around the center of the image.
- After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
- The output image will always have the same dimensions as the input image.
- Bounding boxes and keypoints are transformed along with the image.
Mathematical Details:
1. An angle θ is randomly sampled from the range specified by 'limit'.
2. The image is rotated around its center by θ degrees.
3. The rotation matrix R is:
   R = [cos(θ)  -sin(θ)]
       [sin(θ)   cos(θ)]
4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
   s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
           height / (width * |sin(θ)| + height * |cos(θ)|))
5. The combined transformation matrix T is:
   T = [s*cos(θ)  -s*sin(θ)  tx]
       [s*sin(θ)   s*cos(θ)  ty]
   where tx and ty are translation factors to keep the image centered.
6. Each point (x, y) in the image is transformed to (x', y') by:
   [x']   [s*cos(θ)  -s*sin(θ)] [x - cx]   [cx]
   [y'] = [s*sin(θ)   s*cos(θ)] [y - cy] + [cy]
   where (cx, cy) is the center of the image.
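The scaling step can be tried out with plain arithmetic. The sketch below mirrors the size and scale computation in `_create_safe_rotate_matrix` from the source listing; note that it produces one factor per axis rather than the single s of step 4, so the two descriptions only agree approximately. The function name `safe_rotate_scale` is hypothetical.

```python
import math


def safe_rotate_scale(height, width, angle_deg):
    """Return (new_w, new_h, scale_x, scale_y) for a rotation by `angle_deg`."""
    theta = math.radians(angle_deg)
    abs_cos, abs_sin = abs(math.cos(theta)), abs(math.sin(theta))
    # Size of the axis-aligned box that encloses the rotated image.
    new_w = int(height * abs_sin + width * abs_cos)
    new_h = int(height * abs_cos + width * abs_sin)
    # Per-axis factors that shrink that box back into the original frame.
    return new_w, new_h, width / new_w, height / new_h


new_w, new_h, scale_x, scale_y = safe_rotate_scale(100, 100, 45)
```

For a 100x100 image at 45 degrees the enclosing box is 141x141 pixels, so both factors are 100 / 141, about 0.71. For non-square images the two factors differ, which is where the distortion mentioned in the notes comes from.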
Examples:
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.SafeRotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
# scaled to fit within the original 100x100 frame
Source code in albumentations/augmentations/geometric/rotate.py
class SafeRotate(Affine):
    """Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

    This transformation ensures that the entire rotated image fits within the original frame by scaling it
    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the
    rotation and scaling process.

    Args:
        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,
            an angle is picked from (-limit, limit). Default: (-90, 90)
        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:
            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_LINEAR.
        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:
            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.
            Default: cv2.BORDER_REFLECT_101
        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied
            for masks.
        rotate_method (Literal["largest_box", "ellipse"]): Method to rotate bounding boxes.
            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'
        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for masks.
            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.
            Default: cv2.INTER_NEAREST.
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image, mask, bboxes, keypoints

    Image types:
        uint8, float32

    Note:
        - The rotation is performed around the center of the image.
        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
        - The output image will always have the same dimensions as the input image.
        - Bounding boxes and keypoints are transformed along with the image.

    Mathematical Details:
        1. An angle θ is randomly sampled from the range specified by 'limit'.
        2. The image is rotated around its center by θ degrees.
        3. The rotation matrix R is:
           R = [cos(θ)  -sin(θ)]
               [sin(θ)   cos(θ)]
        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
           s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
                   height / (width * |sin(θ)| + height * |cos(θ)|))
        5. The combined transformation matrix T is:
           T = [s*cos(θ)  -s*sin(θ)  tx]
               [s*sin(θ)   s*cos(θ)  ty]
           where tx and ty are translation factors to keep the image centered.
        6. Each point (x, y) in the image is transformed to (x', y') by:
           [x']   [s*cos(θ)  -s*sin(θ)] [x - cx]   [cx]
           [y'] = [s*sin(θ)   s*cos(θ)] [y - cy] + [cy]
           where (cx, cy) is the center of the image.

    Example:
        >>> import numpy as np
        >>> import albumentations as A
        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
        >>> transform = A.SafeRotate(limit=45, p=1.0)
        >>> result = transform(image=image)
        >>> rotated_image = result['image']
        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
        # scaled to fit within the original 100x100 frame
    """

    _targets = (Targets.IMAGE, Targets.MASK, Targets.BBOXES, Targets.KEYPOINTS)

    class InitSchema(RotateInitSchema):
        rotate_method: Literal["largest_box", "ellipse"]

    def __init__(
        self,
        limit: ScaleFloatType = (-90, 90),
        interpolation: int = cv2.INTER_LINEAR,
        border_mode: int = cv2.BORDER_REFLECT_101,
        value: ColorType | None = None,
        mask_value: ColorType | None = None,
        rotate_method: Literal["largest_box", "ellipse"] = "largest_box",
        mask_interpolation: int = cv2.INTER_NEAREST,
        fill: ColorType = 0,
        fill_mask: ColorType = 0,
        p: float = 0.5,
        always_apply: bool | None = None,
    ):
        super().__init__(
            rotate=limit,
            interpolation=interpolation,
            border_mode=border_mode,
            fill=fill,
            fill_mask=fill_mask,
            rotate_method=rotate_method,
            fit_output=True,
            mask_interpolation=mask_interpolation,
            p=p,
        )
        self.limit = cast(tuple[float, float], limit)
        self.interpolation = interpolation
        self.border_mode = border_mode
        self.fill = fill
        self.fill_mask = fill_mask
        self.rotate_method = rotate_method
        self.mask_interpolation = mask_interpolation

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return (
            "limit",
            "interpolation",
            "border_mode",
            "fill",
            "fill_mask",
            "rotate_method",
            "mask_interpolation",
        )

    def _create_safe_rotate_matrix(
        self,
        angle: float,
        center: tuple[float, float],
        image_shape: tuple[int, int],
    ) -> tuple[np.ndarray, dict[str, float]]:
        height, width = image_shape[:2]
        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)

        # Calculate new image size
        abs_cos = abs(rotation_mat[0, 0])
        abs_sin = abs(rotation_mat[0, 1])
        new_w = int(height * abs_sin + width * abs_cos)
        new_h = int(height * abs_cos + width * abs_sin)

        # Adjust the rotation matrix to take into account the new size
        rotation_mat[0, 2] += new_w / 2 - center[0]
        rotation_mat[1, 2] += new_h / 2 - center[1]

        # Calculate scaling factors
        scale_x = width / new_w
        scale_y = height / new_h

        # Create scaling matrix
        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])

        # Combine rotation and scaling
        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])

        return matrix, {"x": scale_x, "y": scale_y}

    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
        image_shape = params["shape"][:2]
        angle = self.py_random.uniform(*self.limit)

        # Calculate centers for image and bbox
        image_center = fgeometric.center(image_shape)
        bbox_center = fgeometric.center_bbox(image_shape)

        # Create matrices for image and bbox
        matrix, scale = self._create_safe_rotate_matrix(angle, image_center, image_shape)
        bbox_matrix, _ = self._create_safe_rotate_matrix(angle, bbox_center, image_shape)

        return {
            "rotate": angle,
            "scale": scale,
            "matrix": matrix,
            "bbox_matrix": bbox_matrix,
            "output_shape": image_shape,
        }