albumentations.augmentations.geometric.flip
Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.
Members
- class D4
- class HorizontalFlip
- class SquareSymmetry
- class Transpose
- class VerticalFlip
D4 class
D4(
p: float = 1,
group_element: 'e' | 'r90' | 'r180' | 'r270' | 'v' | 'hvt' | 'h' | 't' | None
)
Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.
The D4 group transformations include:
- 'e' (identity): No transformation is applied.
- 'r90' (rotation by 90 degrees counterclockwise)
- 'r180' (rotation by 180 degrees)
- 'r270' (rotation by 270 degrees counterclockwise)
- 'v' (reflection across the vertical midline)
- 'hvt' (reflection across the anti-diagonal)
- 'h' (reflection across the horizontal midline)
- 't' (reflection across the main diagonal)
Even if the probability (`p`) of applying the transform is set to 1, the identity transformation 'e' may still occur, which means the input will remain unchanged in one out of eight cases.
When `group_element` is specified, the transform is deterministic. This is useful for TTA (Test Time Augmentation), where you need to apply each of the 8 symmetries explicitly and invert predictions. Call `inverse()` on a deterministic instance to get a new transform that undoes the operation.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| p | float | 1 | Probability of applying the transform. Default: 1.0. |
| group_element | One of: 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't', None | - | If set, always apply this specific D4 group element instead of sampling randomly. Use for TTA. Default: None (random choice). |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.D4(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
>>> # The resulting image will be one of the 8 possible D4 transformations of the input
>>> # TTA: apply each D4 symmetry, run inference, then undo the transform on the prediction
>>> from albumentations.core.type_definitions import d4_group_elements
>>> predictions = []
>>> for element in d4_group_elements:
...     aug = A.D4(p=1.0, group_element=element)
...     aug_image = aug(image=image)["image"]
...     pred_mask = np.zeros((100, 100, 1), dtype=np.uint8)  # placeholder for model output
...     restored = aug.inverse()(image=pred_mask)["image"]
...     predictions.append(restored)
Notes
- This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
- The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
- When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
- This transform preserves the aspect ratio and size of the input.
- `inverse()` requires `group_element` to be set explicitly; it raises `ValueError` otherwise.
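The eight symmetries of a square can be reproduced with plain NumPy, which may help when reasoning about what each element does and how it is undone. This is a minimal sketch, not the library's implementation: the dictionary keys are hypothetical descriptive names and deliberately make no claim about which `group_element` code maps to which NumPy operation.

```python
import numpy as np

# Hypothetical names for illustration; not the library's `group_element` codes.
D4_OPS = {
    "identity": lambda a: a,
    "rot90": lambda a: np.rot90(a, 1),    # 90 degrees counterclockwise
    "rot180": lambda a: np.rot90(a, 2),
    "rot270": lambda a: np.rot90(a, 3),   # 270 degrees counterclockwise
    "flip_ud": lambda a: a[::-1],         # reverse the row order
    "flip_lr": lambda a: a[:, ::-1],      # reverse the column order
    "transpose": lambda a: np.swapaxes(a, 0, 1),                     # main diagonal
    "anti_transpose": lambda a: np.swapaxes(np.rot90(a, 2), 0, 1),   # anti-diagonal
}

# Every reflection and the 180-degree rotation undo themselves;
# the two quarter turns undo each other.
D4_INVERSE = {name: name for name in D4_OPS}
D4_INVERSE["rot90"] = "rot270"
D4_INVERSE["rot270"] = "rot90"

image = np.arange(16).reshape(4, 4)
views = {name: op(image) for name, op in D4_OPS.items()}

# The eight results are pairwise different for an image with distinct pixels.
distinct = {view.tobytes() for view in views.values()}

# Applying the paired inverse recovers the original array.
restored_ok = all(
    np.array_equal(D4_OPS[D4_INVERSE[name]](views[name]), image)
    for name in D4_OPS
)
```

Only the two quarter turns need a different element as their inverse, which is why a deterministic `group_element` is required before an inverse can be defined.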
HorizontalFlip class
HorizontalFlip(
p: float = 0.5
)
Flip the input horizontally around the y-axis.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Prepare sample data
>>> image = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
>>> mask = np.array([[1, 0], [0, 1]])
>>> bboxes = np.array([[0.1, 0.5, 0.3, 0.9]]) # [x_min, y_min, x_max, y_max] format
>>> keypoints = np.array([[0.1, 0.5], [0.9, 0.5]]) # [x, y] format
>>>
>>> # Create a transform with horizontal flip
>>> transform = A.Compose([
...     A.HorizontalFlip(p=1.0)  # Always apply for this example
... ], bbox_params=A.BboxParams(coord_format='albumentations', label_fields=[]),  # normalized [x_min, y_min, x_max, y_max]
...    keypoint_params=A.KeypointParams(coord_format='normalized'))
>>>
>>> # Apply the transform
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>>
>>> # Get the transformed data
>>> flipped_image = transformed["image"] # Image flipped horizontally
>>> flipped_mask = transformed["mask"] # Mask flipped horizontally
>>> flipped_bboxes = transformed["bboxes"] # BBox coordinates adjusted for horizontal flip
>>> flipped_keypoints = transformed["keypoints"] # Keypoint x-coordinates flipped
>>> # TTA: flip, run inference, unflip the predicted mask
>>> aug = A.HorizontalFlip(p=1)
>>> aug_image = aug(image=image)["image"]
>>> pred_mask = np.zeros_like(image[..., :1]) # placeholder for model output
>>> restored_mask = aug.inverse()(image=pred_mask)["image"]
Notes
- This transform is self-inverse: applying it twice returns the original image. Call `inverse()` to get a new instance that undoes the flip (identical to applying the flip again), useful for TTA pipelines.
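The self-inverse property can be checked with plain NumPy. The sketch below is an illustration under stated assumptions rather than the library's code: `hflip_image` and `hflip_bbox` are hypothetical helpers, and the box is assumed to be in normalized `(x_min, y_min, x_max, y_max)` form.

```python
import numpy as np

def hflip_image(img):
    """Reverse the column (width) axis of an H x W or H x W x C array."""
    return img[:, ::-1]

def hflip_bbox(bbox):
    """Mirror a normalized (x_min, y_min, x_max, y_max) box about x = 0.5."""
    x_min, y_min, x_max, y_max = bbox
    # The old right edge becomes the new left edge, and vice versa.
    return (1.0 - x_max, y_min, 1.0 - x_min, y_max)

image = np.arange(12).reshape(2, 2, 3)
flipped = hflip_image(image)
twice = hflip_image(flipped)            # flipping again restores the input

box = (0.1, 0.5, 0.3, 0.9)
flipped_box = hflip_bbox(box)           # x-extent moves to the other side
restored_box = hflip_bbox(flipped_box)  # second flip restores the box
```

Only the x-coordinates change; the y-extent of the box is untouched by a horizontal flip.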
SquareSymmetry class
SquareSymmetry(
p: float = 1,
group_element: 'e' | 'r90' | 'r180' | 'r270' | 'v' | 'hvt' | 'h' | 't' | None
)
Applies one of the eight possible square symmetry transformations to a square-shaped input. This is an alias for the D4 transform, with a more intuitive name for those not familiar with group theory.
The square symmetry transformations include:
- Identity: No transformation is applied
- 90° rotation: Rotate 90 degrees counterclockwise
- 180° rotation: Rotate 180 degrees
- 270° rotation: Rotate 270 degrees counterclockwise
- Vertical flip: Mirror across vertical axis
- Anti-diagonal flip: Mirror across anti-diagonal
- Horizontal flip: Mirror across horizontal axis
- Main diagonal flip: Mirror across main diagonal
When `group_element` is specified, the transform is deterministic. This is useful for TTA (Test Time Augmentation), where you need to apply each of the 8 symmetries explicitly and invert predictions. Call `inverse()` on a deterministic instance to get a new transform that undoes the operation.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| p | float | 1 | Probability of applying the transform. Default: 1.0. |
| group_element | One of: 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't', None | - | If set, always apply this specific D4 group element instead of sampling randomly. Use for TTA. Default: None (random choice). |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.SquareSymmetry(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
>>> # The resulting image will be one of the 8 possible square symmetry transformations of the input
>>> # TTA: apply each symmetry, run inference, then undo the transform on the prediction
>>> from albumentations.core.type_definitions import d4_group_elements
>>> predictions = []
>>> for element in d4_group_elements:
...     aug = A.SquareSymmetry(p=1.0, group_element=element)
...     aug_image = aug(image=image)["image"]
...     pred_mask = np.zeros((100, 100, 1), dtype=np.uint8)  # placeholder for model output
...     restored = aug.inverse()(image=pred_mask)["image"]
...     predictions.append(restored)
Notes
- This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
- The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
- When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
- This transform preserves the aspect ratio and size of the input.
- `inverse()` requires `group_element` to be set explicitly; it raises `ValueError` otherwise.
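Once each prediction has been mapped back to the original orientation, TTA typically combines them, for example by averaging. The following NumPy-only sketch shows that aggregation step under stated assumptions: `fake_model` is a hypothetical stand-in for a real segmentation model, and the forward/inverse pairs are written by hand rather than taken from the library.

```python
import numpy as np

def fake_model(img):
    """Hypothetical per-pixel 'model': thresholds the first channel."""
    return (img[..., :1] > 127).astype(np.float32)

def transpose(a):
    return np.swapaxes(a, 0, 1)

def anti_transpose(a):
    return np.swapaxes(np.rot90(a, 2), 0, 1)

# (forward, inverse) pairs covering the eight symmetries of a square.
PAIRS = [
    (lambda a: a, lambda a: a),
    (lambda a: np.rot90(a, 1), lambda a: np.rot90(a, -1)),
    (lambda a: np.rot90(a, 2), lambda a: np.rot90(a, -2)),
    (lambda a: np.rot90(a, 3), lambda a: np.rot90(a, -3)),
    (np.flipud, np.flipud),
    (np.fliplr, np.fliplr),
    (transpose, transpose),
    (anti_transpose, anti_transpose),
]

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (8, 8, 3), dtype=np.uint8)

# Augment, predict, then map each prediction back to the input orientation.
restored = [inverse(fake_model(forward(image))) for forward, inverse in PAIRS]

# Average the aligned predictions into a single mask.
tta_mask = np.mean(restored, axis=0)
```

Because `fake_model` works pixel by pixel, every restored prediction is identical here and the average equals the plain prediction; a real model is usually not perfectly symmetric, which is where averaging helps.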
Transpose class
Transpose(
p: float = 0.5
)
Transpose the input by swapping its rows and columns. This transform flips the image over its main diagonal, so the output width equals the input height and vice versa. It is equivalent to a 90-degree clockwise rotation followed by a horizontal flip.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
...     [[1, 2, 3], [4, 5, 6]],
...     [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.Transpose(p=1.0)
>>> result = transform(image=image)
>>> transposed_image = result['image']
>>> print(transposed_image)
[[[ 1  2  3]
  [ 7  8  9]]
 [[ 4  5  6]
  [10 11 12]]]
>>> # The original 2x2x3 image is now 2x2x3, with rows and columns swapped
>>> # TTA: transpose, run inference, un-transpose the predicted mask
>>> aug = A.Transpose(p=1)
>>> aug_image = aug(image=image)["image"]
>>> pred_mask = np.zeros((aug_image.shape[0], aug_image.shape[1], 1), dtype=np.uint8)
>>> restored_mask = aug.inverse()(image=pred_mask)["image"]
Notes
- The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
- This transform is self-inverse: applying it twice returns the original image. Call `inverse()` to get a new instance that undoes the transpose (identical to applying it again), useful for TTA pipelines.
- For multi-channel images (like RGB), the channels are preserved in their original order.
- Bounding boxes will have their coordinates adjusted to match the new image dimensions.
- Keypoints will have their x and y coordinates swapped.
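The shape swap, the coordinate swap, and the rotation-plus-flip equivalence can each be checked on a small non-square array. This is a NumPy-only sketch for illustration, not the library's implementation; the `(x, y)` pair is treated as a plain pixel index.

```python
import numpy as np

# Height 2, width 3, 3 channels, so the shape change is visible.
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Transpose: swap the row and column axes, leave channels alone.
transposed = np.swapaxes(image, 0, 1)   # shape (3, 2, 3)

# Same result via a 90-degree clockwise rotation followed by a horizontal flip.
via_rotation = np.fliplr(np.rot90(image, k=-1))

# A pixel at column x, row y ends up at column y, row x.
x, y = 2, 1
same_pixel = np.array_equal(image[y, x], transposed[x, y])

# Transposing again restores the original array.
round_trip = np.swapaxes(transposed, 0, 1)
```

The channel axis is never touched, which matches the note that channel order is preserved.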
VerticalFlip class
VerticalFlip(
p: float = 0.5
)
Flip the input vertically around the x-axis.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
...     [[1, 2, 3], [4, 5, 6]],
...     [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.VerticalFlip(p=1.0)
>>> result = transform(image=image)
>>> flipped_image = result['image']
>>> print(flipped_image)
[[[ 7  8  9]
  [10 11 12]]
 [[ 1  2  3]
  [ 4  5  6]]]
>>> # The original image is flipped vertically, with rows reversed
>>> # TTA: flip, run inference, unflip the predicted mask
>>> aug = A.VerticalFlip(p=1)
>>> aug_image = aug(image=image)["image"]
>>> pred_mask = np.zeros_like(image[..., :1]) # placeholder for model output
>>> restored_mask = aug.inverse()(image=pred_mask)["image"]
Notes
- This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
- The dimensions of the image remain unchanged.
- For multi-channel images (like RGB), each channel is flipped independently.
- Bounding boxes are adjusted to match their new positions in the flipped image.
- Keypoints are moved to their new positions in the flipped image.
- This transform is self-inverse: applying it twice returns the original image. Call `inverse()` to get a new instance that undoes the flip (which is identical to applying the flip again), useful for TTA pipelines.
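As a rough illustration of the notes above, a vertical flip reverses the row order, so a pixel in row `y` of an image with `height` rows lands in row `height - 1 - y`. The helpers below are hypothetical and use integer pixel indices; the library's own keypoint conventions may differ.

```python
import numpy as np

def vflip_image(img):
    """Reverse the row (height) axis of an H x W or H x W x C array."""
    return img[::-1]

def vflip_pixel(x, y, height):
    """Map a pixel index (x, y) to its position after a vertical flip."""
    return x, height - 1 - y

image = np.arange(1, 13).reshape(2, 2, 3)  # same values as the example above
flipped = vflip_image(image)

# Track one pixel through the flip.
x, y = 1, 0
new_x, new_y = vflip_pixel(x, y, image.shape[0])

# Flipping a second time restores the original array.
round_trip = vflip_image(flipped)
```

The column index is unchanged, and the image shape stays the same.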