albumentations.augmentations.geometric.flip
Geometric transformations for flip and symmetry operations. This module contains transforms that apply various flip and symmetry operations to images and other target types. These transforms modify the geometric arrangement of the input data while preserving the pixel values themselves.

Available transforms:
- VerticalFlip: Flips the input upside down (around the x-axis)
- HorizontalFlip: Flips the input left to right (around the y-axis)
- Transpose: Swaps rows and columns (flips around the main diagonal)
- D4: Applies one of eight possible square symmetry transformations (dihedral group D4)
- SquareSymmetry: Alias for D4 with a more intuitive name

These transforms are particularly useful for:
- Data augmentation to improve model generalization
- Addressing orientation biases in training data
- Working with data that doesn't have a natural orientation (e.g., satellite imagery)
- Exploiting symmetries in the problem domain

All transforms support various target types including images, masks, bounding boxes, keypoints, volumes, and 3D masks, ensuring consistent transformation across different data modalities.
Members
- class D4
- class HorizontalFlip
- class SquareSymmetry
- class Transpose
- class VerticalFlip
D4 class
Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.

The D4 group transformations include:
- 'e' (identity): No transformation is applied.
- 'r90' (rotation by 90 degrees counterclockwise)
- 'r180' (rotation by 180 degrees)
- 'r270' (rotation by 270 degrees counterclockwise)
- 'v' (reflection across the vertical midline)
- 'hvt' (reflection across the anti-diagonal)
- 'h' (reflection across the horizontal midline)
- 't' (reflection across the main diagonal)

Even if the probability (`p`) of applying the transform is set to 1, the identity transformation 'e' may still occur, which means the input will remain unchanged in one out of eight cases.
Parameters
Name | Type | Default | Description |
---|---|---|---|
p | float | 1 | Probability of applying the transform. Default: 1.0. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
... A.D4(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
# The resulting image will be one of the 8 possible D4 transformations of the input
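The eight possible outcomes can also be enumerated directly with NumPy, which makes it easier to see what the transform chooses from. This is a minimal sketch, not the library's implementation: the helper name `square_symmetries` and the dictionary labels are illustrative and are not meant to match the group-element names listed above.

```python
import numpy as np


def square_symmetries(image: np.ndarray) -> dict[str, np.ndarray]:
    """Return all eight symmetries of a square image (hypothetical helper).

    Four are rotations; the other four are the same rotations applied to
    the transposed image, which yields the four reflections.
    """
    transposed = np.swapaxes(image, 0, 1)  # reflection across the main diagonal
    results = {}
    for k in range(4):
        # np.rot90 rotates counterclockwise in the (row, column) plane
        results[f"rot{90 * k}"] = np.rot90(image, k)
        results[f"transpose_then_rot{90 * k}"] = np.rot90(transposed, k)
    return results


image = np.arange(9, dtype=np.uint8).reshape(3, 3)
symmetries = square_symmetries(image)

# All eight results are distinct for an image with no symmetry of its own
distinct = {arr.tobytes() for arr in symmetries.values()}
print(len(symmetries), len(distinct))  # prints: 8 8
```

Every result has the same shape and the same pixel values as the input; only their arrangement changes.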
Notes
- This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
- The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
- When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
- This transform preserves the aspect ratio and size of the input.
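The recommendation to use square inputs follows from how rotations and transposes act on array shapes. The NumPy sketch below is independent of the library's internals and only illustrates the geometry: for a non-square array, the 90-degree rotation and the transpose swap height and width, while the 180-degree rotation and the flip do not.

```python
import numpy as np

square = np.zeros((4, 4, 3), dtype=np.uint8)
rectangle = np.zeros((2, 5, 3), dtype=np.uint8)

shapes_by_input = {}
for name, img in [("square", square), ("rectangle", rectangle)]:
    shapes_by_input[name] = {
        "rot90": np.rot90(img, 1).shape,            # quarter turn
        "rot180": np.rot90(img, 2).shape,           # half turn
        "transpose": np.swapaxes(img, 0, 1).shape,  # rows <-> columns
        "flip_lr": img[:, ::-1].shape,              # left-right mirror
    }
    print(name, img.shape, shapes_by_input[name])
```

With a square input all four shapes are identical, so a batch of augmented samples keeps a single, consistent size.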
HorizontalFlip class
Flip the input horizontally around the y-axis.
Parameters
Name | Type | Default | Description |
---|---|---|---|
p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Prepare sample data
>>> image = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], dtype=np.uint8)
>>> mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
>>> bboxes = np.array([[0.1, 0.5, 0.3, 0.9]]) # normalized [x_min, y_min, x_max, y_max]
>>> keypoints = np.array([[0.1, 0.5], [0.9, 0.5]]) # [x, y] in pixels
>>>
>>> # Create a transform with horizontal flip
>>> # 'albumentations' is the normalized [x_min, y_min, x_max, y_max] bbox format
>>> transform = A.Compose([
...     A.HorizontalFlip(p=1.0) # Always apply for this example
... ], bbox_params=A.BboxParams(format='albumentations', label_fields=[]),
...    keypoint_params=A.KeypointParams(format='xy'))
>>>
>>> # Apply the transform
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>>
>>> # Get the transformed data
>>> flipped_image = transformed["image"] # Image flipped horizontally
>>> flipped_mask = transformed["mask"] # Mask flipped horizontally
>>> flipped_bboxes = transformed["bboxes"] # BBox coordinates adjusted for horizontal flip
>>> flipped_keypoints = transformed["keypoints"] # Keypoint x-coordinates flipped
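For a box given as normalized [x_min, y_min, x_max, y_max], the adjustment under a horizontal flip reduces to mirroring the two x-coordinates around the image's vertical center line. The following is a small pure-Python sketch of that arithmetic; `hflip_bbox` is a hypothetical helper written for illustration, not a function from the library.

```python
def hflip_bbox(bbox):
    """Mirror a normalized (x_min, y_min, x_max, y_max) box left to right.

    Hypothetical helper for illustration; coordinates are fractions of the
    image width and height, so the result does not depend on image size.
    """
    x_min, y_min, x_max, y_max = bbox
    # The old right edge becomes the new left edge, and vice versa
    return (1.0 - x_max, y_min, 1.0 - x_min, y_max)


box = (0.1, 0.5, 0.3, 0.9)
flipped = hflip_bbox(box)
print(flipped)

# Flipping twice restores the original box (up to floating-point rounding)
restored = hflip_bbox(flipped)
```

The y-coordinates are untouched, and swapping the roles of `x_min` and `x_max` keeps the box well-formed (`x_min <= x_max`) after the flip.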
SquareSymmetry class
Applies one of the eight possible square symmetry transformations to a square-shaped input. This is an alias for the D4 transform with a more intuitive name for those not familiar with group theory.

The square symmetry transformations include:
- Identity: No transformation is applied
- 90° rotation: Rotate 90 degrees counterclockwise
- 180° rotation: Rotate 180 degrees
- 270° rotation: Rotate 270 degrees counterclockwise
- Vertical flip: Mirror across vertical axis
- Anti-diagonal flip: Mirror across anti-diagonal
- Horizontal flip: Mirror across horizontal axis
- Main diagonal flip: Mirror across main diagonal
Parameters
Name | Type | Default | Description |
---|---|---|---|
p | float | 1 | Probability of applying the transform. Default: 1.0. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
... A.SquareSymmetry(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
# The resulting image will be one of the 8 possible square symmetry transformations of the input
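Because the identity is one of the eight candidates, a pipeline with `p=1.0` still returns an unchanged image some of the time. The simulation below is a rough sketch of that effect. It assumes the eight elements are drawn uniformly, which is what the "one out of eight cases" remark in the D4 description suggests, and it samples labels only rather than calling the library.

```python
import random
from collections import Counter

# Labels for the eight group elements, as listed in the D4 description
GROUP_ELEMENTS = ["e", "r90", "r180", "r270", "v", "hvt", "h", "t"]

rng = random.Random(0)  # fixed seed so the run is repeatable
num_samples = 80_000

# Assumption: each element is equally likely to be chosen
counts = Counter(rng.choice(GROUP_ELEMENTS) for _ in range(num_samples))

identity_fraction = counts["e"] / num_samples
print(f"identity drawn in {identity_fraction:.3f} of samples (expected about 0.125)")
```

Under this assumption, roughly 12.5% of the augmented samples equal the input even though the transform is always applied.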
Notes
- This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
- The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
- When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
- This transform preserves the aspect ratio and size of the input.
Transpose class
Transpose the input by swapping its rows and columns. This transform flips the image over its main diagonal, effectively switching its width and height. It is equivalent to a 90-degree clockwise rotation followed by a horizontal flip.
Parameters
Name | Type | Default | Description |
---|---|---|---|
p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
... [[1, 2, 3], [4, 5, 6]],
... [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.Transpose(p=1.0)
>>> result = transform(image=image)
>>> transposed_image = result['image']
>>> print(transposed_image)
[[[ 1  2  3]
  [ 7  8  9]]
 [[ 4  5  6]
  [10 11 12]]]
# The shape is still 2x2x3 because the input is square; rows and columns are swapped
Notes
- The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
- This transform is its own inverse. Applying it twice will return the original input.
- For multi-channel images (like RGB), the channels are preserved in their original order.
- Bounding boxes will have their coordinates adjusted to match the new image dimensions.
- Keypoints will have their x and y coordinates swapped.
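The shape swap, the self-inverse property, and the relation to a rotation plus a flip can all be checked with plain NumPy. This is a sketch of the underlying array operations, not the library's code path; the rotation is taken to be clockwise here, since that is the direction for which the identity holds with a horizontal flip.

```python
import numpy as np

image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)  # H=2, W=3, RGB

# Swap the row and column axes; the channel axis stays last
transposed = np.transpose(image, (1, 0, 2))
print(image.shape, "->", transposed.shape)  # (2, 3, 3) -> (3, 2, 3)

# Transposing twice returns the original image
assert np.array_equal(np.transpose(transposed, (1, 0, 2)), image)

# Same result as rotating 90 degrees clockwise, then flipping left to right
rotated_cw = np.rot90(image, k=-1)  # negative k rotates clockwise
assert np.array_equal(rotated_cw[:, ::-1], transposed)
```

A counterclockwise rotation would need a vertical flip instead to produce the same result.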
VerticalFlip class
Flip the input vertically around the x-axis.
Parameters
Name | Type | Default | Description |
---|---|---|---|
p | float | 0.5 | Probability of applying the transform. Default: 0.5. |
Examples
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
... [[1, 2, 3], [4, 5, 6]],
... [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.VerticalFlip(p=1.0)
>>> result = transform(image=image)
>>> flipped_image = result['image']
>>> print(flipped_image)
[[[ 7  8  9]
  [10 11 12]]
 [[ 1  2  3]
  [ 4  5  6]]]
# The original image is flipped vertically, with rows reversed
Notes
- This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
- The dimensions of the image remain unchanged.
- For multi-channel images (like RGB), each channel is flipped independently.
- Bounding boxes are adjusted to match their new positions in the flipped image.
- Keypoints are moved to their new positions in the flipped image.
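To see where a point ends up after a vertical flip, it helps to track a single marked pixel in a mask. The sketch below uses NumPy only and treats the point as an integer pixel index; how the library handles sub-pixel keypoint coordinates may differ, so take the row formula as an illustration of the geometry rather than a statement about its internals.

```python
import numpy as np

height, width = 4, 6
mask = np.zeros((height, width), dtype=np.uint8)

row, col = 1, 2          # location of a single marked pixel
mask[row, col] = 1

flipped = mask[::-1, :]  # reverse the row order, same as np.flipud(mask)

# The marked pixel keeps its column and moves to the mirrored row
new_row, new_col = (int(i) for i in np.argwhere(flipped == 1)[0])
print((row, col), "->", (new_row, new_col))  # (1, 2) -> (2, 2)
```

In index terms the pixel at row `r` lands at row `height - 1 - r`, while the array shape stays the same.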