albumentations.augmentations.mixing.overlay
OverlayElements mixing transform.
Members
- class OverlayElements
class OverlayElements
OverlayElements(
    metadata_key: str = "overlay_metadata",
    p: float = 0.5
)

Apply overlay images and masks (e.g. stickers, logos) onto an input image. Each overlay can carry an optional bounding box for placement and an optional mask. The overlays are passed through the additional target named by `metadata_key`.

Args:
    metadata_key (str): Additional target key for metadata. Default: `overlay_metadata`.
    p (float): Probability of applying the transformation. Default: 0.5.

Possible Metadata Fields:
    - image (ImageType): The overlay image to be applied. This is a required field.
    - bbox (list[float]): The bounding box specifying the region where the overlay should be applied. It should contain four floats: [x_min, y_min, x_max, y_max]. If `label_id` is provided, it should be appended as the fifth element in the bbox. The bbox should be in Albumentations format, that is, normalized Pascal VOC format: [x_min / width, y_min / height, x_max / width, y_max / height]. If no bbox is given, the overlay is placed at a random position.
    - mask (np.ndarray): An optional mask that defines the non-rectangular region of the overlay image. If not provided, the entire overlay image is used.
    - mask_id (int): An optional identifier for the mask. If provided, the regions specified by the mask will be labeled with this identifier in the output mask.

Targets:
    image, mask

Image types:
    uint8, float32

References:
    doc-augmentation: https://github.com/danaaubakirova/doc-augmentation

Examples:
    >>> import numpy as np
    >>> import albumentations as A
    >>> import cv2
    >>>
    >>> # Prepare primary data (base image and mask)
    >>> image = np.zeros((300, 300, 3), dtype=np.uint8)
    >>> mask = np.zeros((300, 300), dtype=np.uint8)
    >>>
    >>> # 1. Create a simple overlay image (a red square)
    >>> overlay_image1 = np.zeros((50, 50, 3), dtype=np.uint8)
    >>> overlay_image1[:, :, 0] = 255  # Red color
    >>>
    >>> # 2. Create another overlay with a mask (a blue circle)
    >>> overlay_image2 = np.zeros((80, 80, 3), dtype=np.uint8)
    >>> overlay_image2[:, :, 2] = 255  # Blue color
    >>> overlay_mask2 = np.zeros((80, 80), dtype=np.uint8)
    >>> # Create a circular mask
    >>> center = (40, 40)
    >>> radius = 30
    >>> for i in range(80):
    ...     for j in range(80):
    ...         if (i - center[0])**2 + (j - center[1])**2 < radius**2:
    ...             overlay_mask2[i, j] = 255
    >>>
    >>> # 3. Create an overlay with both bbox and mask_id
    >>> overlay_image3 = np.zeros((60, 120, 3), dtype=np.uint8)
    >>> overlay_image3[:, :, 1] = 255  # Green color
    >>> # Create a rectangular mask inset from the overlay edges
    >>> overlay_mask3 = np.zeros((60, 120), dtype=np.uint8)
    >>> cv2.rectangle(overlay_mask3, (10, 10), (110, 50), 255, -1)
    >>>
    >>> # Create the metadata list - each item is a dictionary with overlay information
    >>> overlay_metadata = [
    ...     {
    ...         'image': overlay_image1,
    ...         # No bbox provided - will be placed randomly
    ...     },
    ...     {
    ...         'image': overlay_image2,
    ...         'bbox': [0.6, 0.1, 0.9, 0.4],  # Normalized coordinates [x_min, y_min, x_max, y_max]
    ...         'mask': overlay_mask2,
    ...         'mask_id': 1  # This overlay's region is labeled 1 in the output mask
    ...     },
    ...     {
    ...         'image': overlay_image3,
    ...         'bbox': [0.1, 0.7, 0.5, 0.9],  # Bottom left placement
    ...         'mask': overlay_mask3,
    ...         'mask_id': 2  # This overlay's region is labeled 2 in the output mask
    ...     }
    ... ]
    >>>
    >>> # Create the transform
    >>> transform = A.Compose([
    ...     A.OverlayElements(p=1.0),
    ... ])
    >>>
    >>> # Apply the transform
    >>> result = transform(
    ...     image=image,
    ...     mask=mask,
    ...     overlay_metadata=overlay_metadata  # Pass metadata using the default key
    ... )
    >>>
    >>> # Get results with overlays applied
    >>> result_image = result['image']  # Image with the three overlays applied
    >>> result_mask = result['mask']  # Mask with regions labeled using the mask_id values
    >>>
    >>> # Verify that the mask contains the specified mask_id values
    >>> has_mask_id_1 = np.any(result_mask == 1)  # Should be True
    >>> has_mask_id_2 = np.any(result_mask == 2)  # Should be True
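Because the `bbox` field takes normalized Albumentations coordinates rather than pixels, it can help to convert explicitly before building the metadata. The sketch below is one way to do that in plain Python; `to_albumentations_bbox` is a hypothetical helper written for illustration and is not part of the albumentations API.

```python
def to_albumentations_bbox(x_min, y_min, x_max, y_max, width, height):
    """Convert a pixel-space box to normalized [x_min, y_min, x_max, y_max].

    Hypothetical helper (not an albumentations function). It divides x
    coordinates by the image width and y coordinates by the image height,
    matching the normalized Pascal VOC layout described above.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError(
            "box must lie inside the image with x_min < x_max and y_min < y_max"
        )
    return [x_min / width, y_min / height, x_max / width, y_max / height]


# Pixel box in the top-right area of a 300x300 base image.
bbox = to_albumentations_bbox(180, 30, 270, 120, width=300, height=300)
print(bbox)  # [0.6, 0.1, 0.9, 0.4]
```

The result can be used directly as the `'bbox'` value of a metadata dictionary. The range check is an added safeguard in this sketch, intended to catch swapped x/y values early.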
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| metadata_key | str | "overlay_metadata" | Additional target key for metadata. |
| p | float | 0.5 | Probability of applying the transformation. |