Mixing transforms (augmentations.mixing.transforms)

`class MixUp` `(reference_data=None, read_fn=lambda x: x, alpha=0.4, mix_coef_return_name='mix_coef', always_apply=False, p=0.5)`

Performs MixUp data augmentation, blending images, masks, and class labels with reference data.

MixUp augmentation linearly combines an input (image, mask, and class label) with another set from a predefined reference dataset. The mixing degree is controlled by a parameter λ (lambda), sampled from a Beta distribution. This method is known for improving model generalization by promoting linear behavior between classes and smoothing decision boundaries.
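The linear combination described above can be sketched in plain NumPy. This is a minimal illustration of the technique from the cited paper, not the library's implementation: the helper name `mixup_pair` is hypothetical, λ is assumed to be drawn from a symmetric Beta(α, α), and λ is assumed to weight the input while 1 − λ weights the reference sample.

```python
import numpy as np

def mixup_pair(image, ref_image, label, ref_label, alpha=0.4, rng=None):
    """Blend an input with a reference sample (hypothetical helper, not the library API)."""
    rng = np.random.default_rng() if rng is None else rng
    # Mixing coefficient lam ~ Beta(alpha, alpha), always within [0, 1].
    lam = float(rng.beta(alpha, alpha))
    # Convex combination of the two images and of the two label vectors.
    mixed_image = lam * image.astype(np.float32) + (1.0 - lam) * ref_image.astype(np.float32)
    mixed_label = lam * label.astype(np.float32) + (1.0 - lam) * ref_label.astype(np.float32)
    return mixed_image, mixed_label, lam

rng = np.random.default_rng(0)
image = np.full((4, 4, 3), 200, dtype=np.uint8)
ref_image = np.full((4, 4, 3), 100, dtype=np.uint8)
label = np.array([1, 0, 0])
ref_label = np.array([0, 1, 0])

mixed_image, mixed_label, lam = mixup_pair(image, ref_image, label, ref_label, rng=rng)
print(lam, mixed_label)
```

The sketch returns the mixed image as float32; rounding and casting back to the input dtype is left to the caller.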

Reference

Zhang, H., Cisse, M., Dauphin, Y.N., and Lopez-Paz, D. (2018). mixup: Beyond Empirical Risk Minimization. In International Conference on Learning Representations. https://arxiv.org/abs/1710.09412

Parameters:

Name	Type	Description
`reference_data`	`Optional[Union[Generator[ReferenceImage, None, None], Sequence[Any]]]`	A sequence or generator of dictionaries containing the reference data for mixing. If None or an empty sequence is provided, no operation is performed and a warning is issued.
`read_fn`	`Callable[[ReferenceImage], Dict[str, Any]]`	A function to process items from reference_data. It should accept an item from reference_data and return a dictionary of processed data. The returned dictionary must include an 'image' key with a NumPy array value, and may also include 'mask' and 'global_label' keys, each with a NumPy array value. Defaults to a function that assumes the input dictionary already contains NumPy arrays and returns it unchanged.
`mix_coef_return_name`	`str`	Name of the key under which the applied mixing coefficient is stored in the returned dictionary. Defaults to "mix_coef".
`alpha`	`float`	The alpha parameter for the Beta distribution, influencing the mix's balance. Must be ≥ 0. Higher values lead to more uniform mixing. Defaults to 0.4.
`p`	`float`	The probability of applying the transformation. Defaults to 0.5.
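To give a feel for how `alpha` shapes the mixing coefficient, the sketch below samples from a Beta distribution using only the Python standard library. It assumes a symmetric Beta(α, α), as in the cited paper; the specific alpha values and the sample count are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)

# Standard deviation of the sampled mixing coefficient for several alpha values.
spread = {}
for alpha in (0.1, 0.4, 2.0, 10.0):
    samples = [random.betavariate(alpha, alpha) for _ in range(20000)]
    spread[alpha] = statistics.pstdev(samples)
    print(f"alpha={alpha}: std of mix coefficient = {spread[alpha]:.3f}")
```

With small alpha the samples cluster near 0 or 1, so one of the two inputs dominates the blend. As alpha grows, the samples concentrate around 0.5 and the two inputs contribute more evenly.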

Targets

image, mask, global_label

Image types: uint8, float32

Exceptions:

Type	Description
`ValueError`	If the alpha parameter is negative.
`NotImplementedError`	If the transform is applied to bounding boxes or keypoints.

Notes

If no reference data is provided, a warning is issued, and the transform acts as a no-op.
If images are in float32 format, they should be within the [0, 1] range.
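When inputs are float32 but the reference data is stored as uint8, one way to keep both within [0, 1] is to convert inside `read_fn`. The function below is a hypothetical sketch: the name `float_read_fn` is illustrative, and it assumes each reference item is a dictionary with a uint8 'image' entry.

```python
import numpy as np

def float_read_fn(item):
    """Hypothetical read_fn: scale a uint8 reference image to float32 in [0, 1]."""
    out = {"image": item["image"].astype(np.float32) / 255.0}
    # Optional keys are converted to NumPy arrays and passed through when present.
    for key in ("mask", "global_label"):
        if key in item:
            out[key] = np.asarray(item[key])
    return out

rng = np.random.default_rng(0)
item = {
    "image": rng.integers(0, 256, (8, 8, 3), dtype=np.uint8),
    "global_label": [0, 1, 0],
}
processed = float_read_fn(item)
print(processed["image"].dtype, processed["image"].min(), processed["image"].max())
```

A function like this would be passed as `read_fn=float_read_fn` in place of the default identity function.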

Example Usage:

import albumentations as A
import numpy as np
from albumentations.core.types import ReferenceImage

# Prepare reference data
# Note: This code generates random reference data for demonstration purposes only.
# In real-world applications, it's crucial to use meaningful and representative data.
# The quality and relevance of your input data significantly impact the effectiveness
# of the augmentation process. Ensure your data closely aligns with your specific
# use case and application requirements.
reference_data = [ReferenceImage(image=np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8),
                                 mask=np.random.randint(0, 4, (100, 100, 1), dtype=np.uint8),
                                 global_label=np.random.choice([0, 1], size=3)) for i in range(10)]

# In this example, the lambda function simply returns its input, which works well for
# data already in the expected format. For more complex scenarios, where the data might not be in
# the required format or additional processing is needed, a more sophisticated function can be implemented.
# Below is a hypothetical example where the input data is a file path, and the function reads the image
# file, converts it to a specific format, and possibly performs other preprocessing steps.

# Example of a more complex read_fn that reads an image from a file path, converts it to RGB, and resizes it.
# def custom_read_fn(file_path):
#     from PIL import Image
#     image = Image.open(file_path).convert('RGB')
#     image = image.resize((100, 100))  # Example resize, adjust as needed.
#     return {"image": np.array(image)}  # read_fn must return a dict with an 'image' key.

aug = A.Compose([A.RandomRotate90(), A.MixUp(p=1, reference_data=reference_data, read_fn=lambda x: x)])

# For simplicity, the original lambda function is used in this example.
# Replace `lambda x: x` with `custom_read_fn` if you need to process the data more extensively.

# Apply augmentations
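# Random input data; np.empty would leave the arrays uninitialized.
image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
mask = np.random.randint(0, 4, [100, 100], dtype=np.uint8)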
global_label = np.array([0, 1, 0])
data = aug(image=image, global_label=global_label, mask=mask)
transformed_image = data["image"]
transformed_mask = data["mask"]
transformed_global_label = data["global_label"]

# Print applied mix coefficient
print(data["mix_coef"])  # Output: e.g., 0.9991580344142427