albumentations.augmentations.dropout.xy_masking

Implementation of XY masking for time-frequency domain transformations. This module provides the XYMasking transform, which applies masking strips along the X and Y axes of an image. This is particularly useful for audio spectrograms, time-series data visualizations, and other grid-like data representations where masking in specific directions (time or frequency) can improve model robustness and generalization.

Members

class XYMasking

XYMasking class

XYMasking(
    num_masks_x: tuple[int, int] | int = 0,
    num_masks_y: tuple[int, int] | int = 0,
    mask_x_length: tuple[int, int] | int = 0,
    mask_y_length: tuple[int, int] | int = 0,
    fill: tuple[float, ...] | float | Literal['random', 'random_uniform', 'inpaint_telea', 'inpaint_ns'] = 0,
    fill_mask: tuple[float, ...] | float | None = None,
    p: float = 0.5
)

Applies masking strips to an image along the horizontal (X) axis, the vertical (Y) axis, or both, simulating occlusions. This transform is useful for training models to recognize images under varied visibility conditions. It is particularly effective for spectrogram images, where masking along the time and frequency axes can improve model robustness. At least one of `mask_x_length` or `mask_y_length` must be specified; these set the mask size along each axis.
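The idea of strip masking can be illustrated with a short NumPy sketch. This is not the library's implementation: the helper `mask_strips` and its arguments are hypothetical, and the orientation (an X mask covers consecutive columns over all rows, a Y mask covers consecutive rows over all columns) is an assumption made for the example.

```python
import numpy as np


def mask_strips(image, num_x, x_length, num_y, y_length, fill=0, seed=0):
    """Illustrative strip masking; a sketch, not the albumentations code."""
    rng = np.random.default_rng(seed)
    out = image.copy()  # leave the input untouched
    height, width = out.shape[:2]

    # Assumption: an X mask covers `x_length` consecutive columns over all rows.
    for _ in range(num_x):
        start = int(rng.integers(0, width - x_length + 1))
        out[:, start:start + x_length] = fill

    # Assumption: a Y mask covers `y_length` consecutive rows over all columns.
    for _ in range(num_y):
        start = int(rng.integers(0, height - y_length + 1))
        out[start:start + y_length, :] = fill

    return out


# Synthetic single-channel "spectrogram" filled with ones.
spec = np.ones((64, 100), dtype=np.float32)
masked = mask_strips(spec, num_x=2, x_length=10, num_y=1, y_length=4)
```

Strips may overlap, so the number of masked columns can be smaller than `num_x * x_length`.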

Parameters

Name	Type	Default	Description
num_masks_x	tuple[int, int] | int	0	Number or range of horizontal regions to mask.
num_masks_y	tuple[int, int] | int	0	Number or range of vertical regions to mask.
mask_x_length	tuple[int, int] | int	0	Length of the masks along the X (horizontal) axis. An integer sets a fixed mask length. A tuple of two integers (min, max) makes the length of each mask a random value within that range, giving variable-length masks in the horizontal direction.
mask_y_length	tuple[int, int] | int	0	Height of the masks along the Y (vertical) axis. As with `mask_x_length`, an integer sets a fixed mask height, while a tuple (min, max) makes the height of each mask a random value within that range.
fill	tuple[float, ...] | float | Literal['random', 'random_uniform', 'inpaint_telea', 'inpaint_ns']	0	Value for the dropped pixels. One of: int or float (all channels are filled with this value); tuple (one value per channel); 'random' (each pixel is filled with random values); 'random_uniform' (each hole is filled with a single random color); 'inpaint_telea' (uses the OpenCV Telea inpainting method); 'inpaint_ns' (uses the OpenCV Navier-Stokes inpainting method).
fill_mask	tuple[float, ...] | float | None	None	Fill value for dropout regions in the mask. If None, mask regions corresponding to image dropouts are unchanged.
p	float	0.5	Probability of applying the transform.
