albumentations.augmentations.transforms

This module provides image transformation classes for data augmentation. The transforms modify properties such as color, brightness, contrast, and noise level. Each class inherits from a base transform interface and implements its own augmentation logic.

Members

class AdditiveNoise
class AutoContrast
class BetaParams
class CLAHE
class ChannelShuffle
class ChromaticAberration
class ColorJitter
class Downscale
class Emboss
class Equalize
class FancyPCA
class FromFloat
class GaussNoise
class GaussianParams
class HEStain
class HueSaturationValue
class ISONoise
class Illumination
class ImageCompression
class InterpolationPydantic
class InvertImg
class Lambda
class LaplaceParams
class Morphological
class MultiplicativeNoise
class NoiseParamsBase
class Normalize
class PlanckianJitter
class PlasmaBrightnessContrast
class PlasmaShadow
class Posterize
class RGBShift
class RandomBrightnessContrast
class RandomFog
class RandomGamma
class RandomGravel
class RandomRain
class RandomShadow
class RandomSnow
class RandomSunFlare
class RandomToneCurve
class RingingOvershoot
class SaltAndPepper
class Sharpen
class ShotNoise
class Solarize
class Spatter
class Superpixels
class ToFloat
class ToGray
class ToRGB
class ToSepia
class UniformParams
class UnsharpMask

AdditiveNoise class

AdditiveNoise(
    noise_type: Literal['uniform', 'gaussian', 'laplace', 'beta'] = 'uniform',
    spatial_mode: Literal['constant', 'per_pixel', 'shared'] = 'constant',
    noise_params: dict[str, Any] | None = None,
    approximation: float = 1.0,
    p: float = 0.5
)

Apply random noise to image channels using various noise distributions. This transform generates noise using different probability distributions and applies it to image channels. The noise can be generated in three spatial modes and supports multiple noise distributions, each with configurable parameters.

Parameters

Name	Type	Default	Description
noise_type	One of: 'uniform' 'gaussian' 'laplace' 'beta'	uniform	Type of noise distribution to use. Options: "uniform": uniform distribution, good for simple random perturbations; "gaussian": normal distribution, models natural random processes; "laplace": similar to Gaussian but with heavier tails, good for outliers; "beta": flexible bounded distribution, can be symmetric or skewed.
spatial_mode	One of: 'constant' 'per_pixel' 'shared'	constant	How to generate and apply the noise. Options: "constant": one noise value per channel (fastest); "per_pixel": independent noise value for each pixel and channel (slowest); "shared": one noise map shared across all channels (medium speed).
noise_params	One of: dict[str, Any] None	None	Parameters for the chosen noise distribution; the keys must match noise_type. uniform: ranges (list[tuple[float, float]]), a list of (min, max) ranges, one per channel, each within [-1, 1]; if only one range is provided it is used for all channels, e.g. [(-0.2, 0.2)] for all channels or [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)] for separate RGB ranges. gaussian: mean_range (tuple[float, float], default (0.0, 0.0)), range for sampling the mean, in [-1, 1]; std_range (tuple[float, float], default (0.1, 0.1)), range for sampling the standard deviation, in [0, 1]. laplace: mean_range (tuple[float, float], default (0.0, 0.0)), range for sampling the location parameter, in [-1, 1]; scale_range (tuple[float, float], default (0.1, 0.1)), range for sampling the scale parameter, in [0, 1]. beta: alpha_range (tuple[float, float], default (0.5, 1.5)), range for sampling the first shape parameter, in (0, inf); beta_range (tuple[float, float], default (0.5, 1.5)), range for sampling the second shape parameter, in (0, inf); for both shape parameters, values < 1 give a U-shaped distribution and values > 1 a bell-shaped one; scale_range (tuple[float, float], default (0.1, 0.3)), range for sampling the output scale, in [0, 1], where smaller scales give subtler noise.
approximation	float	1.0	Float in [0, 1] that controls the speed vs. quality trade-off of noise generation: 1.0 generates full-resolution noise (slowest, highest quality); 0.5 generates noise at half resolution and upsamples; 0.25 generates noise at quarter resolution and upsamples. Only affects the 'per_pixel' and 'shared' spatial modes. Default: 1.0
p	float	0.5	Probability of applying the transform. Default: 0.5
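
The three spatial modes differ only in the shape of the sampled noise array. The following is a minimal NumPy sketch of that idea for uniform noise on a float image in [0, 1]; the helper name `sample_uniform_noise` is hypothetical and this is not the library's implementation.

```python
import numpy as np

def sample_uniform_noise(shape, spatial_mode, low=-0.2, high=0.2, rng=None):
    """Return a noise array that broadcasts against an (H, W, C) image.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width, channels = shape
    if spatial_mode == "constant":
        # One value per channel, repeated over every pixel.
        return rng.uniform(low, high, size=(1, 1, channels))
    if spatial_mode == "per_pixel":
        # Independent value for every pixel and every channel.
        return rng.uniform(low, high, size=(height, width, channels))
    if spatial_mode == "shared":
        # One (H, W) noise map reused by all channels.
        return rng.uniform(low, high, size=(height, width, 1))
    raise ValueError(f"unknown spatial_mode: {spatial_mode!r}")

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))  # float image in [0, 1]
noisy = {
    mode: np.clip(image + sample_uniform_noise(image.shape, mode, rng=rng), 0.0, 1.0)
    for mode in ("constant", "per_pixel", "shared")
}
```

Broadcasting does the rest: the smaller the sampled array, the fewer random numbers are drawn, which is consistent with the speed ordering given in the table.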

AutoContrast class

AutoContrast(
    cutoff: float = 0,
    ignore: int | None = None,
    method: Literal['cdf', 'pil'] = 'cdf',
    p: float = 0.5
)

Automatically adjust image contrast by stretching the intensity range. This transform provides two methods for contrast enhancement:
1. CDF method (default): uses the cumulative distribution function for a more gradual adjustment.
2. PIL method: uses linear scaling like PIL.ImageOps.autocontrast.
The transform can optionally exclude extreme values from both ends of the intensity range and preserve specific intensity values (e.g., alpha channel).

Parameters

Name	Type	Default	Description
cutoff	float	0	Percentage of pixels to exclude from both ends of the histogram. Range: [0, 100]. Default: 0 (use full intensity range) - 0 means use the minimum and maximum intensity values found - 20 means exclude darkest and brightest 20% of pixels
ignore	One of: int None	None	Intensity value to preserve (e.g., alpha channel). Range: [0, 255]. Default: None - If specified, this intensity value will not be modified - Useful for images with alpha channel or special marker values
method	One of: 'cdf' 'pil'	cdf	Algorithm to use for contrast enhancement. Default: "cdf" - "cdf": Uses cumulative distribution for smoother adjustment - "pil": Uses linear scaling like PIL.ImageOps.autocontrast
p	float	0.5	Probability of applying the transform. Default: 0.5

Notes

- The transform processes each color channel independently.
- For grayscale images, only one channel is processed.
- The output maintains the same dtype as the input.
- Empty or single-color channels remain unchanged.
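
The linear ("pil"-style) stretch with a cutoff can be sketched for a single uint8 channel as follows. This is an illustration under stated assumptions, not the library's code: the function name `stretch_channel` is hypothetical, and percentiles are used as a stand-in for trimming the histogram tails.

```python
import numpy as np

def stretch_channel(channel, cutoff=0.0):
    """Linearly stretch one uint8 channel after trimming `cutoff` percent per tail.

    Hypothetical helper; percentiles approximate histogram-based trimming.
    """
    lo, hi = np.percentile(channel, [cutoff, 100.0 - cutoff])
    if hi <= lo:
        # Empty or single-color channel: nothing to stretch.
        return channel.copy()
    scaled = (channel.astype(np.float64) - lo) * (255.0 / (hi - lo))
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

channel = np.array([[50, 100], [150, 200]], dtype=np.uint8)
stretched = stretch_channel(channel)          # maps 50 -> 0 and 200 -> 255
flat = stretch_channel(np.full((2, 2), 7, dtype=np.uint8))  # left unchanged
```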

BetaParams class

BetaParams(
    noise_type: Literal = 'beta',
    alpha_range: Annotated,
    beta_range: Annotated,
    scale_range: Annotated
)

Parameters

Name	Type	Default	Description
noise_type	Literal	beta	-
alpha_range	Annotated	-	-
beta_range	Annotated	-	-
scale_range	Annotated	-	-

CLAHE class

CLAHE(
    clip_limit: tuple[float, float] | float = 4.0,
    tile_grid_size: tuple[int, int] = (8, 8),
    p: float = 0.5
)

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image. CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram equalization, which operates on the entire image, CLAHE operates on small regions (tiles) in the image. This results in a more balanced equalization, preventing over-amplification of contrast in areas with initially low contrast.

Parameters

Name	Type	Default	Description
clip_limit	One of: tuple[float, float] float	4.0	Controls the contrast enhancement limit. - If a single float is provided, the range will be (1, clip_limit). - If a tuple of two floats is provided, it defines the range for random selection. Higher values allow for more contrast enhancement, but may also increase noise. Default: (1, 4)
tile_grid_size	tuple[int, int]	(8, 8)	Defines the number of tiles in the row and column directions. Format is (rows, columns). Smaller tile sizes can lead to more localized enhancements, while larger sizes give results closer to global histogram equalization. Default: (8, 8)
p	float	0.5	Probability of applying the transform. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)
>>> result = transform(image=image)
>>> clahe_image = result["image"]

Notes

- Supports only RGB or grayscale images.
- For color images, CLAHE is applied to the L channel in the LAB color space.
- The clip limit determines the maximum slope of the cumulative histogram. A lower clip limit will result in more contrast limiting.
- Tile grid size affects the adaptiveness of the method. More tiles increase local adaptiveness but can lead to an unnatural look if set too high.

References

Tutorial: https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html
"Contrast Limited Adaptive Histogram Equalization.": https://ieeexplore.ieee.org/document/109340

ChannelShuffle class

ChannelShuffle(
    p: float = 0.5
)

Randomly rearrange channels of the image.

Parameters

Name	Type	Default	Description
p	float	0.5	Probability of applying the transform. Default: 0.5.
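
As a rough illustration of what rearranging channels means, the operation amounts to indexing the last axis with a random permutation. This is a NumPy sketch under that assumption, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

# Draw a random ordering of the channel indices, e.g. some permutation of [0, 1, 2].
order = rng.permutation(image.shape[-1])

# Reorder the channel axis; pixel positions and values are untouched.
shuffled = image[..., order]
```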

ChromaticAberration class

ChromaticAberration(
    primary_distortion_limit: tuple[float, float] | float = (-0.02, 0.02),
    secondary_distortion_limit: tuple[float, float] | float = (-0.05, 0.05),
    mode: Literal['green_purple', 'red_blue', 'random'] = 'green_purple',
    interpolation: Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT] = 1,
    p: float = 0.5
)

Add lateral chromatic aberration by distorting the red and blue channels of the input image. Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point. This transform simulates this effect by applying different radial distortions to the red and blue channels of the image, while leaving the green channel unchanged.

Parameters

Name	Type	Default	Description
primary_distortion_limit	One of: tuple[float, float] float	(-0.02, 0.02)	Range of the primary radial distortion coefficient. If a single float value is provided, the range will be (-primary_distortion_limit, primary_distortion_limit). This parameter controls the distortion in the center of the image: - Positive values result in pincushion distortion (edges bend inward) - Negative values result in barrel distortion (edges bend outward) Default: (-0.02, 0.02).
secondary_distortion_limit	One of: tuple[float, float] float	(-0.05, 0.05)	Range of the secondary radial distortion coefficient. If a single float value is provided, the range will be (-secondary_distortion_limit, secondary_distortion_limit). This parameter controls the distortion in the corners of the image: - Positive values enhance pincushion distortion - Negative values enhance barrel distortion Default: (-0.05, 0.05).
mode	One of: 'green_purple' 'red_blue' 'random'	green_purple	Type of color fringing to apply. Options are: - 'green_purple': Distorts red and blue channels in opposite directions, creating green-purple fringing. - 'red_blue': Distorts red and blue channels in the same direction, creating red-blue fringing. - 'random': Randomly chooses between 'green_purple' and 'red_blue' modes for each application. Default: 'green_purple'.
interpolation	One of: cv2.INTER_NEAREST cv2.INTER_NEAREST_EXACT cv2.INTER_LINEAR cv2.INTER_CUBIC cv2.INTER_AREA cv2.INTER_LANCZOS4 cv2.INTER_LINEAR_EXACT	1	Flag specifying the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
p	float	0.5	Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ChromaticAberration(
...     primary_distortion_limit=0.05,
...     secondary_distortion_limit=0.1,
...     mode='green_purple',
...     interpolation=cv2.INTER_LINEAR,
...     p=1.0
... )
>>> transformed = transform(image=image)
>>> aberrated_image = transformed['image']

Notes

- This transform only affects RGB images. Grayscale images will raise an error.
- The strength of the effect depends on both primary and secondary distortion limits.
- Higher absolute values for distortion limits will result in more pronounced chromatic aberration.
- The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.

References

Chromatic Aberration: https://en.wikipedia.org/wiki/Chromatic_aberration

ColorJitter class

ColorJitter(
    brightness: tuple[float, float] | float = (0.8, 1.2),
    contrast: tuple[float, float] | float = (0.8, 1.2),
    saturation: tuple[float, float] | float = (0.8, 1.2),
    hue: tuple[float, float] | float = (-0.5, 0.5),
    p: float = 0.5
)

Randomly changes the brightness, contrast, saturation, and hue of an image. This transform is similar to torchvision's ColorJitter but with some differences due to the use of OpenCV instead of Pillow. The main differences are:
1. OpenCV and Pillow use different formulas to convert images to HSV format.
2. This implementation uses value saturation instead of uint8 overflow as in Pillow.
These differences may result in slightly different output compared to torchvision's ColorJitter.

Parameters

Name	Type	Default	Description
brightness	One of: tuple[float, float] float	(0.8, 1.2)	How much to jitter brightness. If float: The brightness factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness]. If tuple: The brightness factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)
contrast	One of: tuple[float, float] float	(0.8, 1.2)	How much to jitter contrast. If float: The contrast factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast]. If tuple: The contrast factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)
saturation	One of: tuple[float, float] float	(0.8, 1.2)	How much to jitter saturation. If float: The saturation factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation]. If tuple: The saturation factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)
hue	One of: tuple[float, float] float	(-0.5, 0.5)	How much to jitter hue. If float: The hue factor is chosen uniformly from [-hue, hue]. Should have 0 <= hue <= 0.5. If tuple: The hue factor is sampled from the range specified. Values should be in range [-0.5, 0.5]. Default: (-0.5, 0.5)
p	float	0.5	Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result['image']

Notes

- The order of application for these color transformations is random for each image.
- The ranges for brightness, contrast, and saturation are applied as multiplicative factors.
- The range for hue is applied as an additive factor.
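
The float-versus-tuple convention from the parameter table can be sketched in plain Python. The helper `resolve_jitter_range` is hypothetical and not part of the albumentations API; it only restates the documented rule that a single float becomes a range around 1 for the multiplicative factors and around 0 for hue.

```python
def resolve_jitter_range(value, center=1.0, lower_bound=0.0):
    """Return the (low, high) sampling range for a float or tuple argument.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, (int, float)):
        # Single number: symmetric range around `center`, floored at `lower_bound`.
        return (max(lower_bound, center - value), center + value)
    low, high = value
    return (float(low), float(high))

# Multiplicative factors (brightness, contrast, saturation) are centered on 1.
brightness_range = resolve_jitter_range(0.2)
# A large float is floored at 0 so the factor never becomes negative.
wide_range = resolve_jitter_range(1.5)
# Hue is an additive offset centered on 0 and limited to [-0.5, 0.5].
hue_range = resolve_jitter_range(0.1, center=0.0, lower_bound=-0.5)
# Tuples are used as given.
explicit_range = resolve_jitter_range((0.5, 1.5))
```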

References

ColorJitter: https://pytorch.org/vision/stable/generated/torchvision.transforms.ColorJitter.html
Color Conversions: https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html

Downscale class

Downscale(
    scale_range: tuple[float, float] = (0.25, 0.25),
    interpolation_pair: dict[Literal['downscale', 'upscale'], Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT]] = {'upscale': 0, 'downscale': 0},
    p: float = 0.5
)

Decrease image quality by downscaling and upscaling back. This transform simulates the effect of a low-resolution image by first downscaling the image to a lower resolution and then upscaling it back to its original size. This process introduces loss of detail and can be used to simulate low-quality images or to test the robustness of models to different image resolutions.

Parameters

Name	Type	Default	Description
scale_range	tuple[float, float]	(0.25, 0.25)	Range for the downscaling factor. Should be two float values between 0 and 1, where the first value is less than or equal to the second. The actual downscaling factor will be randomly chosen from this range for each image. Lower values result in more aggressive downscaling. Default: (0.25, 0.25)
interpolation_pair	dict[Literal['downscale', 'upscale'], Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT]]	{'upscale': 0, 'downscale': 0}	A dictionary specifying the interpolation methods to use for downscaling and upscaling. Should contain two keys: - 'downscale': Interpolation method for downscaling - 'upscale': Interpolation method for upscaling Values should be OpenCV interpolation flags (e.g., cv2.INTER_NEAREST, cv2.INTER_LINEAR, etc.) Default: {'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST}
p	float	0.5	Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Downscale(
...     scale_range=(0.5, 0.75),
...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_LINEAR},
...     p=0.5
... )
>>> transformed = transform(image=image)
>>> downscaled_image = transformed['image']

Notes

- The actual downscaling factor is randomly chosen for each image from the range specified in scale_range.
- Using different interpolation methods for downscaling and upscaling can produce various effects. For example, using INTER_NEAREST for both can create a pixelated look, while using INTER_LINEAR or INTER_CUBIC can produce smoother results.
- This transform can be useful for data augmentation, especially when training models that need to be robust to variations in image quality or resolution.
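
The pixelated look produced by nearest-neighbor resampling in both directions can be reproduced with plain NumPy indexing. The sketch below is an illustration only: the function name `downscale_nearest_sketch` is hypothetical, and the index arithmetic is a simple stand-in for OpenCV's resize, not the library's code path.

```python
import numpy as np

def downscale_nearest_sketch(image, scale):
    """Shrink by `scale`, then enlarge back, using nearest-neighbor sampling.

    Hypothetical helper for illustration only.
    """
    height, width = image.shape[:2]
    small_h = max(1, int(round(height * scale)))
    small_w = max(1, int(round(width * scale)))
    # Downscale: keep one source row/column per low-resolution row/column.
    rows = np.arange(small_h) * height // small_h
    cols = np.arange(small_w) * width // small_w
    small = image[rows][:, cols]
    # Upscale: repeat each low-resolution pixel to fill the original grid.
    up_rows = np.arange(height) * small_h // height
    up_cols = np.arange(width) * small_w // width
    return small[up_rows][:, up_cols]

image = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)
pixelated = downscale_nearest_sketch(image, scale=0.25)
```

With `scale=0.25` an 8x8 image collapses to 2x2 and comes back as four constant 4x4 blocks, which is the loss of detail described above.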

Emboss class

Emboss(
    alpha: tuple[float, float] = (0.2, 0.5),
    strength: tuple[float, float] = (0.2, 0.7),
    p: float = 0.5
)

Apply embossing effect to the input image. This transform creates an emboss effect by highlighting edges and creating a 3D-like texture in the image. It works by applying a specific convolution kernel to the image that emphasizes differences in adjacent pixel values.

Parameters

Name	Type	Default	Description
alpha	tuple[float, float]	(0.2, 0.5)	Range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Values should be in the range [0, 1]. Alpha will be randomly selected from this range for each image. Default: (0.2, 0.5)
strength	tuple[float, float]	(0.2, 0.7)	Range to choose the strength of the embossing effect. Higher values create a more pronounced 3D effect. Values should be non-negative. Strength will be randomly selected from this range for each image. Default: (0.2, 0.7)
p	float	0.5	Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)
>>> result = transform(image=image)
>>> embossed_image = result['image']

Notes

- The emboss effect is created using a 3x3 convolution kernel.
- The 'alpha' parameter controls the blend between the original image and the embossed version. A higher alpha value will result in a more pronounced emboss effect.
- The 'strength' parameter affects the intensity of the embossing. Higher strength values will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.
- This transform can be useful for creating artistic effects or for data augmentation in tasks where edge information is important.

References

Image Embossing: https://en.wikipedia.org/wiki/Image_embossing
Application of Emboss Filtering in Image Processing: https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing

Equalize class

Equalize(
    mode: Literal['cv', 'pil'] = 'cv',
    by_channels: bool = True,
    mask: np.ndarray | Callable[..., Any] | None = None,
    mask_params: Sequence[str] = (),
    p: float = 0.5
)

Equalize the image histogram. This transform applies histogram equalization to the input image. Histogram equalization is a method in image processing of contrast adjustment using the image's histogram.

Parameters

Name	Type	Default	Description
mode	One of: 'cv' 'pil'	cv	Use OpenCV or Pillow equalization method. Default: 'cv'
by_channels	bool	True	If True, equalize each channel separately; otherwise convert the image to the YCbCr representation and equalize only the `Y` channel. Default: True
mask	One of: np.ndarray Callable[..., Any] None	None	If given, only the pixels selected by the mask are included in the analysis. Can be: - A 1-channel or 3-channel numpy array of the same size as the input image. - A callable (function) that generates a mask. The function should accept 'image' as its first argument, and can accept additional arguments specified in mask_params. Default: None
mask_params	Sequence[str]	()	Additional parameters to pass to the mask function. These parameters will be taken from the data dict passed to __call__. Default: ()
p	float	0.5	Probability of applying the transform. Default: 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Using a static mask
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> transform = A.Equalize(mask=mask, p=1.0)
>>> result = transform(image=image)
>>>
>>> # Using a dynamic mask function
>>> def mask_func(image, bboxes):
...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)
...     for bbox in bboxes:
...         x1, y1, x2, y2 = map(int, bbox)
...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes
...     return mask
>>>
>>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)
>>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes
>>> result = transform(image=image, bboxes=bboxes)

Notes

- When mode='cv', OpenCV's equalizeHist() function is used.
- When mode='pil', Pillow's equalize() function is used.
- The 'by_channels' parameter determines whether equalization is applied to each color channel independently (True) or to the luminance channel only (False).
- If a mask is provided as a numpy array, it should have the same height and width as the input image.
- If a mask is provided as a function, it allows for dynamic mask generation based on the input image and additional parameters. This is useful for scenarios where the mask depends on the image content or external data (e.g., bounding boxes, segmentation masks).

References

OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e
Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize
Histogram Equalization: https://en.wikipedia.org/wiki/Histogram_equalization

FancyPCA class

FancyPCA(
    alpha: float = 0.1,
    p: float = 0.5
)

Apply Fancy PCA augmentation to the input image. This augmentation technique applies PCA (Principal Component Analysis) to the image's color channels, then adds multiples of the principal components to the image, with magnitudes proportional to the corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard deviation 'alpha'.

Parameters

Name	Type	Default	Description
alpha	float	0.1	Standard deviation of the Gaussian distribution used to generate random noise for each principal component. Default: 0.1.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.FancyPCA(alpha=0.1, p=1.0)
>>> result = transform(image=image)
>>> augmented_image = result["image"]

Notes

- This augmentation is particularly effective for RGB images but can work with any number of channels.
- For grayscale images, it applies a simplified version of the augmentation.
- The transform preserves the mean of the image while adjusting the color/intensity variation.
- This implementation is based on the paper by Krizhevsky et al. and is similar to the one used in the original AlexNet paper.
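
The PCA step described above (eigen-decomposition of the channel covariance, then an offset along the principal components scaled by eigenvalue and a Gaussian draw) can be sketched in NumPy. This follows the description in the text for a float RGB image in [0, 1]; the function name `fancy_pca_sketch` is hypothetical and the library's actual code may differ in details such as normalization.

```python
import numpy as np

def fancy_pca_sketch(image, alpha=0.1, rng=None):
    """Add one random PCA-aligned color offset to a float RGB image in [0, 1].

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    pixels = image.reshape(-1, 3).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)           # 3x3 covariance of the channels
    eigvals, eigvecs = np.linalg.eigh(cov)       # columns of eigvecs are components
    weights = rng.normal(0.0, alpha, size=3)     # one Gaussian draw per component
    offset = eigvecs @ (weights * eigvals)       # the same RGB offset for every pixel
    return np.clip(image + offset, 0.0, 1.0), offset

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))
augmented, offset = fancy_pca_sketch(image, alpha=0.1, rng=rng)
unchanged, zero_offset = fancy_pca_sketch(image, alpha=0.0, rng=rng)
```

Setting `alpha=0.0` makes every Gaussian draw zero, so the image comes back unchanged, which is a quick sanity check on the role of `alpha`.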

References

ImageNet Classification with Deep Convolutional Neural Networks: In Advances in Neural Information

FromFloat class

FromFloat(
    dtype: Literal['uint8', 'uint16', 'uint32'] = 'uint8',
    max_value: float | None = None,
    p: float = 1.0
)

Convert an image from floating point representation to the specified data type. This transform is designed to convert images from a normalized floating-point representation (typically with values in the range [0, 1]) to other data types, scaling the values appropriately.

Parameters

Name	Type	Default	Description
dtype	One of: 'uint8' 'uint16' 'uint32'	uint8	The desired output data type. Supported types include 'uint8', 'uint16', 'uint32'. Default: 'uint8'.
max_value	One of: float None	None	The maximum value for the output dtype. If None, the transform will attempt to infer the maximum value based on the dtype. Default: None.
p	float	1.0	Probability of applying the transform. Default: 1.0.

Example

>>> import numpy as np
>>> import albumentations as A
>>> transform = A.FromFloat(dtype='uint8', max_value=None, p=1.0)
>>> image = np.random.rand(100, 100, 3).astype(np.float32)  # Float image in [0, 1] range
>>> result = transform(image=image)
>>> uint8_image = result['image']
>>> assert uint8_image.dtype == np.uint8
>>> assert uint8_image.min() >= 0 and uint8_image.max() <= 255

Notes

- This is the inverse transform for ToFloat.
- Input images are expected to be in floating point format with values in the range [0, 1].
- For integer output types (uint8, uint16, uint32), the function will scale the values to the appropriate range (e.g., 0-255 for uint8).
- For float output types (float32, float64), the values will remain in the [0, 1] range.
- The transform uses the `from_float` function internally, which ensures output values are within the valid range for the specified dtype.
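
The scaling rule for integer outputs can be sketched as follows. This is an illustration of the behavior described in the notes, assuming rounding to the nearest integer; the function name `from_float_sketch` is hypothetical and is not the library's `from_float`.

```python
import numpy as np

def from_float_sketch(image, dtype="uint8", max_value=None):
    """Scale a float image in [0, 1] to an unsigned integer dtype.

    Hypothetical helper; rounding to nearest is an assumption.
    """
    target = np.dtype(dtype)
    if max_value is None:
        # Infer the maximum from the dtype: 255 for uint8, 65535 for uint16, ...
        max_value = np.iinfo(target).max
    scaled = np.rint(image.astype(np.float64) * max_value)
    # Clip so out-of-range inputs cannot wrap around after the cast.
    return np.clip(scaled, 0, max_value).astype(target)

image = np.array([0.0, 1.0, 1.2, -0.1], dtype=np.float32)
as_uint8 = from_float_sketch(image, "uint8")
as_uint16 = from_float_sketch(image, "uint16")
```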

GaussNoise class

GaussNoise(
    std_range: tuple[float, float] = (0.2, 0.44),
    mean_range: tuple[float, float] = (0.0, 0.0),
    per_channel: bool = True,
    noise_scale_factor: float = 1,
    p: float = 0.5
)

Apply Gaussian noise to the input image.

Parameters

Name	Type	Default	Description
std_range	tuple[float, float]	(0.2, 0.44)	Range for noise standard deviation as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [0, 1]. Default: (0.2, 0.44).
mean_range	tuple[float, float]	(0.0, 0.0)	Range for noise mean as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [-1, 1]. Default: (0.0, 0.0).
per_channel	bool	True	If True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Default: True.
noise_scale_factor	float	1	Scaling factor for noise generation. Value should be in the range (0, 1]. When set to 1, noise is sampled for each pixel independently. If less, noise is sampled for a smaller size and resized to fit the shape of the image. Smaller values make the transform faster. Default: 1.0.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The noise parameters (std_range and mean_range) are normalized to [0, 1] range:
  * For uint8 images, they are multiplied by 255
  * For float32 images, they are used directly
- Setting per_channel=False is faster but applies the same noise to all channels
- The noise_scale_factor parameter allows for a trade-off between transform speed and noise granularity
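
The normalization rule in the notes (fractions of the maximum value, multiplied by 255 for uint8) can be sketched in NumPy. The function name `add_gauss_noise_sketch` is hypothetical, `noise_scale_factor` is left out for brevity, and this is not the library's implementation.

```python
import numpy as np

def add_gauss_noise_sketch(image, std_range=(0.2, 0.44), mean_range=(0.0, 0.0),
                           per_channel=True, rng=None):
    """Add Gaussian noise whose std/mean are fractions of the dtype maximum.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    max_value = 255.0 if image.dtype == np.uint8 else 1.0
    std = rng.uniform(*std_range) * max_value
    mean = rng.uniform(*mean_range) * max_value
    # per_channel=False draws one (H, W, 1) map that broadcasts over channels.
    noise_shape = image.shape if per_channel else image.shape[:2] + (1,)
    noise = rng.normal(mean, std, size=noise_shape)
    noisy = np.clip(image.astype(np.float64) + noise, 0.0, max_value)
    if image.dtype == np.uint8:
        noisy = np.rint(noisy)
    return noisy.astype(image.dtype)

rng = np.random.default_rng(0)
uint8_image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
float_image = rng.random((8, 8, 3), dtype=np.float32)

noisy_uint8 = add_gauss_noise_sketch(uint8_image, rng=rng)
noisy_float = add_gauss_noise_sketch(float_image, per_channel=False, rng=rng)
no_noise = add_gauss_noise_sketch(uint8_image, std_range=(0.0, 0.0), rng=rng)
```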

GaussianParams class

GaussianParams(
    noise_type: Literal = 'gaussian',
    mean_range: Annotated,
    std_range: Annotated
)

Parameters

Name	Type	Default	Description
noise_type	Literal	gaussian	-
mean_range	Annotated	-	-
std_range	Annotated	-	-

HEStain class

HEStain(
    method: Literal['preset', 'random_preset', 'vahadane', 'macenko'] = 'random_preset',
    preset: Literal['ruifrok', 'macenko', 'standard', 'high_contrast', 'h_heavy', 'e_heavy', 'dark', 'light'] | None = None,
    intensity_scale_range: tuple[float, float] = (0.7, 1.3),
    intensity_shift_range: tuple[float, float] = (-0.2, 0.2),
    augment_background: bool = False,
    p: float = 0.5
)

Applies H&E (Hematoxylin and Eosin) stain augmentation to histopathology images. This transform simulates different H&E staining conditions using either:
1. Predefined stain matrices (8 standard references)
2. Vahadane method for stain extraction
3. Macenko method for stain extraction
4. Custom stain matrices

Parameters

Name	Type	Default	Description
method	One of: 'preset' 'random_preset' 'vahadane' 'macenko'	random_preset	Method to use for stain augmentation: - "preset": Use predefined stain matrices - "random_preset": Randomly select a preset matrix each time - "vahadane": Extract using Vahadane method - "macenko": Extract using Macenko method Default: "random_preset"
preset	One of: 'ruifrok' 'macenko' 'standard' 'high_contrast' 'h_heavy' 'e_heavy' 'dark' 'light' None	None	Preset stain matrix to use when method="preset": - "ruifrok": Standard reference from Ruifrok & Johnston - "macenko": Reference from Macenko's method - "standard": Typical bright-field microscopy - "high_contrast": Enhanced contrast - "h_heavy": Hematoxylin dominant - "e_heavy": Eosin dominant - "dark": Darker staining - "light": Lighter staining Default: "standard"
intensity_scale_range	tuple[float, float]	(0.7, 1.3)	Range for multiplicative stain intensity variation. Values are multipliers between 0.5 and 1.5. For example: - (0.7, 1.3) means stain intensities will vary from 70% to 130% - (0.9, 1.1) gives subtle variations - (0.5, 1.5) gives dramatic variations Default: (0.7, 1.3)
intensity_shift_range	tuple[float, float]	(-0.2, 0.2)	Range for additive stain intensity variation. Values between -0.3 and 0.3. For example: - (-0.2, 0.2) means intensities will be shifted by -20% to +20% - (-0.1, 0.1) gives subtle shifts - (-0.3, 0.3) gives dramatic shifts Default: (-0.2, 0.2)
augment_background	bool	False	Whether to apply augmentation to background regions. Default: False
p	float	0.5	Probability of applying the transform. Default: 0.5

References

A. C. Ruifrok and D. A. Johnston, "Quantification of histochemical": Analytical and quantitative cytology and histology, 2001.
M. Macenko et al., "A method for normalizing histology slides for quantitative analysis," 2009 IEEE International Symposium on Biomedical Imaging, 2009.

HueSaturationValue class

HueSaturationValue(
    hue_shift_limit: tuple[float, float] | float = (-20, 20),
    sat_shift_limit: tuple[float, float] | float = (-30, 30),
    val_shift_limit: tuple[float, float] | float = (-20, 20),
    p: float = 0.5
)

Randomly change hue, saturation and value of the input image. This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image. It allows for independent control over each channel, providing a wide range of color and brightness modifications.

Parameters

Name	Type	Default	Description
hue_shift_limit	One of: tuple[float, float] float	(-20, 20)	Range for changing hue. If a single float value is provided, the range will be (-hue_shift_limit, hue_shift_limit). Values should be in the range [-180, 180]. Default: (-20, 20).
sat_shift_limit	One of: tuple[float, float] float	(-30, 30)	Range for changing saturation. If a single float value is provided, the range will be (-sat_shift_limit, sat_shift_limit). Values should be in the range [-255, 255]. Default: (-30, 30).
val_shift_limit	One of: tuple[float, float] float	(-20, 20)	Range for changing value (brightness). If a single float value is provided, the range will be (-val_shift_limit, val_shift_limit). Values should be in the range [-255, 255]. Default: (-20, 20).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.HueSaturationValue(
...     hue_shift_limit=20,
...     sat_shift_limit=30,
...     val_shift_limit=20,
...     p=0.7
... )
>>> result = transform(image=image)
>>> augmented_image = result["image"]

Notes

- The transform first converts the input RGB image to the HSV color space.
- Each channel (Hue, Saturation, Value) is adjusted independently.
- Hue is circular, so it wraps around at 180 degrees.
- For float32 images, the shift values are applied as percentages of the full range.
- This transform is particularly useful for color augmentation and simulating different lighting conditions.
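
The hue wrap-around and the clipping of saturation and value described in the notes can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: it assumes the image is already in an OpenCV-style uint8 HSV layout (hue in [0, 180), saturation and value in [0, 255]), and `shift_hsv_uint8` is a hypothetical helper name.

```python
import numpy as np

def shift_hsv_uint8(hsv, hue_shift, sat_shift, val_shift):
    """Shift an OpenCV-style uint8 HSV image (hypothetical helper)."""
    # Use a wider signed type so the additions cannot overflow uint8.
    h = hsv[..., 0].astype(np.int32)
    s = hsv[..., 1].astype(np.int32)
    v = hsv[..., 2].astype(np.int32)
    h = (h + int(hue_shift)) % 180            # hue is circular, so it wraps
    s = np.clip(s + int(sat_shift), 0, 255)   # saturation is clipped
    v = np.clip(v + int(val_shift), 0, 255)   # value is clipped
    return np.stack([h, s, v], axis=-1).astype(np.uint8)

hsv = np.array([[[170, 250, 10]]], dtype=np.uint8)
shifted = shift_hsv_uint8(hsv, hue_shift=20, sat_shift=30, val_shift=-20)
```

Here the hue of 170 wraps past 180 instead of saturating, while saturation and value stop at the ends of their range.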

References

HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV

ISONoiseclass

ISONoise(
    color_shift: tuple[float, float] = (0.01, 0.05),
    intensity: tuple[float, float] = (0.1, 0.5),
    p: float = 0.5
)

Applies camera sensor noise to the input image, simulating high ISO settings. This transform adds random noise to an image, mimicking the effect of using high ISO settings in digital photography. It simulates two main components of ISO noise:
1. Color noise: random shifts in color hue
2. Luminance noise: random variations in pixel intensity

Parameters

Name	Type	Default	Description
color_shift	tuple[float, float]	(0.01, 0.05)	Range for changing color hue. Values should be in the range [0, 1], where 1 represents a full 360° hue rotation. Default: (0.01, 0.05)
intensity	tuple[float, float]	(0.1, 0.5)	Range for the noise intensity. Higher values increase the strength of both color and luminance noise. Default: (0.1, 0.5)
p	float	0.5	Probability of applying the transform. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=0.5)
>>> result = transform(image=image)
>>> noisy_image = result["image"]

Notes

- This transform only works with RGB images. It will raise a TypeError if applied to non-RGB images.
- The color shift is applied in the HSV color space, affecting the hue channel.
- Luminance noise is added to all channels independently.
- This transform can be useful for data augmentation in low-light scenarios or when training models to be robust against noisy inputs.

References

ISO noise in digital photography: https://en.wikipedia.org/wiki/Image_noise#In_digital_cameras

Illuminationclass

Illumination(
    mode: Literal['linear', 'corner', 'gaussian'] = linear,
    intensity_range: tuple[float, float] = (0.01, 0.2),
    effect_type: Literal['brighten', 'darken', 'both'] = both,
    angle_range: tuple[float, float] = (0, 360),
    center_range: tuple[float, float] = (0.1, 0.9),
    sigma_range: tuple[float, float] = (0.2, 1.0),
    p: float = 0.5
)

Apply various illumination effects to the image. This transform simulates different lighting conditions by applying controlled illumination patterns. It can create effects like:
- Directional lighting (linear mode)
- Corner shadows/highlights (corner mode)
- Spotlights or local lighting (gaussian mode)

These effects can be used to:
- Simulate natural lighting variations
- Add dramatic lighting effects
- Create synthetic shadows or highlights
- Augment training data with different lighting conditions

Parameters

Name	Type	Default	Description
mode	One of: 'linear' 'corner' 'gaussian'	linear	Type of illumination pattern: - 'linear': Creates a smooth gradient across the image, simulating directional lighting like sunlight through a window - 'corner': Applies gradient from any corner, simulating light source from a corner - 'gaussian': Creates a circular spotlight effect, simulating local light sources Default: 'linear'
intensity_range	tuple[float, float]	(0.01, 0.2)	Range for effect strength. Values between 0.01 and 0.2: - 0.01-0.05: Subtle lighting changes - 0.05-0.1: Moderate lighting effects - 0.1-0.2: Strong lighting effects Default: (0.01, 0.2)
effect_type	One of: 'brighten' 'darken' 'both'	both	Type of lighting change: - 'brighten': Only adds light (like a spotlight) - 'darken': Only removes light (like a shadow) - 'both': Randomly chooses between brightening and darkening Default: 'both'
angle_range	tuple[float, float]	(0, 360)	Range for gradient angle in degrees. Controls direction of linear gradient: - 0°: Left to right - 90°: Top to bottom - 180°: Right to left - 270°: Bottom to top Only used for 'linear' mode. Default: (0, 360)
center_range	tuple[float, float]	(0.1, 0.9)	Range for spotlight position. Values between 0 and 1 representing relative position: - (0, 0): Top-left corner - (1, 1): Bottom-right corner - (0.5, 0.5): Center of image Only used for 'gaussian' mode. Default: (0.1, 0.9)
sigma_range	tuple[float, float]	(0.2, 1.0)	Range for spotlight size. Values between 0.2 and 1.0: - 0.2: Small, focused spotlight - 0.5: Medium-sized light area - 1.0: Broad, soft lighting Only used for 'gaussian' mode. Default: (0.2, 1.0)
p	float	0.5	Probability of applying the transform. Default: 0.5

Notes

- The transform preserves image range and dtype
- Effects are applied multiplicatively to preserve texture
- Can be combined with other transforms for complex lighting scenarios
- Useful for training models to be robust to lighting variations
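
To make the multiplicative behaviour of the 'linear' mode concrete, here is a rough NumPy sketch. It is an illustration under stated assumptions rather than the library's code: `linear_illumination` is a hypothetical helper, and it assumes the effect grows along the gradient direction (0° from left to right, 90° from top to bottom) up to a gain of `1 ± intensity`.

```python
import numpy as np

def linear_illumination(image, intensity, angle_deg, brighten=True):
    """Multiply the image by a linear gain ramp (hypothetical helper)."""
    h, w = image.shape[:2]
    # Pixel coordinates normalized to [0, 1].
    y, x = np.mgrid[0:h, 0:w]
    x = x / max(w - 1, 1)
    y = y / max(h - 1, 1)
    theta = np.deg2rad(angle_deg)
    # Project coordinates onto the gradient direction, then rescale to [0, 1].
    ramp = x * np.cos(theta) + y * np.sin(theta)
    ramp = (ramp - ramp.min()) / (ramp.max() - ramp.min() + 1e-12)
    sign = 1.0 if brighten else -1.0
    gain = 1.0 + sign * intensity * ramp
    if image.ndim == 3:
        gain = gain[..., None]  # same gain for every channel
    max_value = 255.0 if image.dtype == np.uint8 else 1.0
    out = np.clip(image * gain, 0.0, max_value)
    if image.dtype == np.uint8:
        out = np.rint(out)
    return out.astype(image.dtype)

image = np.full((4, 4, 3), 100, dtype=np.uint8)
lit = linear_illumination(image, intensity=0.2, angle_deg=0)
shaded = linear_illumination(image, intensity=0.2, angle_deg=90, brighten=False)
```

Because the pattern is a gain rather than an offset, dark and bright regions keep their relative contrast, which is what "preserve texture" refers to.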

References

Lighting in Computer Vision: https://en.wikipedia.org/wiki/Lighting_in_computer_vision
Image-based lighting: https://en.wikipedia.org/wiki/Image-based_lighting
Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination
Research on lighting augmentation: "Learning Deep Representations of Fine-grained Visual Descriptions" https://arxiv.org/abs/1605.05395
Photography lighting patterns: https://en.wikipedia.org/wiki/Lighting_pattern

ImageCompressionclass

ImageCompression(
    compression_type: Literal['jpeg', 'webp'] = jpeg,
    quality_range: tuple[int, int] = (99, 100),
    p: float = 0.5
)

Decrease image quality by applying JPEG or WebP compression. This transform simulates the effect of saving an image with lower quality settings, which can introduce compression artifacts. It's useful for data augmentation and for testing model robustness against varying image qualities.

Parameters

Name	Type	Default	Description
compression_type	One of: 'jpeg' 'webp'	jpeg	Type of compression to apply. - "jpeg": JPEG compression - "webp": WebP compression Default: "jpeg"
quality_range	tuple[int, int]	(99, 100)	Range for the compression quality. The values should be in [1, 100] range, where: - 1 is the lowest quality (maximum compression) - 100 is the highest quality (minimum compression) Default: (99, 100)
p	float	0.5	Probability of applying the transform. Default: 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ImageCompression(quality_range=(50, 90), compression_type="jpeg", p=1.0)
>>> result = transform(image=image)
>>> compressed_image = result["image"]

Notes

- This transform expects images with 1, 3, or 4 channels.
- For JPEG compression, alpha channels (4th channel) will be ignored.
- WebP compression supports transparency (4 channels).
- The actual file is not saved to disk; the compression is simulated in memory.
- Lower quality values result in smaller file sizes but may introduce visible artifacts.
- This transform can be useful for:
  * Data augmentation to improve model robustness
  * Testing how models perform on images of varying quality
  * Simulating images transmitted over low-bandwidth connections

References

JPEG compression: https://en.wikipedia.org/wiki/JPEG
WebP compression: https://developers.google.com/speed/webp

InterpolationPydanticclass

InterpolationPydantic(
    upscale: Literal,
    downscale: Literal
)

Parameters

Name	Type	Default	Description
upscale	Literal	-	-
downscale	Literal	-	-

InvertImgclass

InvertImg(
    p: float = 0.5
)

Invert the input image by subtracting pixel values from the maximum value of the image dtype, i.e., 255 for uint8 and 1.0 for float32.
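
The inversion rule can be written in a couple of lines. The sketch below is illustrative only; `invert` is a hypothetical helper and it assumes float images are scaled to [0, 1].

```python
import numpy as np

def invert(image):
    """Subtract pixel values from the dtype maximum (hypothetical helper)."""
    max_value = 255 if image.dtype == np.uint8 else 1.0
    return (max_value - image).astype(image.dtype)

inverted_u8 = invert(np.array([0, 100, 255], dtype=np.uint8))
inverted_f32 = invert(np.array([0.0, 0.25, 1.0], dtype=np.float32))
```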

Parameters

Name	Type	Default	Description
p	float	0.5	Probability of applying the transform. Default: 0.5.

Lambdaclass

Lambda(
    image: Callable[..., Any] | None = None,
    mask: Callable[..., Any] | None = None,
    keypoints: Callable[..., Any] | None = None,
    bboxes: Callable[..., Any] | None = None,
    name: str | None = None,
    p: float = 1.0
)

A flexible transformation class for applying user-defined transformation functions to each target. Each function's signature must include **kwargs so that it can accept optional arguments such as the interpolation method or image size.
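
As an illustration of the required signature, the callbacks below accept their target as the first argument and absorb anything else through **kwargs. The function names and the `shape` keyword passed in the demo call are hypothetical; in a pipeline such callbacks would be supplied as `Lambda(image=flip_horizontal, mask=flip_mask)`.

```python
import numpy as np

def flip_horizontal(image, **kwargs):
    """Image callback; **kwargs absorbs any extra arguments passed by the caller."""
    return np.ascontiguousarray(image[:, ::-1])

def flip_mask(mask, **kwargs):
    """Mask callback that mirrors the image callback so both stay aligned."""
    return np.ascontiguousarray(mask[:, ::-1])

image = np.arange(6, dtype=np.uint8).reshape(1, 2, 3)
# The extra keyword argument is accepted and ignored (hypothetical argument name).
flipped = flip_horizontal(image, shape=image.shape)
```

Without **kwargs, a call that forwards additional arguments would fail with a TypeError, which is why the signature requirement exists.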

Parameters

Name	Type	Default	Description
image	One of: Callable[..., Any] None	None	Image transformation function.
mask	One of: Callable[..., Any] None	None	Mask transformation function.
keypoints	One of: Callable[..., Any] None	None	Keypoints transformation function.
bboxes	One of: Callable[..., Any] None	None	BBoxes transformation function.
name	One of: str None	None	-
p	float	1.0	Probability of applying the transform. Default: 1.0.

LaplaceParamsclass

LaplaceParams(
    noise_type: Literal = laplace,
    mean_range: Annotated,
    scale_range: Annotated
)

Parameters

Name	Type	Default	Description
noise_type	Literal	laplace	-
mean_range	Annotated	-	-
scale_range	Annotated	-	-

Morphologicalclass

Morphological(
    scale: tuple[int, int] | int = (2, 3),
    operation: Literal['erosion', 'dilation'] = dilation,
    p: float = 0.5
)

Apply a morphological operation (dilation or erosion) to an image, with particular value for enhancing document scans. Morphological operations modify the structure of the image. Dilation expands the white (foreground) regions in a binary or grayscale image, while erosion shrinks them. These operations are beneficial in document processing, for example:
- Dilation helps in closing up gaps within text or making thin lines thicker, enhancing legibility for OCR (Optical Character Recognition).
- Erosion can remove small white noise and detach connected objects, making the structure of larger objects more pronounced.
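
For readers unfamiliar with the two operations, the following sketch implements dilation and erosion with a square structuring element as a sliding-window maximum and minimum. It is a simplified NumPy illustration for 2-D arrays, not the library's implementation; `morph` is a hypothetical helper and the edge padding is an assumption.

```python
import numpy as np

def morph(image, size, operation="dilation"):
    """Square-kernel dilation or erosion of a 2-D array (hypothetical helper)."""
    reduce = np.max if operation == "dilation" else np.min
    before = size // 2
    after = size - 1 - before
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image, ((before, after), (before, after)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return reduce(windows, axis=(-2, -1))

image = np.zeros((5, 5), dtype=np.uint8)
image[2, 2] = 255                           # a single white pixel
dilated = morph(image, 3, "dilation")       # grows into a 3x3 white block
eroded = morph(dilated, 3, "erosion")       # shrinks back to one pixel
```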

Parameters

Name	Type	Default	Description
scale	One of: tuple[int, int] int	(2, 3)	Specifies the size of the structuring element (kernel) used for the operation. - If an integer is provided, a square kernel of that size will be used. - If a tuple or list is provided, it should contain two integers representing the minimum and maximum sizes for the kernel.
operation	One of: 'erosion' 'dilation'	dilation	The morphological operation to apply. Default is 'dilation'.
p	float	0.5	The probability of applying this transformation. Default is 0.5.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.Morphological(scale=(2, 3), operation='dilation', p=0.5)
... ])
>>> image = transform(image=image)["image"]

References

Nougat: https://github.com/facebookresearch/nougat

MultiplicativeNoiseclass

MultiplicativeNoise(
    multiplier: tuple[float, float] | float = (0.9, 1.1),
    per_channel: bool = False,
    elementwise: bool = False,
    p: float = 0.5
)

Apply multiplicative noise to the input image. This transform multiplies each pixel in the image by a random value or array of values, effectively creating a noise pattern that scales with the image intensity.

Parameters

Name	Type	Default	Description
multiplier	One of: tuple[float, float] float	(0.9, 1.1)	The range for the random multiplier. Defines the range from which the multiplier is sampled. Default: (0.9, 1.1)
per_channel	bool	False	If True, use a different random multiplier for each channel. If False, use the same multiplier for all channels. Setting this to False is slightly faster. Default: False
elementwise	bool	False	If True, generates a unique multiplier for each pixel. If False, generates a single multiplier (or one per channel if per_channel=True). Default: False
p	float	0.5	Probability of applying the transform. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, p=1.0)
>>> result = transform(image=image)
>>> noisy_image = result["image"]

Notes

- When elementwise=False and per_channel=False, a single multiplier is applied to the entire image.
- When elementwise=False and per_channel=True, each channel gets a different multiplier.
- When elementwise=True and per_channel=False, each pixel gets the same multiplier across all channels.
- When elementwise=True and per_channel=True, each pixel in each channel gets a unique multiplier.
- Setting per_channel=False is slightly faster, especially for larger images.
- This transform can be used to simulate various lighting conditions or to create noise that scales with image intensity.
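
The four flag combinations in the notes correspond to four broadcastable multiplier shapes. The sketch below shows one way to express that with NumPy; `sample_multiplier` is a hypothetical helper and uniform sampling is an assumption made for illustration.

```python
import numpy as np

def sample_multiplier(shape, low, high, per_channel, elementwise, rng):
    """Return a multiplier array that broadcasts against an (H, W, C) image."""
    h, w, c = shape
    if elementwise:
        size = (h, w, c) if per_channel else (h, w, 1)
    else:
        size = (1, 1, c) if per_channel else (1, 1, 1)
    return rng.uniform(low, high, size=size).astype(np.float32)

rng = np.random.default_rng(0)
image = np.full((4, 5, 3), 100, dtype=np.uint8)
m = sample_multiplier(image.shape, 0.9, 1.1,
                      per_channel=True, elementwise=False, rng=rng)
# Broadcasting applies one multiplier per channel; round and clip back to uint8.
noisy = np.clip(np.rint(image * m), 0, 255).astype(np.uint8)
```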

References

Multiplicative noise: https://en.wikipedia.org/wiki/Multiplicative_noise

NoiseParamsBaseclass

NoiseParamsBase(
    noise_type: str
)

Base class for all noise parameter models.

Parameters

Name	Type	Default	Description
noise_type	str	-	-

Normalizeclass

Normalize(
    mean: tuple[float, ...] | float | None = (0.485, 0.456, 0.406),
    std: tuple[float, ...] | float | None = (0.229, 0.224, 0.225),
    max_pixel_value: float | None = 255.0,
    normalization: Literal['standard', 'image', 'image_per_channel', 'min_max', 'min_max_per_channel'] = standard,
    p: float = 1.0
)

Applies various normalization techniques to an image. The specific normalization technique can be selected with the `normalization` parameter. Standard normalization is applied using the formula: `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`. Other normalization techniques adjust the image based on global or per-channel statistics, or scale pixel values to a specified range.

Parameters

Name	Type	Default	Description
mean	One of: tuple[float, ...] float None	(0.485, 0.456, 0.406)	Mean values for standard normalization. For "standard" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).
std	One of: tuple[float, ...] float None	(0.229, 0.224, 0.225)	Standard deviation values for standard normalization. For "standard" normalization, the default values are ImageNet standard deviation values: (0.229, 0.224, 0.225).
max_pixel_value	One of: float None	255.0	Maximum possible pixel value, used for scaling in standard normalization. Defaults to 255.0.
normalization	One of: 'standard' 'image' 'image_per_channel' 'min_max' 'min_max_per_channel'	standard	Specifies the normalization technique to apply. Defaults to "standard". - "standard": Applies the formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`. The default mean and std are based on ImageNet. Use mean and std values of (0.5, 0.5, 0.5) for Inception normalization, and mean values of (0, 0, 0) with std values of (1, 1, 1) for YOLO. - "image": Normalizes the whole image based on its global mean and standard deviation. - "image_per_channel": Normalizes the image per channel based on each channel's mean and standard deviation. - "min_max": Scales the image pixel values to a [0, 1] range based on the global minimum and maximum pixel values. - "min_max_per_channel": Scales each channel of the image pixel values to a [0, 1] range based on the per-channel minimum and maximum pixel values.
p	float	1.0	Probability of applying the transform. Defaults to 1.0.

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Standard ImageNet normalization
>>> transform = A.Normalize(
...     mean=(0.485, 0.456, 0.406),
...     std=(0.229, 0.224, 0.225),
...     max_pixel_value=255.0,
...     p=1.0
... )
>>> normalized_image = transform(image=image)["image"]
>>>
>>> # Min-max normalization
>>> transform_minmax = A.Normalize(normalization="min_max", p=1.0)
>>> normalized_image_minmax = transform_minmax(image=image)["image"]

Notes

- For "standard" normalization, `mean`, `std`, and `max_pixel_value` must be provided.
- For other normalization types, these parameters are ignored.
- For Inception normalization, use mean and std values of (0.5, 0.5, 0.5).
- For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).
- This transform is often used as a final step in image preprocessing pipelines to prepare images for neural network input.
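
The "standard" formula and the global min-max variant can be checked by hand with a few lines of NumPy. This is a sketch for illustration: the helper names are hypothetical, and the small `eps` guarding against division by zero is an assumption rather than a documented constant.

```python
import numpy as np

def normalize_standard(img, mean, std, max_pixel_value=255.0):
    """(img - mean * max_pixel_value) / (std * max_pixel_value)."""
    mean = np.asarray(mean, dtype=np.float32) * max_pixel_value
    std = np.asarray(std, dtype=np.float32) * max_pixel_value
    return (img.astype(np.float32) - mean) / std

def normalize_min_max(img, eps=1e-4):
    """Scale to roughly [0, 1] using the global minimum and maximum."""
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + eps)

white = np.full((2, 2, 3), 255, dtype=np.uint8)
black = np.zeros((2, 2, 3), dtype=np.uint8)
# Inception-style settings map uint8 [0, 255] onto [-1, 1].
hi = normalize_standard(white, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
lo = normalize_standard(black, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
scaled = normalize_min_max(np.array([[0, 51, 255]], dtype=np.uint8))
```

With mean (0, 0, 0) and std (1, 1, 1) the same formula reduces to dividing by `max_pixel_value`, which is the YOLO-style scaling to [0, 1].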

References

ImageNet mean and std: https://pytorch.org/vision/stable/models.html
Inception preprocessing: https://keras.io/api/applications/inceptionv3/

PlanckianJitterclass

PlanckianJitter(
    mode: Literal['blackbody', 'cied'] = blackbody,
    temperature_limit: tuple[int, int] | None = None,
    sampling_method: Literal['uniform', 'gaussian'] = uniform,
    p: float = 0.5
)

Applies Planckian Jitter to the input image, simulating color temperature variations in illumination. This transform adjusts the color of an image to mimic the effect of different color temperatures of light sources, based on Planck's law of black body radiation. It can simulate the appearance of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts.

PlanckianJitter vs. ColorJitter: PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases:
1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world color temperature changes. ColorJitter applies arbitrary color adjustments.
2. Natural effects: This transform produces color shifts that correspond to natural lighting variations, making it ideal for outdoor scene simulation or color constancy problems.
3. Single parameter: Color changes are controlled by a single, physically meaningful parameter (color temperature), unlike ColorJitter's multiple abstract parameters.
4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural light, whereas ColorJitter can make independent channel adjustments.

When to use PlanckianJitter:
- Simulating different times of day or lighting conditions in outdoor scenes
- Augmenting data for computer vision tasks that need to be robust to natural lighting changes
- Preparing synthetic data to better match real-world lighting variations
- Color constancy research or applications
- When you need physically plausible color variations rather than arbitrary color changes

The logic behind PlanckianJitter: As the color temperature increases:
1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting.
2. Mid-range temperatures (around 5500K) correspond to daylight.
3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade.

This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.

Parameters

Name	Type	Default	Description
mode	One of: 'blackbody' 'cied'	blackbody	The mode of the transformation. - "blackbody": Simulates blackbody radiation color changes. - "cied": Uses the CIE D illuminant series for color temperature simulation. Default: "blackbody"
temperature_limit	One of: tuple[int, int] None	None	The range of color temperatures (in Kelvin) to sample from. - For "blackbody" mode: Should be within [3000K, 15000K]. Default: (3000, 15000) - For "cied" mode: Should be within [4000K, 15000K]. Default: (4000, 15000) If None, the default ranges will be used based on the selected mode. Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.
sampling_method	One of: 'uniform' 'gaussian'	uniform	Method to sample the temperature. - "uniform": Samples uniformly across the specified range. - "gaussian": Samples from a Gaussian distribution centered at 6500K (approximate daylight). Default: "uniform"
p	float	0.5	Probability of applying the transform. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> transform = A.PlanckianJitter(mode="blackbody",
...                               temperature_limit=(3000, 9000),
...                               sampling_method="uniform",
...                               p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result["image"]

Notes

- The transform preserves the overall brightness of the image while shifting its color.
- The "blackbody" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.
- The "cied" mode is based on standard illuminants and may provide more realistic daylight variations.
- The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.
- Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated across channels, maintaining the natural appearance of the scene under different lighting conditions.

References

Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law
CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant
Color temperature: https://en.wikipedia.org/wiki/Color_temperature
Implementation inspired by: https://github.com/TheZino/PlanckianJitter

PlasmaBrightnessContrastclass

PlasmaBrightnessContrast(
    brightness_range: tuple[float, float] = (-0.3, 0.3),
    contrast_range: tuple[float, float] = (-0.3, 0.3),
    plasma_size: int = 256,
    roughness: float = 3.0,
    p: float = 0.5
)

Apply plasma fractal pattern to modify image brightness and contrast. Uses Diamond-Square algorithm to generate organic-looking fractal patterns that create spatially-varying brightness and contrast adjustments.

Parameters

Name	Type	Default	Description
brightness_range	tuple[float, float]	(-0.3, 0.3)	Range for brightness adjustment strength. Values between -1 and 1: - Positive values increase brightness - Negative values decrease brightness - 0 means no brightness change Default: (-0.3, 0.3)
contrast_range	tuple[float, float]	(-0.3, 0.3)	Range for contrast adjustment strength. Values between -1 and 1: - Positive values increase contrast - Negative values decrease contrast - 0 means no contrast change Default: (-0.3, 0.3)
plasma_size	int	256	Size of the initial plasma pattern grid. Larger values create more detailed patterns but are slower to compute. The pattern will be resized to match the input image dimensions. Default: 256
roughness	float	3.0	Controls how quickly the noise amplitude increases at each iteration. Must be greater than 0: - Low values (< 1.0): Smoother, more gradual pattern - Medium values (~2.0): Natural-looking pattern - High values (> 3.0): Very rough, noisy pattern Default: 3.0
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Works with any number of channels (grayscale, RGB, multispectral)
- The same plasma pattern is applied to all channels
- Operations are performed in float32 precision
- Final values are clipped to valid range [0, max_value]

References

Fournier, Fussell, and Carpenter, "Computer rendering of stochastic models," Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.
Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

PlasmaShadowclass

PlasmaShadow(
    shadow_intensity_range: tuple[float, float] = (0.3, 0.7),
    plasma_size: int = 256,
    roughness: float = 3.0,
    p: float = 0.5
)

Apply plasma-based shadow effect to the image using Diamond-Square algorithm. Creates organic-looking shadows using plasma fractal noise pattern. The shadow intensity varies smoothly across the image, creating natural-looking darkening effects that can simulate shadows, shading, or lighting variations.

Parameters

Name	Type	Default	Description
shadow_intensity_range	tuple[float, float]	(0.3, 0.7)	Range for shadow intensity. Values between 0 and 1: - 0 means no shadow (original image) - 1 means maximum darkening (black) - Values between create partial shadows Default: (0.3, 0.7)
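plasma_size	int	256	Size of the initial plasma pattern grid. Larger values create more detailed patterns but are slower to compute. The pattern will be resized to match the input image dimensions. Default: 256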
roughness	float	3.0	Controls how quickly the noise amplitude increases at each iteration. Must be greater than 0: - Low values (< 1.0): Smoother, more gradual shadows - Medium values (~2.0): Natural-looking shadows - High values (> 3.0): Very rough, noisy shadows Default: 3.0
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The transform darkens the image using a plasma pattern
- Works with any number of channels (grayscale, RGB, multispectral)
- Shadow pattern is generated using Diamond-Square algorithm with specific kernels
- The same shadow pattern is applied to all channels
- Final values are clipped to valid range [0, max_value]

References

Fournier, Fussell, and Carpenter, "Computer rendering of stochastic models," Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.
Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

Posterizeclass

Posterize(
    num_bits: int | tuple[int, int] | list[tuple[int, int]] = 4,
    p: float = 0.5
)

Reduces the number of bits for each color channel in the image. This transform applies color posterization, a technique that reduces the number of distinct colors used in an image. It works by lowering the number of bits used to represent each color channel, effectively creating a "poster-like" effect with fewer color gradations.

Parameters

Name	Type	Default	Description
num_bits	One of: int tuple[int, int] list[tuple[int, int]]	4	Defines the number of bits to keep for each color channel. Can be specified in several ways: - Single int: Same number of bits for all channels. Range: [1, 7]. - tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7]. - list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits]. - list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)]. Default: 4
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The effect becomes more pronounced as the number of bits is reduced.
- This transform can create interesting artistic effects or be used for image compression simulation.
- Posterization is particularly useful for:
  * Creating stylized or retro-looking images
  * Reducing the color palette for specific artistic effects
  * Simulating the look of older or lower-quality digital images
  * Data augmentation in scenarios where color depth might vary
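
Reducing the number of bits per channel can be illustrated with a bit mask that keeps only the most significant bits of each uint8 value. This is a common way to posterize and is shown here as an assumption about the general technique, not as the library's exact code; `posterize` is a hypothetical helper.

```python
import numpy as np

def posterize(image, num_bits):
    """Keep the `num_bits` most significant bits of each uint8 value."""
    mask = np.uint8((0xFF << (8 - num_bits)) & 0xFF)
    return image & mask

pixels = np.array([0, 15, 16, 200, 255], dtype=np.uint8)
four_bits = posterize(pixels, 4)   # 16 levels per channel remain
one_bit = posterize(pixels, 1)     # only 2 levels per channel remain
```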

References

Color Quantization: https://en.wikipedia.org/wiki/Color_quantization
Posterization: https://en.wikipedia.org/wiki/Posterization

RGBShiftclass

RGBShift(
    r_shift_limit: tuple[float, float] | float = (-20, 20),
    g_shift_limit: tuple[float, float] | float = (-20, 20),
    b_shift_limit: tuple[float, float] | float = (-20, 20),
    p: float = 0.5
)

Randomly shift values for each channel of the input RGB image. A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels. Each channel (R,G,B) can have its own shift range specified.

Parameters

Name	Type	Default	Description
r_shift_limit	One of: tuple[float, float] float	(-20, 20)	Range for shifting the red channel. Options: - If tuple (min, max): Sample shift value from this range - If a single number: Sample shift value from (-r_shift_limit, r_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
g_shift_limit	One of: tuple[float, float] float	(-20, 20)	Range for shifting the green channel. Options: - If tuple (min, max): Sample shift value from this range - If a single number: Sample shift value from (-g_shift_limit, g_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
b_shift_limit	One of: tuple[float, float] float	(-20, 20)	Range for shifting the blue channel. Options: - If tuple (min, max): Sample shift value from this range - If a single number: Sample shift value from (-b_shift_limit, b_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Values are shifted independently for each channel
- For uint8 images:
  * Input ranges like (-20, 20) represent pixel value shifts
  * A shift of 20 means adding 20 to that channel
  * Final values are clipped to [0, 255]
- For float32 images:
  * Input ranges like (-0.1, 0.1) represent relative shifts
  * A shift of 0.1 means adding 0.1 to that channel
  * Final values are clipped to [0, 1]
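
The add-then-clip behaviour described in the notes can be sketched as follows. `shift_rgb` is a hypothetical helper that takes already-sampled shift values; it illustrates the arithmetic only and is not the library's implementation.

```python
import numpy as np

def shift_rgb(image, r_shift, g_shift, b_shift):
    """Add one constant per channel, then clip to the valid range."""
    shifts = np.array([r_shift, g_shift, b_shift], dtype=np.float32)
    max_value = 255.0 if image.dtype == np.uint8 else 1.0
    out = image.astype(np.float32) + shifts   # broadcasts over H and W
    return np.clip(out, 0.0, max_value).astype(image.dtype)

pixel_u8 = np.array([[[250, 100, 5]]], dtype=np.uint8)
shifted_u8 = shift_rgb(pixel_u8, 20, -20, -20)

pixel_f32 = np.array([[[0.95, 0.5, 0.05]]], dtype=np.float32)
shifted_f32 = shift_rgb(pixel_f32, 0.1, 0.0, -0.1)
```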

RandomBrightnessContrastclass

RandomBrightnessContrast(
    brightness_limit: tuple[float, float] | float = (-0.2, 0.2),
    contrast_limit: tuple[float, float] | float = (-0.2, 0.2),
    brightness_by_max: bool = True,
    ensure_safe_range: bool = False,
    p: float = 0.5
)

Randomly changes the brightness and contrast of the input image. This transform adjusts the brightness and contrast of an image simultaneously, allowing for a wide range of lighting and contrast variations. It's particularly useful for data augmentation in computer vision tasks, helping models become more robust to different lighting conditions.

Parameters

Name	Type	Default	Description
brightness_limit	One of: tuple[float, float] float	(-0.2, 0.2)	Factor range for changing brightness. If a single float value is provided, the range will be (-brightness_limit, brightness_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum brightness, and -1.0 means minimum brightness. Default: (-0.2, 0.2).
contrast_limit	One of: tuple[float, float] float	(-0.2, 0.2)	Factor range for changing contrast. If a single float value is provided, the range will be (-contrast_limit, contrast_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast. Default: (-0.2, 0.2).
brightness_by_max	bool	True	If True, adjusts brightness by scaling pixel values up to the maximum value of the image's dtype. If False, uses the mean pixel value for adjustment. Default: True.
ensure_safe_range	bool	False	If True, adjusts alpha and beta to prevent overflow/underflow. This ensures output values stay within the valid range for the image dtype without clipping. Default: False.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The order of operation is: contrast adjustment, then brightness adjustment.
- For uint8 images, the output is clipped to [0, 255] range.
- For float32 images, the output is clipped to [0, 1] range.
- The `brightness_by_max` parameter affects how brightness is adjusted:
  * If True, brightness adjustment is more pronounced and can lead to more saturated results.
  * If False, brightness adjustment is more subtle and preserves the overall lighting better.
- This transform is useful for:
  * Simulating different lighting conditions
  * Enhancing low-light or overexposed images
  * Data augmentation to improve model robustness
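
A rough sketch of the order of operations and of the `brightness_by_max` switch is given below. It assumes a contrast gain of `alpha = 1 + contrast` and a brightness offset of `beta = brightness` scaled by either the dtype maximum or the image mean; these names and the exact formula are assumptions for illustration, and `brightness_contrast` is a hypothetical helper.

```python
import numpy as np

def brightness_contrast(image, brightness, contrast, brightness_by_max=True):
    """Apply contrast first, then brightness, then clip (hypothetical helper)."""
    max_value = 255.0 if image.dtype == np.uint8 else 1.0
    alpha = 1.0 + contrast    # contrast gain
    beta = brightness         # brightness offset factor
    out = image.astype(np.float32) * alpha
    # Scale the offset by the dtype maximum or by the image mean.
    reference = max_value if brightness_by_max else float(image.mean())
    out = np.clip(out + beta * reference, 0.0, max_value)
    if image.dtype == np.uint8:
        out = np.rint(out)
    return out.astype(image.dtype)

image = np.full((2, 2, 3), 100, dtype=np.uint8)
by_max = brightness_contrast(image, brightness=0.2, contrast=0.0)
by_mean = brightness_contrast(image, brightness=0.2, contrast=0.0,
                              brightness_by_max=False)
```

For this mid-gray image the same brightness factor adds 51 when scaled by the dtype maximum but only 20 when scaled by the mean, which matches the "more pronounced" versus "more subtle" description.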

References

Brightness: https://en.wikipedia.org/wiki/Brightness
Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)

RandomFogclass

RandomFog(
    alpha_coef: float = 0.08,
    fog_coef_range: tuple[float, float] = (0.3, 1),
    p: float = 0.5
)

Simulates fog for the image by adding random fog-like artifacts. This transform creates a fog effect by generating semi-transparent overlays that mimic the visual characteristics of fog. The fog intensity and distribution can be controlled to create various fog-like conditions.

Parameters

Name	Type	Default	Description
alpha_coef	float	0.08	Transparency of the fog circles. Should be in [0, 1] range. Default: 0.08.
fog_coef_range	tuple[float, float]	(0.3, 1)	Range for fog intensity coefficient. Both values should be in the [0, 1] range. Default: (0.3, 1).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The fog effect is created by overlaying semi-transparent circles on the image.
- Higher fog coefficient values result in denser fog effects.
- The fog is typically denser in the center of the image and gradually decreases towards the edges.
- This transform is useful for:
  * Simulating various weather conditions in outdoor scenes
  * Data augmentation for improving model robustness to foggy conditions
  * Creating atmospheric effects in image editing

References

Fog: https://en.wikipedia.org/wiki/Fog
Atmospheric perspective: https://en.wikipedia.org/wiki/Aerial_perspective

RandomGammaclass

RandomGamma(
    gamma_limit: tuple[float, float] | float = (80, 120),
    p: float = 0.5
)

Applies random gamma correction to the input image. Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance or tristimulus values in imaging systems. This transform can adjust the brightness of an image while preserving the relative differences between darker and lighter areas, making it useful for simulating different lighting conditions or correcting for display characteristics.

Parameters

Name	Type	Default	Description
gamma_limit	One of: tuple[float, float] float	(80, 120)	If gamma_limit is a single float value, the range will be (1, gamma_limit). If it's a tuple of two floats, they will serve as the lower and upper bounds for gamma adjustment. Values are in terms of percentage change, e.g., (80, 120) means the gamma will be between 80% and 120% of the original. Default: (80, 120).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The gamma correction is applied using the formula: output = input^gamma
- Gamma values > 1 will make the image darker, while values < 1 will make it brighter
- This transform is particularly useful for:
  * Simulating different lighting conditions
  * Correcting for non-linear display characteristics
  * Enhancing contrast in certain regions of the image
  * Data augmentation in computer vision tasks
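The formula above can be sketched in NumPy. This sketch assumes that a value drawn from `gamma_limit` is divided by 100 to obtain gamma (so (80, 120) maps to gamma in [0.8, 1.2]) and that uint8 inputs are normalized to [0, 1] before the power is applied; `apply_gamma` is a hypothetical helper, not the library's code.

```python
import numpy as np


def apply_gamma(image, gamma_percent):
    """Illustrative gamma correction: output = input ** gamma on a [0, 1] scale."""
    gamma = gamma_percent / 100.0  # assumed mapping from percentage to exponent
    if image.dtype == np.uint8:
        normalized = image.astype(np.float32) / 255.0
        return np.rint(np.power(normalized, gamma) * 255.0).astype(np.uint8)
    # Float images are assumed to be in [0, 1] already.
    return np.power(image, gamma).astype(image.dtype)


gray = np.full((4, 4), 128, dtype=np.uint8)
darker = apply_gamma(gray, 120)    # gamma = 1.2 darkens mid-tones
brighter = apply_gamma(gray, 80)   # gamma = 0.8 brightens mid-tones
```

Pure black and pure white are fixed points of the power function, so only the mid-tones move.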

References

Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction
Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm

RandomGravelclass

RandomGravel(
    gravel_roi: tuple[float, float, float, float] = (0.1, 0.4, 0.9, 0.9),
    number_of_patches: int = 2,
    p: float = 0.5
)

Adds gravel-like artifacts to the input image. This transform simulates the appearance of gravel or small stones scattered across specific regions of an image. It's particularly useful for augmenting datasets of road or terrain images, adding realistic texture variations.

Parameters

Name	Type	Default	Description
gravel_roi	tuple[float, float, float, float]	(0.1, 0.4, 0.9, 0.9)	Region of interest where gravel will be added, specified as (x_min, y_min, x_max, y_max) in relative coordinates [0, 1]. Default: (0.1, 0.4, 0.9, 0.9).
number_of_patches	int	2	Number of gravel patch regions to generate within the ROI. Each patch will contain multiple gravel particles. Default: 2.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The gravel effect is created by modifying the saturation channel in the HLS color space.
- Gravel particles are distributed within randomly generated patches inside the specified ROI.
- This transform is particularly useful for:
  * Augmenting datasets for road condition analysis
  * Simulating variations in terrain for computer vision tasks
  * Adding realistic texture to synthetic images of outdoor scenes
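Relative ROIs such as `gravel_roi` are independent of image size. The sketch below shows one way to map an (x_min, y_min, x_max, y_max) tuple to pixel coordinates; `roi_to_pixels` is a hypothetical helper, and the rounding rule is an assumption rather than the library's exact behavior.

```python
def roi_to_pixels(roi, height, width):
    """Convert a relative (x_min, y_min, x_max, y_max) ROI to pixel coordinates."""
    x_min, y_min, x_max, y_max = roi
    if not (0 <= x_min < x_max <= 1 and 0 <= y_min < y_max <= 1):
        raise ValueError("ROI must satisfy 0 <= min < max <= 1 on both axes")
    # x values scale with the width, y values with the height.
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


print(roi_to_pixels((0.1, 0.4, 0.9, 0.9), height=480, width=640))  # (64, 192, 576, 432)
```

The default ROI therefore covers the lower-middle part of the frame, which is where a road surface usually appears in driving footage.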

References

Road surface textures: https://en.wikipedia.org/wiki/Road_surface
HLS color space: https://en.wikipedia.org/wiki/HSL_and_HSV

RandomRainclass

RandomRain(
    slant_range: tuple[float, float] = (-10, 10),
    drop_length: int | None = None,
    drop_width: int = 1,
    drop_color: tuple[int, int, int] = (200, 200, 200),
    blur_value: int = 7,
    brightness_coefficient: float = 0.7,
    rain_type: Literal['drizzle', 'heavy', 'torrential', 'default'] = default,
    p: float = 0.5
)

Adds rain effects to an image. This transform simulates rainfall by overlaying semi-transparent streaks onto the image, creating a realistic rain effect. It can be used to augment datasets for computer vision tasks that need to perform well in rainy conditions.

Parameters

Name	Type	Default	Description
slant_range	tuple[float, float]	(-10, 10)	Range for the rain slant angle in degrees. Negative values slant to the left, positive to the right. Default: (-10, 10).
drop_length	One of: int None	None	Length of the rain drops in pixels. If None, drop length will be automatically calculated as height // 8. This allows the rain effect to scale with the image size. Default: None
drop_width	int	1	Width of the rain drops in pixels. Default: 1.
drop_color	tuple[int, int, int]	(200, 200, 200)	Color of the rain drops in RGB format. Default: (200, 200, 200).
blur_value	int	7	Blur value for simulating rain effect. Rainy views are typically blurry. Default: 7.
brightness_coefficient	float	0.7	Coefficient to adjust the brightness of the image. Rainy scenes are usually darker. Should be in the range (0, 1]. Default: 0.7.
rain_type	One of: 'drizzle' 'heavy' 'torrential' 'default'	default	Type of rain to simulate. Default: "default".
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The rain effect is created by drawing semi-transparent lines on the image.
- The slant of the rain can be controlled to simulate wind effects.
- Different rain types (drizzle, heavy, torrential) adjust the density and appearance of the rain.
- The transform also adjusts image brightness and applies a blur to simulate the visual effects of rain.
- This transform is particularly useful for:
  * Augmenting datasets for autonomous driving in rainy conditions
  * Testing the robustness of computer vision models to weather effects
  * Creating realistic rainy scenes for image editing or film production

References

Rain visualization techniques: https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-27-real-time-rain-rendering
Weather effects in computer vision: https://www.sciencedirect.com/science/article/pii/S1077314220300692

RandomShadowclass

RandomShadow(
    shadow_roi: tuple[float, float, float, float] = (0, 0.5, 1, 1),
    num_shadows_limit: tuple[int, int] = (1, 2),
    shadow_dimension: int = 5,
    shadow_intensity_range: tuple[float, float] = (0.5, 0.5),
    p: float = 0.5
)

Simulates shadows for the image by reducing the brightness of the image in shadow regions. This transform adds realistic shadow effects to images, which can be useful for augmenting datasets for outdoor scene analysis, autonomous driving, or any computer vision task where shadows may be present.

Parameters

Name	Type	Default	Description
shadow_roi	tuple[float, float, float, float]	(0, 0.5, 1, 1)	Region of the image where shadows will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1]. Default: (0, 0.5, 1, 1).
num_shadows_limit	tuple[int, int]	(1, 2)	Lower and upper limits for the possible number of shadows. Default: (1, 2).
shadow_dimension	int	5	Number of edges in the shadow polygons. Default: 5.
shadow_intensity_range	tuple[float, float]	(0.5, 0.5)	Range for the shadow intensity. Larger value means darker shadow. Should be two float values between 0 and 1. Default: (0.5, 0.5).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Shadows are created by generating random polygons within the specified ROI and reducing the brightness of the image in these areas.
- The number of shadows, their shapes, and intensities can be randomized for variety.
- This transform is particularly useful for:
  * Augmenting datasets for outdoor scene understanding
  * Improving robustness of object detection models to shadowed conditions
  * Simulating different lighting conditions in synthetic datasets

References

Shadow detection and removal: https://www.sciencedirect.com/science/article/pii/S1047320315002035
Shadows in computer vision: https://en.wikipedia.org/wiki/Shadow_detection

RandomSnowclass

RandomSnow(
    brightness_coeff: float = 2.5,
    snow_point_range: tuple[float, float] = (0.1, 0.3),
    method: Literal['bleach', 'texture'] = bleach,
    p: float = 0.5
)

Applies a random snow effect to the input image. This transform simulates snowfall by either bleaching out some pixel values or adding a snow texture to the image, depending on the chosen method.

Parameters

Name	Type	Default	Description
brightness_coeff	float	2.5	Coefficient applied to increase the brightness of pixels below the snow_point threshold. Larger values lead to more pronounced snow effects. Should be > 0. Default: 2.5.
snow_point_range	tuple[float, float]	(0.1, 0.3)	Range for the snow point threshold. Both values should be in the (0, 1) range. Default: (0.1, 0.3).
method	One of: 'bleach' 'texture'	bleach	The snow simulation method to use. Options are: - "bleach": Uses a simple pixel value thresholding technique. - "texture": Applies a more realistic snow texture overlay. Default: "bleach".
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The "bleach" method increases the brightness of pixels below a certain threshold (the snow point), creating a simple snow effect. This method is faster but may look less realistic.
- The "texture" method creates a more realistic snow effect through the following steps:
  1. Converts the image to HSV color space for better control over brightness.
  2. Increases overall image brightness to simulate the reflective nature of snow.
  3. Generates a snow texture using Gaussian noise, which is then smoothed with a Gaussian filter.
  4. Applies a depth effect to the snow texture, making it more prominent at the top of the image.
  5. Blends the snow texture with the original image using alpha compositing.
  6. Adds a slight blue tint to simulate the cool color of snow.
  7. Adds random sparkle effects to simulate light reflecting off snow crystals.
  This method produces a more realistic result but is computationally more expensive.

References

Bleach method: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Texture method: Inspired by computer graphics techniques for snow rendering and atmospheric scattering simulations.

RandomSunFlareclass

RandomSunFlare(
    flare_roi: tuple[float, float, float, float] = (0, 0, 1, 0.5),
    src_radius: int = 400,
    src_color: tuple[int, ...] = (255, 255, 255),
    angle_range: tuple[float, float] = (0, 1),
    num_flare_circles_range: tuple[int, int] = (6, 10),
    method: Literal['overlay', 'physics_based'] = overlay,
    p: float = 0.5
)

Simulates a sun flare effect on the image by adding circles of light. This transform creates a sun flare effect by overlaying multiple semi-transparent circles of varying sizes and intensities along a line originating from a "sun" point. It offers two methods: a simple overlay technique and a more complex physics-based approach.

Parameters

Name	Type	Default	Description
flare_roi	tuple[float, float, float, float]	(0, 0, 1, 0.5)	Region of interest where the sun flare can appear. Values are in the range [0, 1] and represent (x_min, y_min, x_max, y_max) in relative coordinates. Default: (0, 0, 1, 0.5).
src_radius	int	400	Radius of the sun circle in pixels. Default: 400.
src_color	tuple[int, ...]	(255, 255, 255)	Color of the sun in RGB format. Default: (255, 255, 255).
angle_range	tuple[float, float]	(0, 1)	Range for the flare direction, expressed as a fraction of a full turn rather than directly in radians. Values should be in the range [0, 1], where 0 represents 0 radians and 1 represents 2π radians. Default: (0, 1).
num_flare_circles_range	tuple[int, int]	(6, 10)	Range for the number of flare circles to generate. Default: (6, 10).
method	One of: 'overlay' 'physics_based'	overlay	Method to use for generating the sun flare. "overlay" uses a simple alpha blending technique, while "physics_based" simulates more realistic optical phenomena. Default: "overlay".
p	float	0.5	Probability of applying the transform. Default: 0.5.

References

Lens flare: https://en.wikipedia.org/wiki/Lens_flare
Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
Diffraction: https://en.wikipedia.org/wiki/Diffraction
Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen

RandomToneCurveclass

RandomToneCurve(
    scale: float = 0.1,
    per_channel: bool = False,
    p: float = 0.5
)

Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve. This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast in a non-linear manner. It can be applied to the entire image or to each channel separately.

Parameters

Name	Type	Default	Description
scale	float	0.1	Standard deviation of the normal distribution used to sample random distances to move two control points that modify the image's curve. Values should be in range [0, 1]. Higher values will result in more dramatic changes to the image. Default: 0.1
per_channel	bool	False	If True, the tone curve will be applied to each channel of the input image separately, which can lead to color distortion. If False, the same curve is applied to all channels, preserving the original color relationships. Default: False
p	float	0.5	Probability of applying the transform. Default: 0.5

Notes

- This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.
- The S-curve is defined by moving two control points of a quadratic Bézier curve.
- When per_channel is False, the same curve is applied to all channels, maintaining color balance.
- When per_channel is True, different curves are applied to each channel, which can create color shifts.
- This transform can be used to adjust image contrast and brightness in a more natural way than linear transforms.
- The effect can range from subtle contrast adjustments to more dramatic "vintage" or "faded" looks.

References

"What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance": https://arxiv.org/abs/1912.06960
Bézier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves
Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping

RingingOvershootclass

RingingOvershoot(
    blur_limit: tuple[int, int] | int = (7, 15),
    cutoff: tuple[float, float] = (0.7853981633974483, 1.5707963267948966),
    p: float = 0.5
)

Create ringing or overshoot artifacts by convolving the image with a 2D sinc filter. This transform simulates the ringing artifacts that can occur in digital image processing, particularly after sharpening or edge enhancement operations. It creates oscillations or overshoots near sharp transitions in the image.

Parameters

Name	Type	Default	Description
blur_limit	One of: tuple[int, int] int	(7, 15)	Maximum kernel size for the sinc filter. Must be an odd number in the range [3, inf). If a single int is provided, the kernel size will be randomly chosen from the range (3, blur_limit). If a tuple (min, max) is provided, the kernel size will be randomly chosen from the range (min, max). Default: (7, 15).
cutoff	tuple[float, float]	(0.7853981633974483, 1.5707963267948966)	Range to choose the cutoff frequency in radians. Values should be in the range (0, π). A lower cutoff frequency will result in more pronounced ringing effects. Default: (π/4, π/2).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Ringing artifacts are oscillations of the image intensity function in the neighborhood of sharp transitions, such as edges or object boundaries.
- This transform uses a 2D sinc filter (also known as a 2D cardinal sine function) to introduce these artifacts.
- The severity of the ringing effect is controlled by both the kernel size (blur_limit) and the cutoff frequency.
- Larger kernel sizes and lower cutoff frequencies will generally produce more noticeable ringing effects.
- This transform can be useful for:
  * Simulating imperfections in image processing or transmission systems
  * Testing the robustness of computer vision models to ringing artifacts
  * Creating artistic effects that emphasize edges and transitions in images

References

Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts
Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter
Digital Image Processing: Rafael C. Gonzalez and Richard E. Woods, 4th Edition

SaltAndPepperclass

SaltAndPepper(
    amount: tuple[float, float] = (0.01, 0.06),
    salt_vs_pepper: tuple[float, float] = (0.4, 0.6),
    p: float = 0.5
)

Apply salt and pepper noise to the input image. Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt) or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled. The same noise mask is applied to all channels of the image to preserve color consistency.

Parameters

Name	Type	Default	Description
amount	tuple[float, float]	(0.01, 0.06)	Range for total amount of noise (both salt and pepper). Values between 0 and 1. For example: - 0.05 means 5% of all pixels will be replaced with noise - (0.01, 0.06) will sample amount uniformly from 1% to 6% Default: (0.01, 0.06)
salt_vs_pepper	tuple[float, float]	(0.4, 0.6)	Range for ratio of salt (white) vs pepper (black) noise. Values between 0 and 1. For example: - 0.5 means equal amounts of salt and pepper - 0.7 means 70% of noisy pixels will be salt, 30% pepper - (0.4, 0.6) will sample ratio uniformly from 40% to 60% Default: (0.4, 0.6)
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Salt noise sets pixels to maximum value (255 for uint8, 1.0 for float32)
- Pepper noise sets pixels to 0
- The noise mask is generated once and applied to all channels to maintain color consistency (i.e., if a pixel is set to salt, all its color channels will be set to maximum value)
- The exact number of affected pixels matches the specified amount as masks are generated without overlap
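The non-overlapping masks described above can be sketched with NumPy by drawing distinct pixel positions once and splitting them between salt and pepper. `salt_and_pepper` is a hypothetical helper for illustration, and the rounding of pixel counts is an assumption; the library's own sampling may differ.

```python
import numpy as np


def salt_and_pepper(image, amount, salt_ratio, rng):
    """Illustrative sketch: set a fraction of pixels to max (salt) or 0 (pepper)."""
    height, width = image.shape[:2]
    total = height * width
    num_noisy = int(round(amount * total))
    num_salt = int(round(num_noisy * salt_ratio))
    # Distinct positions guarantee that salt and pepper never overlap.
    positions = rng.choice(total, size=num_noisy, replace=False)
    salt_rows, salt_cols = np.unravel_index(positions[:num_salt], (height, width))
    pepper_rows, pepper_cols = np.unravel_index(positions[num_salt:], (height, width))
    max_value = 255 if image.dtype == np.uint8 else 1.0
    out = image.copy()
    # Indexing only the spatial axes applies the same mask to every channel.
    out[salt_rows, salt_cols] = max_value
    out[pepper_rows, pepper_cols] = 0
    return out


rng = np.random.default_rng(0)
image = np.full((100, 100, 3), 128, dtype=np.uint8)
noisy = salt_and_pepper(image, amount=0.05, salt_ratio=0.5, rng=rng)
```

On this 100x100 image, `amount=0.05` with an even split yields 250 salt pixels and 250 pepper pixels, each affecting all three channels.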

References

Digital Image Processing: Rafael C. Gonzalez and Richard E. Woods, 4th Edition, Chapter 5: Image Restoration and Reconstruction.
Fundamentals of Digital Image Processing: A. K. Jain, Chapter 7: Image Degradation and Restoration.
Salt and pepper noise: https://en.wikipedia.org/wiki/Salt-and-pepper_noise

Sharpenclass

Sharpen(
    alpha: tuple[float, float] = (0.2, 0.5),
    lightness: tuple[float, float] = (0.5, 1.0),
    method: Literal['kernel', 'gaussian'] = kernel,
    kernel_size: int = 5,
    sigma: float = 1.0,
    p: float = 0.5
)

Sharpen the input image using either a kernel-based or a Gaussian interpolation method. Two approaches to image sharpening are implemented:

1. Traditional kernel-based method using the Laplacian operator
2. Gaussian interpolation method (similar to Kornia's approach)

Parameters

Name	Type	Default	Description
alpha	tuple[float, float]	(0.2, 0.5)	Range for the visibility of sharpening effect. At 0, only the original image is visible, at 1.0 only its processed version is visible. Values should be in the range [0, 1]. Used in both methods. Default: (0.2, 0.5).
lightness	tuple[float, float]	(0.5, 1.0)	Range for the lightness of the sharpened image. Only used in 'kernel' method. Larger values create higher contrast. Values should be greater than 0. Default: (0.5, 1.0).
method	One of: 'kernel' 'gaussian'	kernel	Sharpening algorithm to use: - 'kernel': Traditional kernel-based sharpening using Laplacian operator - 'gaussian': Interpolation between Gaussian blurred and original image Default: 'kernel'
kernel_size	int	5	Size of the Gaussian blur kernel for 'gaussian' method. Must be odd. Default: 5
sigma	float	1.0	Standard deviation for Gaussian kernel in 'gaussian' method. Default: 1.0
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- Kernel sizes must be odd to maintain spatial alignment
- Methods produce different visual results:
  * Kernel method: More pronounced edges, possible artifacts
  * Gaussian method: More natural look, limited to original sharpness
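To make the roles of `alpha` and `lightness` in the 'kernel' method concrete, the sketch below builds a 3x3 sharpening kernel by blending an identity kernel with a Laplacian-style kernel. This follows a common formulation (also used by imgaug); whether it matches this library's kernel exactly is an assumption, and `sharpening_kernel` is a hypothetical helper.

```python
import numpy as np


def sharpening_kernel(alpha, lightness):
    """Blend an identity kernel with a Laplacian-style sharpening kernel."""
    identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)
    effect = np.array(
        [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]],
        dtype=np.float32,
    )
    # alpha = 0 keeps the original image; alpha = 1 uses only the sharpened version.
    return (1 - alpha) * identity + alpha * effect


kernel = sharpening_kernel(alpha=0.5, lightness=1.0)
```

In this formulation the kernel sums to `(1 - alpha) + alpha * lightness`, so `lightness = 1.0` preserves overall brightness in flat regions while lower values darken the sharpened component.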

References

R. C. Gonzalez and R. E. Woods, "Digital Image Processing (4th Edition)", Chapter 3: Intensity Transformations and Spatial Filtering.
J. C. Russ, "The Image Processing Handbook (7th Edition)", Chapter 4: Image Enhancement.
T. Acharya and A. K. Ray, "Image Processing: Principles and Applications", Chapter 5: Image Enhancement.
Unsharp masking: https://en.wikipedia.org/wiki/Unsharp_masking
Laplacian operator: https://en.wikipedia.org/wiki/Laplace_operator
Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur

ShotNoiseclass

ShotNoise(
    scale_range: tuple[float, float] = (0.1, 0.3),
    p: float = 0.5
)

Apply shot noise to the image by modeling photon counting as a Poisson process. Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light. When photons hit an imaging sensor, they arrive at random times following Poisson statistics. This transform simulates this physical process in linear light space by:

1. Converting to linear space (removing gamma)
2. Treating each pixel value as an expected photon count
3. Sampling actual photon counts from a Poisson distribution
4. Converting back to display space (reapplying gamma)

The noise characteristics follow real camera behavior:

- Noise variance equals signal mean in linear space (Poisson statistics)
- Brighter regions have more absolute noise but less relative noise
- Darker regions have less absolute noise but more relative noise
- Noise is generated independently for each pixel and color channel

Parameters

Name	Type	Default	Description
scale_range	tuple[float, float]	(0.1, 0.3)	Range for sampling the noise scale factor. Represents the reciprocal of the expected photon count per unit intensity. Higher values mean more noise: - scale = 0.1: ~10 photons per unit intensity (low noise) - scale = 1.0: ~1 photon per unit intensity (moderate noise) - scale = 10.0: ~0.1 photons per unit intensity (high noise) Default: (0.1, 0.3)
p	float	0.5	Probability of applying the transform. Default: 0.5

Example

>>> import numpy as np
>>> import albumentations as A
>>> # Generate synthetic image
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> # Apply moderate shot noise
>>> transform = A.ShotNoise(scale_range=(0.1, 1.0), p=1.0)
>>> noisy_image = transform(image=image)["image"]

Notes

- Performs calculations in linear light space (gamma = 2.2)
- Preserves the image's mean intensity
- Memory efficient with in-place operations
- Thread-safe with independent random seeds
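The four steps listed in the description can be sketched in NumPy as follows. This is a simplified model for uint8 input, assuming a plain power-law gamma of 2.2 and an expected photon count of `linear / scale`; `shot_noise` is a hypothetical helper and not the library's implementation.

```python
import numpy as np


def shot_noise(image, scale, rng, gamma=2.2):
    """Illustrative Poisson shot noise for a uint8 image."""
    # 1. Convert to linear light (remove gamma).
    linear = np.power(image.astype(np.float64) / 255.0, gamma)
    # 2. Treat each value as an expected photon count (1 / scale photons per unit).
    expected_photons = linear / scale
    # 3. Sample actual counts independently per pixel and channel.
    photons = rng.poisson(expected_photons)
    noisy_linear = np.clip(photons * scale, 0.0, 1.0)
    # 4. Convert back to display space (reapply gamma).
    display = np.power(noisy_linear, 1.0 / gamma)
    return np.rint(display * 255.0).astype(np.uint8)


rng = np.random.default_rng(0)
image = np.full((16, 16, 3), 200, dtype=np.uint8)
noisy = shot_noise(image, scale=0.1, rng=rng)
```

Because the Poisson variance equals its mean, a smaller `scale` (more photons per unit intensity) gives a result closer to the input, while black pixels stay black at any scale.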

References

Shot noise: https://en.wikipedia.org/wiki/Shot_noise
Original paper: https://doi.org/10.1002/andp.19183622304 (Schottky, 1918)
Poisson process: https://en.wikipedia.org/wiki/Poisson_point_process
Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction

Solarizeclass

Solarize(
    threshold_range: tuple[float, float] = (0.5, 0.5),
    p: float = 0.5
)

Invert all pixel values above a threshold. This transform applies a solarization effect to the input image. Solarization is a phenomenon in photography in which the image recorded on a negative or on a photographic print is wholly or partially reversed in tone. Dark areas appear light or light areas appear dark. In this implementation, all pixel values above a threshold are inverted.

Parameters

Name	Type	Default	Description
threshold_range	tuple[float, float]	(0.5, 0.5)	Range for solarizing threshold as a fraction of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the maximum value of the image type (255 for uint8 images or 1.0 for float images). Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value
- For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value
- The threshold is applied to each channel independently
- The threshold is calculated in two steps:
  1. Sample a value from threshold_range
  2. Multiply by the image's maximum value:
     * For uint8: threshold = sampled_value * 255
     * For float32: threshold = sampled_value * 1.0
- This transform can create interesting artistic effects or be used for data augmentation
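The thresholding rule in the notes above can be written as a short NumPy sketch. `solarize` here is a hypothetical stand-alone function that takes an already sampled threshold fraction; it is meant to illustrate the arithmetic, not to reproduce the library's code.

```python
import numpy as np


def solarize(image, threshold_fraction):
    """Invert every value strictly above threshold_fraction * max_value."""
    max_value = 255 if image.dtype == np.uint8 else 1.0
    threshold = threshold_fraction * max_value  # e.g. 0.5 -> 127.5 for uint8
    # The comparison is element-wise, so each channel is handled independently.
    return np.where(image > threshold, max_value - image, image).astype(image.dtype)


pixels = np.array([0, 100, 127, 128, 200, 255], dtype=np.uint8)
print(solarize(pixels, 0.5))  # [  0 100 127 127  55   0]
```

Values at or below the threshold pass through unchanged, which is why 127 is kept while 128 becomes 127.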

Spatterclass

Spatter(
    mean: tuple[float, float] | float = (0.65, 0.65),
    std: tuple[float, float] | float = (0.3, 0.3),
    gauss_sigma: tuple[float, float] | float = (2, 2),
    cutout_threshold: tuple[float, float] | float = (0.68, 0.68),
    intensity: tuple[float, float] | float = (0.6, 0.6),
    mode: Literal['rain', 'mud'] = rain,
    color: tuple[int, ...] | None = None,
    p: float = 0.5
)

Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.

Parameters

Name	Type	Default	Description
mean	One of: tuple[float, float] float	(0.65, 0.65)	Mean value of the normal distribution used to generate the liquid layer. If a single float, the mean is sampled from `(0, mean)`. If a tuple of floats, the mean is sampled from the range `(mean[0], mean[1])`. For a constant value, use `(mean, mean)`. Default: (0.65, 0.65).
std	One of: tuple[float, float] float	(0.3, 0.3)	Standard deviation of the normal distribution used to generate the liquid layer. If a single float, the value is sampled from `(0, std)`. If a tuple of floats, the value is sampled from the range `(std[0], std[1])`. For a constant value, use `(std, std)`. Default: (0.3, 0.3).
gauss_sigma	One of: tuple[float, float] float	(2, 2)	Sigma value for Gaussian filtering of the liquid layer. If a single float, the value is sampled from `(0, gauss_sigma)`. If a tuple of floats, the value is sampled from the range `(gauss_sigma[0], gauss_sigma[1])`. For a constant value, use `(gauss_sigma, gauss_sigma)`. Default: (2, 2).
cutout_threshold	One of: tuple[float, float] float	(0.68, 0.68)	Threshold for filtering the liquid layer (determines the number of drops). If a single float, the value is sampled from `(0, cutout_threshold)`. If a tuple of floats, the value is sampled from the range `(cutout_threshold[0], cutout_threshold[1])`. For a constant value, use `(cutout_threshold, cutout_threshold)`. Default: (0.68, 0.68).
intensity	One of: tuple[float, float] float	(0.6, 0.6)	Intensity of the corruption. If a single float, the value is sampled from `(0, intensity)`. If a tuple of floats, the value is sampled from the range `(intensity[0], intensity[1])`. For a constant value, use `(intensity, intensity)`. Default: (0.6, 0.6).
mode	One of: 'rain' 'mud'	rain	Type of corruption. Default: "rain".
color	One of: tuple[int, ...] None	None	Color of the corruption elements. If a tuple is provided, it is used as the color for the effect. If None, a default color based on mode is used (rain: (238, 238, 175), mud: (20, 42, 63)). Default: None.
p	float	0.5	Probability of applying the transform. Default: 0.5.

References

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations: https://arxiv.org/abs/1903.12261

Superpixelsclass

Superpixels(
    p_replace: tuple[float, float] | float = (0, 0.1),
    n_segments: tuple[int, int] | int = (100, 100),
    max_size: int | None = 128,
    interpolation: Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT] = 1,
    p: float = 0.5
)

Transform images partially/completely to their superpixel representation.

Parameters

Name	Type	Default	Description
p_replace	One of: tuple[float, float] float	(0, 0.1)	Defines for any segment the probability that the pixels within that segment are replaced by their average color (otherwise, the pixels are not changed). * A probability of `0.0` means that the pixels in no segment are replaced by their average color (image is not changed at all). * A probability of `0.5` means that around half of all segments are replaced by their average color. * A probability of `1.0` means that all segments are replaced by their average color (resulting in a Voronoi image). Behavior based on chosen data types for this parameter: * If a `float`, then that `float` will always be used. * If a `tuple` `(a, b)`, then a random probability will be sampled from the interval `[a, b]` per image. Default: (0, 0.1)
n_segments	One of: tuple[int, int] int	(100, 100)	Rough target number of superpixels to generate. The algorithm may deviate from this number. Lower values lead to coarser superpixels. Higher values are computationally more intensive and therefore slower. If a tuple `(a, b)`, then a value from the discrete interval `[a..b]` will be sampled per image. Default: (100, 100)
max_size	One of: int None	128	Maximum image size at which the augmentation is performed. If the width or height of an image exceeds this value, it will be downscaled before the augmentation so that the longest side matches `max_size`. This is done to speed up the process. The final output image has the same size as the input image. Note that if `p_replace` is below `1.0`, the down-/upscaling will affect the non-replaced pixels too. Use `None` to apply no down-/upscaling. Default: 128
interpolation	One of: cv2.INTER_NEAREST cv2.INTER_NEAREST_EXACT cv2.INTER_LINEAR cv2.INTER_CUBIC cv2.INTER_AREA cv2.INTER_LANCZOS4 cv2.INTER_LINEAR_EXACT	1	Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- This transform can significantly change the visual appearance of the image.
- The transform makes use of a superpixel algorithm, which tends to be slow. If performance is a concern, consider using `max_size` to limit the image size.
- The effect of this transform can vary greatly depending on the `p_replace` and `n_segments` parameters.
- When `p_replace` is high, the image can become highly abstracted, resembling a Voronoi diagram.
- The transform preserves the original image type (uint8 or float32).

ToFloatclass

ToFloat(
    max_value: float | None = None,
    p: float = 1.0
)

Convert the input image to a floating-point representation. This transform divides pixel values by `max_value` to get a float32 output array where all values lie in the range [0, 1.0]. It's useful for normalizing image data before feeding it into neural networks or other algorithms that expect float input.

Parameters

Name	Type	Default	Description
max_value	One of: float None	None	The maximum possible input value. If None, the transform will try to infer the maximum value by inspecting the data type of the input image: - uint8: 255 - uint16: 65535 - uint32: 4294967295 - float32: 1.0 Default: None.
p	float	1.0	Probability of applying the transform. Default: 1.0.

Returns

np.ndarray: Image in floating point representation, with values in range [0, 1.0].

Notes

- If the input image is already float32 with values in [0, 1], it will be returned unchanged.
- For integer types (uint8, uint16, uint32), the function will scale the values to [0, 1] range.
- The output will always be float32, regardless of the input type.
- This transform is often used as a preprocessing step before applying other transformations or feeding the image into a neural network.
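
The scaling described above can be sketched in plain NumPy. This is not the library's implementation; `to_float_sketch` and `MAX_VALUES` are hypothetical names, and the per-dtype maxima are the ones listed for `max_value`.

```python
import numpy as np

# Per-dtype maxima, as listed in the max_value description.
MAX_VALUES = {
    np.dtype("uint8"): 255.0,
    np.dtype("uint16"): 65535.0,
    np.dtype("uint32"): 4294967295.0,
    np.dtype("float32"): 1.0,
}


def to_float_sketch(image, max_value=None):
    """Divide by max_value (inferred from the dtype if None); return float32."""
    if max_value is None:
        # Raises KeyError for dtypes outside the table (sketch only).
        max_value = MAX_VALUES[image.dtype]
    return image.astype(np.float32) / np.float32(max_value)


img8 = np.array([[0, 128, 255]], dtype=np.uint8)
out = to_float_sketch(img8)
print(out.dtype, out.min(), out.max())  # float32 0.0 1.0
```

Passing an explicit `max_value` is useful when the data does not use the full range of its dtype, for example 12-bit sensor data stored in a uint16 array.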

ToGrayclass

ToGray(
    num_output_channels: int = 3,
    method: Literal['weighted_average', 'from_lab', 'desaturation', 'average', 'max', 'pca'] = weighted_average,
    p: float = 0.5
)

Convert an image to grayscale and optionally replicate the grayscale channel. This transform first converts a color image to a single-channel grayscale image using various methods, then replicates the grayscale channel if num_output_channels is greater than 1.

Parameters

Name	Type	Default	Description
num_output_channels	int	3	The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.
method	One of: 'weighted_average' 'from_lab' 'desaturation' 'average' 'max' 'pca'	weighted_average	The method used for grayscale conversion: - "weighted_average": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception. - "from_lab": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results. - "desaturation": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well. - "average": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results. - "max": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results. - "pca": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Returns

np.ndarray: Grayscale image with the specified number of channels.

Notes

- The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
- "weighted_average" and "from_lab" are typically used in image processing and computer vision applications where accurate representation of human perception is important.
- "desaturation" and "average" are often used in simple image manipulation tools or when computational speed is a priority.
- "max" method can be useful in scenarios where preserving bright features is important, such as in some medical imaging applications.
- "pca" might be used in advanced image analysis tasks or when dealing with hyperspectral images.
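
Four of the conversion methods reduce to one-line channel reductions, which the NumPy sketch below illustrates together with the channel replication step. It is a simplified illustration rather than the library's code: `to_gray_sketch` is a hypothetical name, "from_lab" and "pca" are omitted, and the result is left as float32 instead of being cast back to the input dtype.

```python
import numpy as np


def to_gray_sketch(image, method="weighted_average", num_output_channels=3):
    """Reduce an (H, W, C) image to grayscale, then replicate the channel."""
    img = image.astype(np.float32)
    if method == "weighted_average":
        # 0.299R + 0.587G + 0.114B; requires exactly 3 channels.
        weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
        gray = img @ weights
    elif method == "desaturation":
        # Average of the per-pixel maximum and minimum across channels.
        gray = (img.max(axis=-1) + img.min(axis=-1)) / 2
    elif method == "average":
        gray = img.mean(axis=-1)
    elif method == "max":
        gray = img.max(axis=-1)
    else:
        raise ValueError(f"method not covered by this sketch: {method}")
    # Replicate the single gray channel num_output_channels times.
    return np.repeat(gray[..., np.newaxis], num_output_channels, axis=-1)


red = np.array([[[255, 0, 0]]], dtype=np.uint8)  # one pure-red pixel
for name in ("weighted_average", "desaturation", "average", "max"):
    print(name, to_gray_sketch(red, method=name)[0, 0, 0])
```

For a pure-red pixel the four methods give clearly different gray levels (roughly 76, 127.5, 85, and 255), which is a quick way to see how much the choice of method matters.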

ToRGBclass

ToRGB(
    num_output_channels: int = 3,
    p: float = 1.0
)

Convert an input image from grayscale to RGB format.

Parameters

Name	Type	Default	Description
num_output_channels	int	3	The number of channels in the output image. Default: 3.
p	float	1.0	Probability of applying the transform. Default: 1.0.

Notes

- For single-channel (grayscale) images, the channel is replicated to create an RGB image.
- If the input is already a 3-channel RGB image, it is returned unchanged.
- This transform does not change the data type of the image (e.g., uint8 remains uint8).
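
The replication behaviour in these notes can be sketched as follows. `to_rgb_sketch` is a hypothetical helper, not the library function; it assumes grayscale input is either (H, W) or (H, W, 1).

```python
import numpy as np


def to_rgb_sketch(image, num_output_channels=3):
    """Replicate a grayscale channel; pass multi-channel input through."""
    if image.ndim == 3 and image.shape[-1] == num_output_channels:
        # Already has the requested number of channels: return unchanged.
        return image
    if image.ndim == 3 and image.shape[-1] == 1:
        image = image[..., 0]
    if image.ndim != 2:
        raise ValueError("expected a grayscale image of shape (H, W) or (H, W, 1)")
    # Stacking preserves the input dtype (uint8 stays uint8).
    return np.stack([image] * num_output_channels, axis=-1)


gray = np.array([[10, 20], [30, 40]], dtype=np.uint8)
rgb = to_rgb_sketch(gray)
print(rgb.shape, rgb.dtype)  # (2, 2, 3) uint8
```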

ToSepiaclass

ToSepia(
    p: float = 0.5
)

Apply a sepia filter to the input image. This transform converts a color image to a sepia tone, giving it a warm, brownish tint that is reminiscent of old photographs. The sepia effect is achieved by applying a specific color transformation matrix to the RGB channels of the input image. For grayscale images, the transform is a no-op and returns the original image.

Parameters

Name	Type	Default	Description
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The sepia effect only works with RGB images (3 channels). For grayscale images, the original image is returned unchanged since the sepia transformation would have no visible effect when R=G=B.
- The sepia effect is created using a fixed color transformation matrix: [[0.393, 0.769, 0.189], [0.349, 0.686, 0.168], [0.272, 0.534, 0.131]]
- The output image will have the same data type as the input image.
- For float32 images, ensure the input values are in the range [0, 1].
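
Applying the matrix above amounts to one matrix product per pixel, where each row of the matrix produces one output channel. The sketch below shows this in NumPy. `to_sepia_sketch` is a hypothetical name, only uint8 and float32 inputs are handled, and the clipping and rounding steps are assumptions made so that the output stays in a valid range.

```python
import numpy as np

# Fixed sepia matrix quoted in the notes; row i produces output channel i.
SEPIA_MATRIX = np.array(
    [[0.393, 0.769, 0.189],
     [0.349, 0.686, 0.168],
     [0.272, 0.534, 0.131]],
    dtype=np.float32,
)


def to_sepia_sketch(image):
    """Apply the sepia matrix to an RGB image; grayscale input is a no-op."""
    if image.ndim != 3 or image.shape[-1] != 3:
        return image
    out = image.astype(np.float32) @ SEPIA_MATRIX.T
    if image.dtype == np.uint8:
        # Assumption: round, then clip to the uint8 range.
        return np.clip(np.rint(out), 0, 255).astype(np.uint8)
    # Assumption: float input lies in [0, 1], so clip back to that range.
    return np.clip(out, 0.0, 1.0).astype(image.dtype)


mid_gray = np.full((1, 1, 3), 0.5, dtype=np.float32)
print(to_sepia_sketch(mid_gray)[0, 0])  # approximately [0.6755 0.6015 0.4685]
```

The first two rows of the matrix sum to more than 1, so bright pixels saturate in the red and green channels; that is why some form of clipping is needed.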

UniformParamsclass

UniformParams(
    noise_type: Literal = uniform,
    ranges: Annotated
)

Parameters

Name	Type	Default	Description
noise_type	Literal	uniform	-
ranges	Annotated	-	-

UnsharpMaskclass

UnsharpMask(
    blur_limit: tuple[int, int] | int = (3, 7),
    sigma_limit: tuple[float, float] | float = 0.0,
    alpha: tuple[float, float] | float = (0.2, 0.5),
    threshold: int = 10,
    p: float = 0.5
)

Sharpen the input image using unsharp masking and overlay the result on the original image. Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased sharpness. This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask which is combined with the original image to enhance edges and fine details.

Parameters

Name	Type	Default	Description
blur_limit	One of: tuple[int, int] int	(3, 7)	Maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in the range [0, inf). If set to 0, it will be computed from sigma as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`. If a single value is given, `blur_limit` will be in the range (0, blur_limit). Default: (3, 7).
sigma_limit	One of: tuple[float, float] float	0.0	Gaussian kernel standard deviation. Must be greater than or equal to 0. If a single value is given, `sigma_limit` will be in the range (0, sigma_limit). If set to 0, sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. Default: 0.
alpha	One of: tuple[float, float] float	(0.2, 0.5)	Range from which to choose the visibility of the sharpened image. At 0, only the original image is visible; at 1.0, only its sharpened version is visible. Default: (0.2, 0.5).
threshold	int	10	Value that limits sharpening to areas with a high pixel difference between the original image and its smoothed version. A higher threshold means less sharpening in flat areas. Must be in the range [0, 255]. Default: 10.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Notes

- The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian blurred version.
- The final image is computed as: output = I + M if |I - G| > threshold, else I.
- Higher alpha values increase the strength of the sharpening effect.
- Higher threshold values limit the sharpening effect to areas with more significant edges or details.
- The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.
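
The mask and threshold rule in these notes can be checked with a few lines of NumPy. This is a sketch of the formula only, not the library's implementation: `unsharp_mask_sketch` and `sigma_from_ksize` are hypothetical names, the blurred image G is passed in rather than computed, and the final rounding and clipping to uint8 are assumptions. The sigma helper uses the OpenCV-style formula `0.3*((ksize-1)*0.5 - 1) + 0.8` that applies when sigma is set to 0.

```python
import numpy as np


def sigma_from_ksize(ksize):
    """Sigma derived from the kernel size when sigma is given as 0."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8


def unsharp_mask_sketch(image, blurred, alpha, threshold):
    """output = I + (I - G) * alpha where |I - G| > threshold, else I."""
    i = image.astype(np.float32)
    g = blurred.astype(np.float32)
    diff = i - g
    mask = diff * alpha
    sharpened = np.where(np.abs(diff) > threshold, i + mask, i)
    # Assumption: round and clip back to the uint8 range.
    return np.clip(np.rint(sharpened), 0, 255).astype(np.uint8)


image = np.array([[100, 200]], dtype=np.uint8)
blurred = np.array([[98, 150]], dtype=np.uint8)  # stand-in for a Gaussian blur
print(unsharp_mask_sketch(image, blurred, alpha=0.5, threshold=10))  # [[100 225]]
print(sigma_from_ksize(3))  # 0.8
```

In the example, the first pixel differs from its blurred value by only 2, which is below the threshold, so it is left untouched; the second differs by 50, so it is pushed from 200 to 225.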

References

Unsharp Masking: https://en.wikipedia.org/wiki/Unsharp_masking
