albumentations.augmentations.pixel.color


AutoContrast class

AutoContrast(
    cutoff: float = 0,
    ignore: int | None = None,
    method: 'cdf' | 'pil' = 'cdf',
    p: float = 0.5
)

Stretch image intensity to the full range (autocontrast). This transform provides two methods for contrast enhancement:
  1. CDF method (default): uses the cumulative distribution function for a more gradual adjustment.
  2. PIL method: uses linear scaling, like PIL.ImageOps.autocontrast.
The transform can optionally exclude extreme values from both ends of the intensity range (cutoff) and preserve a specific intensity value such as an alpha channel (ignore). Useful for normalizing brightness and contrast across images.

Parameters

  • cutoff (float, default: 0): Percentage of pixels to exclude from both ends of the histogram. Range: [0, 100]. 0 means use the minimum and maximum intensity values found; 20 means exclude the darkest and brightest 20% of pixels.
  • ignore (int | None, default: None): Intensity value to preserve (e.g., alpha channel). Range: [0, 255]. If specified, this intensity value will not be modified. Useful for images with an alpha channel or special marker values.
  • method ('cdf' | 'pil', default: 'cdf'): Algorithm to use for contrast enhancement. 'cdf' uses the cumulative distribution for smoother adjustment; 'pil' uses linear scaling like PIL.ImageOps.autocontrast.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import albumentations as A
>>> # Basic usage
>>> transform = A.AutoContrast(p=1.0)
>>>
>>> # Exclude extreme values
>>> transform = A.AutoContrast(cutoff=20, p=1.0)
>>>
>>> # Preserve alpha channel
>>> transform = A.AutoContrast(ignore=255, p=1.0)
>>>
>>> # Use PIL-like contrast enhancement
>>> transform = A.AutoContrast(method="pil", p=1.0)

Notes

  • The transform processes each color channel independently.
  • For grayscale images, only one channel is processed.
  • The output maintains the same dtype as the input.
  • Empty or single-color channels remain unchanged.

CLAHE class

CLAHE(
    clip_limit: tuple[float, float] | float = 4.0,
    tile_grid_size: tuple[int, int] = (8, 8),
    p: float = 0.5
)

Contrast Limited Adaptive Histogram Equalization (CLAHE): local contrast enhancement controlled by clip_limit and tile_grid_size. Good for non-uniform lighting; preserves detail. CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram equalization, which operates on the entire image, CLAHE operates on small regions (tiles) of the image. This results in more balanced equalization, preventing over-amplification of contrast in areas with initially low contrast.

Parameters

  • clip_limit (tuple[float, float] | float, default: 4.0): Controls the contrast enhancement limit. If a single float is provided, the range will be (1, clip_limit); if a tuple of two floats is provided, it defines the range for random selection. Higher values allow more contrast enhancement but may also increase noise.
  • tile_grid_size (tuple[int, int], default: (8, 8)): Number of tiles in the row and column directions, as (rows, columns). Smaller tile sizes lead to more localized enhancement, while larger sizes give results closer to global histogram equalization.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)
>>> result = transform(image=image)
>>> clahe_image = result["image"]

Notes

  • Supports only RGB or grayscale images.
  • For color images, CLAHE is applied to the L channel in the LAB color space.
  • The clip limit determines the maximum slope of the cumulative histogram; a lower clip limit results in more contrast limiting.
  • Tile grid size affects the adaptiveness of the method. More tiles increase local adaptiveness but can lead to an unnatural look if set too high.

References

  • Histogram equalization tutorial (OpenCV): https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html
  • "Contrast Limited Adaptive Histogram Equalization": https://ieeexplore.ieee.org/document/109340
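The clip-limit idea can be shown on a single tile: clip the histogram at a multiple of the mean bin height, redistribute the excess uniformly, and equalize with the resulting CDF. This numpy sketch covers only that core step (`clipped_equalize_tile` is a hypothetical helper); full CLAHE additionally blends neighbouring tile mappings bilinearly, as OpenCV's `cv2.createCLAHE` does.

```python
import numpy as np

def clipped_equalize_tile(tile: np.ndarray, clip_limit: float = 4.0) -> np.ndarray:
    """Equalize one uint8 tile with a clipped histogram (CLAHE's core step).

    The histogram is clipped at `clip_limit` times the mean bin height and
    the excess is redistributed uniformly, which limits the contrast gain.
    """
    hist = np.bincount(tile.ravel(), minlength=256).astype(np.float64)
    limit = clip_limit * hist.mean()
    excess = np.maximum(hist - limit, 0).sum()
    hist = np.minimum(hist, limit) + excess / 256.0   # redistribute excess
    cdf = hist.cumsum()
    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
    return lut[tile]

rng = np.random.default_rng(0)
tile = rng.integers(100, 140, size=(8, 8), dtype=np.uint8)  # low-contrast tile
out = clipped_equalize_tile(tile, clip_limit=4.0)
```

The clipping is what separates CLAHE from plain equalization: a lower `clip_limit` flattens the LUT toward the identity mapping, limiting amplification in near-uniform regions.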

ChromaticAberration class

ChromaticAberration(
    primary_distortion_limit: tuple[float, float] | float = (-0.02, 0.02),
    secondary_distortion_limit: tuple[float, float] | float = (-0.05, 0.05),
    mode: 'green_purple' | 'red_blue' | 'random' = 'green_purple',
    interpolation: int = 1,
    p: float = 0.5
)

Add lateral chromatic aberration: shift the red and blue channels relative to green. primary_distortion_limit and secondary_distortion_limit control the strength. Simulates lens color fringing. Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point. This transform simulates the effect by applying different radial distortions to the red and blue channels of the image, while leaving the green channel unchanged.

Parameters

  • primary_distortion_limit (tuple[float, float] | float, default: (-0.02, 0.02)): Range of the primary radial distortion coefficient. If a single float is provided, the range will be (-primary_distortion_limit, primary_distortion_limit). Controls the distortion in the center of the image: positive values produce pincushion distortion (edges bend inward); negative values produce barrel distortion (edges bend outward).
  • secondary_distortion_limit (tuple[float, float] | float, default: (-0.05, 0.05)): Range of the secondary radial distortion coefficient. If a single float is provided, the range will be (-secondary_distortion_limit, secondary_distortion_limit). Controls the distortion in the corners of the image: positive values enhance pincushion distortion; negative values enhance barrel distortion.
  • mode ('green_purple' | 'red_blue' | 'random', default: 'green_purple'): Type of color fringing to apply. 'green_purple' distorts the red and blue channels in opposite directions, creating green-purple fringing; 'red_blue' distorts them in the same direction, creating red-blue fringing; 'random' chooses between the two modes for each application.
  • interpolation (int, default: 1): Flag specifying the interpolation algorithm. Should be one of cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • p (float, default: 0.5): Probability of applying the transform. Should be in the range [0, 1].

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ChromaticAberration(
...     primary_distortion_limit=0.05,
...     secondary_distortion_limit=0.1,
...     mode='green_purple',
...     interpolation=cv2.INTER_LINEAR,
...     p=1.0
... )
>>> transformed = transform(image=image)
>>> aberrated_image = transformed['image']

Notes

  • This transform only affects RGB images; grayscale images will raise an error.
  • The strength of the effect depends on both the primary and secondary distortion limits.
  • Higher absolute values for the distortion limits result in more pronounced chromatic aberration.
  • The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.

References

  • Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
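The essence of the effect is that the red and blue planes get slightly different radial scalings while green stays fixed. The numpy sketch below fakes this with nearest-neighbour radial scaling (`radial_scale` is a hypothetical helper; albumentations uses proper radial distortion coefficients and cv2 remapping):

```python
import numpy as np

def radial_scale(channel: np.ndarray, k: float) -> np.ndarray:
    """Nearest-neighbour radial scaling of one channel about the centre.

    k > 0 pulls pixels outward (magnifies), k < 0 pushes inward: a crude
    stand-in for the per-channel radial distortion applied to red and blue.
    """
    h, w = channel.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    src_y = np.clip(np.round(cy + (yy - cy) / (1.0 + k)), 0, h - 1).astype(int)
    src_x = np.clip(np.round(cx + (xx - cx) / (1.0 + k)), 0, w - 1).astype(int)
    return channel[src_y, src_x]

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
out = img.copy()
out[..., 0] = radial_scale(img[..., 0], +0.02)  # red magnified slightly
out[..., 2] = radial_scale(img[..., 2], -0.02)  # blue shrunk: green-purple fringing
```

Scaling red and blue in opposite directions mimics 'green_purple' mode; scaling them the same way would correspond to 'red_blue'.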

ColorJitter class

ColorJitter(
    brightness: tuple[float, float] | float = (0.8, 1.2),
    contrast: tuple[float, float] | float = (0.8, 1.2),
    saturation: tuple[float, float] | float = (0.8, 1.2),
    hue: tuple[float, float] | float = (-0.5, 0.5),
    p: float = 0.5
)

Randomly apply brightness, contrast, saturation, and hue jitter in random order, with a separate range per effect. A strong color augmentation for classification and detection. This transform is similar to torchvision's ColorJitter, with some differences due to the use of OpenCV instead of Pillow:
  1. OpenCV and Pillow use different formulas to convert images to HSV format.
  2. This implementation uses value saturation instead of uint8 overflow as in Pillow.
These differences may result in slightly different output compared to torchvision's ColorJitter.

Parameters

  • brightness (tuple[float, float] | float, default: (0.8, 1.2)): How much to jitter brightness. If float: the factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness]. If tuple: the factor is sampled from the specified range. Values should be non-negative.
  • contrast (tuple[float, float] | float, default: (0.8, 1.2)): How much to jitter contrast. If float: the factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast]. If tuple: the factor is sampled from the specified range. Values should be non-negative.
  • saturation (tuple[float, float] | float, default: (0.8, 1.2)): How much to jitter saturation. If float: the factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation]. If tuple: the factor is sampled from the specified range. Values should be non-negative.
  • hue (tuple[float, float] | float, default: (-0.5, 0.5)): How much to jitter hue. If float: the factor is chosen uniformly from [-hue, hue]; should satisfy 0 <= hue <= 0.5. If tuple: the factor is sampled from the specified range; values should be in [-0.5, 0.5].
  • p (float, default: 0.5): Probability of applying the transform. Should be in the range [0, 1].

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result['image']

Notes

  • The order of application of the color transformations is random for each image.
  • The ranges for brightness, contrast, and saturation are applied as multiplicative factors.
  • The range for hue is applied as an additive factor.

References

  • torchvision ColorJitter: https://pytorch.org/vision/stable/generated/torchvision.transforms.ColorJitter.html
  • OpenCV color conversions: https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html
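The multiplicative factors can be written out directly: brightness scales all values, contrast scales deviation from the mean, and saturation scales deviation from per-pixel luminance. The numpy sketch below shows those three (hue, the additive HSV shift, is omitted for brevity); `jitter` is an illustrative helper, not albumentations' kernel.

```python
import numpy as np

def jitter(img, brightness=1.0, contrast=1.0, saturation=1.0):
    """Apply multiplicative brightness/contrast/saturation to a uint8 RGB image."""
    x = img.astype(np.float64)
    x = x * brightness                                        # scale everything
    mean = x.mean()
    x = (x - mean) * contrast + mean                          # scale about the mean
    gray = x @ np.array([0.299, 0.587, 0.114])                # per-pixel luminance
    x = (x - gray[..., None]) * saturation + gray[..., None]  # scale about luminance
    return np.clip(x, 0, 255).astype(np.uint8)

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
out = jitter(img, brightness=1.2, contrast=0.8, saturation=1.1)
```

A factor of 1.0 leaves that property unchanged; saturation 0.0 collapses the image to its luminance (grayscale).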

Equalize class

Equalize(
    mode: 'cv' | 'pil' = 'cv',
    by_channels: bool = True,
    mask: ndarray | Callable[..., Any] | None = None,
    mask_params: Sequence = (),
    p: float = 0.5
)

Equalize the image histogram to spread intensities. mode selects the OpenCV or Pillow implementation, and an optional mask restricts which pixels are analyzed. Improves contrast normalization across datasets. Histogram equalization is a method in image processing of contrast adjustment using the image's histogram.

Parameters

  • mode ('cv' | 'pil', default: 'cv'): Use the OpenCV or Pillow equalization method.
  • by_channels (bool, default: True): If True, equalize each channel separately; otherwise convert the image to YCbCr and equalize the Y channel only.
  • mask (ndarray | Callable[..., Any] | None, default: None): If given, only the pixels selected by the mask are included in the analysis. Can be a 1-channel or 3-channel numpy array of the same size as the input image, or a callable that generates a mask. The function should accept 'image' as its first argument and can accept additional arguments specified in mask_params.
  • mask_params (Sequence, default: ()): Additional parameters to pass to the mask function. These parameters are taken from the data dict passed to __call__.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Using a static mask
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> transform = A.Equalize(mask=mask, p=1.0)
>>> result = transform(image=image)
>>>
>>> # Using a dynamic mask function
>>> def mask_func(image, bboxes):
...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)
...     for bbox in bboxes:
...         x1, y1, x2, y2 = map(int, bbox)
...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes
...     return mask
>>>
>>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)
>>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes
>>> result = transform(image=image, bboxes=bboxes)

Notes

  • When mode='cv', OpenCV's equalizeHist() function is used.
  • When mode='pil', Pillow's equalize() function is used.
  • The by_channels parameter determines whether equalization is applied to each color channel independently (True) or to the luminance channel only (False).
  • If a mask is provided as a numpy array, it should have the same height and width as the input image.
  • If a mask is provided as a function, it allows dynamic mask generation based on the input image and additional parameters. This is useful when the mask depends on the image content or external data (e.g., bounding boxes, segmentation masks).

References

  • OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e
  • Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize
  • Histogram equalization: https://en.wikipedia.org/wiki/Histogram_equalization
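Histogram equalization itself is a small algorithm: build the histogram, take its CDF, and use the normalized CDF as a lookup table. The numpy sketch below mirrors what cv2.equalizeHist does for one channel (`equalize_channel` is an illustrative helper, not the library's code):

```python
import numpy as np

def equalize_channel(channel: np.ndarray) -> np.ndarray:
    """Global histogram equalization of one uint8 channel via its CDF."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf_min = cdf[np.nonzero(hist)[0][0]]        # CDF at first occupied bin
    span = cdf[-1] - cdf_min
    if span == 0:                                # single-valued channel: no-op
        return channel.copy()
    lut = np.clip(np.round((cdf - cdf_min) / span * 255.0), 0, 255)
    return lut.astype(np.uint8)[channel]

# A ramp confined to [100, 163] is stretched to the full [0, 255] range.
channel = np.tile(np.arange(100, 164, dtype=np.uint8), (64, 1))
out = equalize_channel(channel)
print(out.min(), out.max())  # 0 255
```

With by_channels=True this mapping would run once per channel; with by_channels=False it would run only on the Y channel of the YCbCr representation.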

FancyPCA class

FancyPCA(
    alpha: float = 0.1,
    p: float = 0.5
)

Add color variation via PCA on the RGB channels: perturb the image along its principal components with magnitude controlled by alpha. Simulates natural lighting variation (ImageNet-style); good for object recognition. This augmentation applies PCA (Principal Component Analysis) to the image's color channels, then adds multiples of the principal components to the image, with magnitudes proportional to the corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard deviation alpha.

Parameters

  • alpha (float, default: 0.1): Standard deviation of the Gaussian distribution used to generate random noise for each principal component.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.FancyPCA(alpha=0.1, p=1.0)
>>> result = transform(image=image)
>>> augmented_image = result["image"]

Notes

  • This augmentation is particularly effective for RGB images but can work with any number of channels.
  • For grayscale images, it applies a simplified version of the augmentation.
  • The transform preserves the mean of the image while adjusting the color/intensity variation.
  • This implementation is based on the paper by Krizhevsky et al. and is similar to the one used in the original AlexNet paper.

References

  • Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, 2012.
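The AlexNet-style recipe described above is short enough to write out: eigendecompose the 3x3 RGB covariance, draw one Gaussian per component, and add the same eigenvalue-weighted shift to every pixel. This numpy sketch follows that recipe (`fancy_pca` is a hypothetical helper; albumentations' exact normalization may differ):

```python
import numpy as np

def fancy_pca(img: np.ndarray, alpha_std: float = 0.1, rng=None) -> np.ndarray:
    """FancyPCA sketch: shift the image along its RGB principal components."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = img.reshape(-1, 3).astype(np.float64) / 255.0
    cov = np.cov(x, rowvar=False)                  # 3x3 RGB covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    alphas = rng.normal(0.0, alpha_std, size=3)    # one Gaussian draw per component
    shift = eigvecs @ (alphas * eigvals)           # identical shift for every pixel
    out = (x + shift) * 255.0
    return np.clip(np.round(out), 0, 255).reshape(img.shape).astype(np.uint8)

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
out = fancy_pca(img, alpha_std=0.1)
```

Because the shift is constant across pixels, the perturbation changes the overall color cast rather than per-pixel structure, which is why it reads as a lighting change.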

HEStain class

HEStain(
    method: 'preset' | 'random_preset' | 'vahadane' | 'macenko' = 'random_preset',
    preset: 'ruifrok' | 'macenko' | 'standard' | 'high_contrast' | 'h_heavy' | 'e_heavy' | 'dark' | 'light' | None = None,
    intensity_scale_range: tuple[float, float] = (0.7, 1.3),
    intensity_shift_range: tuple[float, float] = (-0.2, 0.2),
    augment_background: bool = False,
    p: float = 0.5
)

H&E stain augmentation for histopathology. method selects preset, random_preset, vahadane, or macenko. Simulates staining variation for robust pathology models. This transform simulates different H&E staining conditions using either:
  1. Predefined stain matrices (8 standard references)
  2. The Vahadane method for stain extraction
  3. The Macenko method for stain extraction
  4. Custom stain matrices

Parameters

  • method ('preset' | 'random_preset' | 'vahadane' | 'macenko', default: 'random_preset'): Method to use for stain augmentation. 'preset' uses a predefined stain matrix; 'random_preset' randomly selects a preset matrix each time; 'vahadane' extracts the stain matrix using the Vahadane method; 'macenko' extracts it using the Macenko method.
  • preset ('ruifrok' | 'macenko' | 'standard' | 'high_contrast' | 'h_heavy' | 'e_heavy' | 'dark' | 'light' | None, default: None): Preset stain matrix to use when method='preset'. 'ruifrok': standard reference from Ruifrok & Johnston; 'macenko': reference from Macenko's method; 'standard': typical bright-field microscopy; 'high_contrast': enhanced contrast; 'h_heavy': hematoxylin dominant; 'e_heavy': eosin dominant; 'dark': darker staining; 'light': lighter staining. Defaults to 'standard' when not specified.
  • intensity_scale_range (tuple[float, float], default: (0.7, 1.3)): Range for multiplicative stain intensity variation. Values are multipliers between 0.5 and 1.5. For example, (0.7, 1.3) varies stain intensities from 70% to 130%; (0.9, 1.1) gives subtle variations; (0.5, 1.5) gives dramatic variations.
  • intensity_shift_range (tuple[float, float], default: (-0.2, 0.2)): Range for additive stain intensity variation. Values between -0.3 and 0.3. For example, (-0.2, 0.2) shifts intensities by -20% to +20%; (-0.1, 0.1) gives subtle shifts; (-0.3, 0.3) gives dramatic shifts.
  • augment_background (bool, default: False): Whether to apply augmentation to background regions.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create a sample H&E stained histopathology image
>>> # For real use cases, load an actual H&E stained image
>>> image = np.zeros((300, 300, 3), dtype=np.uint8)
>>> # Simulate tissue regions with different staining patterns
>>> image[50:150, 50:150] = np.array([120, 140, 180], dtype=np.uint8)  # Hematoxylin-rich region
>>> image[150:250, 150:250] = np.array([140, 160, 120], dtype=np.uint8)  # Eosin-rich region
>>>
>>> # Example 1: Using a specific preset stain matrix
>>> transform = A.HEStain(
...     method="preset",
...     preset="standard",
...     intensity_scale_range=(0.8, 1.2),
...     intensity_shift_range=(-0.1, 0.1),
...     augment_background=False,
...     p=1.0
... )
>>> result = transform(image=image)
>>> transformed_image = result['image']
>>>
>>> # Example 2: Using random preset selection
>>> transform = A.HEStain(
...     method="random_preset",
...     intensity_scale_range=(0.7, 1.3),
...     intensity_shift_range=(-0.15, 0.15),
...     p=1.0
... )
>>> result = transform(image=image)
>>> transformed_image = result['image']
>>>
>>> # Example 3: Using Vahadane method (requires H&E stained input)
>>> transform = A.HEStain(
...     method="vahadane",
...     intensity_scale_range=(0.7, 1.3),
...     p=1.0
... )
>>> result = transform(image=image)
>>> transformed_image = result['image']
>>>
>>> # Example 4: Using Macenko method (requires H&E stained input)
>>> transform = A.HEStain(
...     method="macenko",
...     intensity_scale_range=(0.7, 1.3),
...     intensity_shift_range=(-0.2, 0.2),
...     p=1.0
... )
>>> result = transform(image=image)
>>> transformed_image = result['image']
>>>
>>> # Example 5: Combining with other transforms in a pipeline
>>> transform = A.Compose([
...     A.HEStain(method="preset", preset="high_contrast", p=1.0),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> result = transform(image=image)
>>> transformed_image = result['image']

References

  • A. C. Ruifrok and D. A. Johnston, "Quantification of histochemical staining by color deconvolution," Analytical and Quantitative Cytology and Histology, 2001.
  • M. Macenko et al., "A method for normalizing histology slides for quantitative analysis," 2009 IEEE International Symposium on Biomedical Imaging, 2009.
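The mechanics behind preset-based stain augmentation can be sketched with Beer-Lambert color deconvolution: convert RGB to optical density, project onto the H and E stain vectors, rescale each stain's concentration, and convert back. The snippet below is an assumption-laden illustration, not albumentations' implementation: `scale_stains` is a hypothetical helper, and the stain vectors are the Ruifrok & Johnston values as commonly quoted (e.g. in scikit-image's color deconvolution).

```python
import numpy as np

# H&E absorbance vectors (Ruifrok & Johnston, as commonly quoted).
# Rows: hematoxylin, eosin.
STAIN_MATRIX = np.array([
    [0.65, 0.70, 0.29],   # hematoxylin
    [0.07, 0.99, 0.11],   # eosin
])

def scale_stains(img: np.ndarray, h_scale: float, e_scale: float) -> np.ndarray:
    """Rescale the H and E contributions in optical-density space."""
    od = -np.log((img.astype(np.float64) + 1.0) / 256.0)      # RGB -> optical density
    conc = od.reshape(-1, 3) @ np.linalg.pinv(STAIN_MATRIX)   # per-pixel stain amounts
    conc = conc * np.array([h_scale, e_scale])                # augment each stain
    od_new = conc @ STAIN_MATRIX                              # back to OD
    out = 256.0 * np.exp(-od_new) - 1.0                       # OD -> RGB
    return np.clip(out, 0, 255).reshape(img.shape).astype(np.uint8)

rng = np.random.default_rng(5)
img = rng.integers(60, 220, size=(16, 16, 3), dtype=np.uint8)
out = scale_stains(img, h_scale=1.3, e_scale=0.9)
```

In this framing, intensity_scale_range corresponds to the multiplicative factors applied to the stain concentrations, and intensity_shift_range to additive offsets in the same space.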

HueSaturationValue class

HueSaturationValue(
    hue_shift_limit: tuple[float, float] | float = (-20, 20),
    sat_shift_limit: tuple[float, float] | float = (-30, 30),
    val_shift_limit: tuple[float, float] | float = (-20, 20),
    p: float = 0.5
)

Randomly shift hue, saturation, and value (HSV). Separate ranges per channel. Common for color augmentation in classification. This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image. It allows for independent control over each channel, providing a wide range of color and brightness modifications.

Parameters

  • hue_shift_limit (tuple[float, float] | float, default: (-20, 20)): Range for changing hue. If a single float is provided, the range will be (-hue_shift_limit, hue_shift_limit). Values should be in [-180, 180].
  • sat_shift_limit (tuple[float, float] | float, default: (-30, 30)): Range for changing saturation. If a single float is provided, the range will be (-sat_shift_limit, sat_shift_limit). Values should be in [-255, 255].
  • val_shift_limit (tuple[float, float] | float, default: (-20, 20)): Range for changing value (brightness). If a single float is provided, the range will be (-val_shift_limit, val_shift_limit). Values should be in [-255, 255].
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.HueSaturationValue(
...     hue_shift_limit=20,
...     sat_shift_limit=30,
...     val_shift_limit=20,
...     p=0.7
... )
>>> result = transform(image=image)
>>> augmented_image = result["image"]

Notes

  • The transform first converts the input RGB image to the HSV color space.
  • Each channel (hue, saturation, value) is adjusted independently.
  • Hue is circular, so it wraps around at 180 degrees.
  • For float32 images, the shift values are applied as percentages of the full range.
  • This transform is particularly useful for color augmentation and simulating different lighting conditions.

References

  • HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV
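The hue wraparound mentioned in the notes is worth making concrete: OpenCV stores uint8 hue in [0, 179], so a shift past the top wraps back to 0 rather than clipping. A small sketch (`shift_hue_cv` is a hypothetical helper, not the library's kernel):

```python
import numpy as np

def shift_hue_cv(hue: np.ndarray, shift: int) -> np.ndarray:
    """Shift OpenCV-style uint8 hue values (range 0..179) with wraparound.

    Hue is circular, so shifts wrap at 180; saturation and value, by
    contrast, are simply shifted and clipped to [0, 255].
    """
    return ((hue.astype(np.int32) + shift) % 180).astype(np.uint8)

hue = np.array([0, 90, 170], dtype=np.uint8)
print(shift_hue_cv(hue, 20).tolist())    # [20, 110, 10]: 170 + 20 wraps to 10
print(shift_hue_cv(hue, -30).tolist())   # [150, 60, 140]: 0 - 30 wraps to 150
```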

Illumination class

Illumination(
    mode: 'linear' | 'corner' | 'gaussian' = 'linear',
    intensity_range: tuple[float, float] = (0.01, 0.2),
    effect_type: 'brighten' | 'darken' | 'both' = 'both',
    angle_range: tuple[float, float] = (0, 360),
    center_range: tuple[float, float] = (0.1, 0.9),
    sigma_range: tuple[float, float] = (0.2, 1.0),
    p: float = 0.5
)

Apply illumination patterns: directional (linear), corner shadows/highlights (corner), or gaussian spotlights. mode and its parameters control the shape and strength. This transform simulates different lighting conditions by applying controlled illumination patterns. It can create effects like:
  • Directional lighting (linear mode)
  • Corner shadows/highlights (corner mode)
  • Spotlights or local lighting (gaussian mode)
These effects can be used to:
  • Simulate natural lighting variations
  • Add dramatic lighting effects
  • Create synthetic shadows or highlights
  • Augment training data with different lighting conditions

Parameters

  • mode ('linear' | 'corner' | 'gaussian', default: 'linear'): Type of illumination pattern. 'linear' creates a smooth gradient across the image, simulating directional lighting like sunlight through a window; 'corner' applies a gradient from any corner, simulating a light source at a corner; 'gaussian' creates a circular spotlight effect, simulating local light sources.
  • intensity_range (tuple[float, float], default: (0.01, 0.2)): Range for effect strength, with values between 0.01 and 0.2. Roughly: 0.01-0.05 gives subtle lighting changes, 0.05-0.1 moderate effects, 0.1-0.2 strong effects.
  • effect_type ('brighten' | 'darken' | 'both', default: 'both'): Type of lighting change. 'brighten' only adds light (like a spotlight); 'darken' only removes light (like a shadow); 'both' randomly chooses between brightening and darkening.
  • angle_range (tuple[float, float], default: (0, 360)): Range for the gradient angle in degrees, controlling the direction of the linear gradient: 0 is left to right, 90 top to bottom, 180 right to left, 270 bottom to top. Only used in 'linear' mode.
  • center_range (tuple[float, float], default: (0.1, 0.9)): Range for the spotlight position, with values between 0 and 1 giving the relative position: (0, 0) is the top-left corner, (1, 1) the bottom-right corner, (0.5, 0.5) the image center. Only used in 'gaussian' mode.
  • sigma_range (tuple[float, float], default: (0.2, 1.0)): Range for the spotlight size, with values between 0.2 and 1.0: 0.2 gives a small, focused spotlight; 0.5 a medium-sized light area; 1.0 broad, soft lighting. Only used in 'gaussian' mode.
  • p (float, default: 0.5): Probability of applying the transform.

Examples

>>> import albumentations as A
>>> # Simulate sunlight through window
>>> transform = A.Illumination(
...     mode='linear',
...     intensity_range=(0.05, 0.1),
...     effect_type='brighten',
...     angle_range=(30, 60)
... )
>>>
>>> # Create dramatic corner shadow
>>> transform = A.Illumination(
...     mode='corner',
...     intensity_range=(0.1, 0.2),
...     effect_type='darken'
... )
>>>
>>> # Add multiple spotlights
>>> transform1 = A.Illumination(
...     mode='gaussian',
...     intensity_range=(0.05, 0.15),
...     effect_type='brighten',
...     center_range=(0.2, 0.4),
...     sigma_range=(0.2, 0.3)
... )
>>> transform2 = A.Illumination(
...     mode='gaussian',
...     intensity_range=(0.05, 0.15),
...     effect_type='darken',
...     center_range=(0.6, 0.8),
...     sigma_range=(0.3, 0.5)
... )
>>> transforms = A.Compose([transform1, transform2])

Notes

  • The transform preserves the image range and dtype.
  • Effects are applied multiplicatively to preserve texture.
  • Can be combined with other transforms for complex lighting scenarios.
  • Useful for training models to be robust to lighting variations.

References

  • Lighting in computer vision: https://en.wikipedia.org/wiki/Lighting_in_computer_vision
  • Image-based lighting: https://en.wikipedia.org/wiki/Image-based_lighting
  • Similar implementation in Kornia (RandomLinearIllumination): https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination
  • "Learning Deep Representations of Fine-grained Visual Descriptions": https://arxiv.org/abs/1605.05395
  • Lighting pattern (photography): https://en.wikipedia.org/wiki/Lighting_pattern
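The 'linear' mode amounts to building a directional ramp from (1 - intensity) to (1 + intensity) and multiplying it into the image, which is why texture is preserved. A numpy sketch under those assumptions (`linear_illumination` is a hypothetical helper; albumentations' exact mask construction may differ):

```python
import numpy as np

def linear_illumination(img: np.ndarray, angle_deg: float, intensity: float) -> np.ndarray:
    """Multiplicative directional gradient: a sketch of 'linear' mode.

    The mask ramps from (1 - intensity) to (1 + intensity) along the given
    angle (0 deg = left to right).
    """
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    yy = yy / max(h - 1, 1)
    xx = xx / max(w - 1, 1)
    theta = np.deg2rad(angle_deg)
    t = xx * np.cos(theta) + yy * np.sin(theta)       # position along direction
    t = (t - t.min()) / (t.max() - t.min() + 1e-12)   # normalize to [0, 1]
    mask = 1.0 - intensity + 2.0 * intensity * t      # ramp (1-i) .. (1+i)
    x = img.astype(np.float64)
    out = x * (mask[..., None] if x.ndim == 3 else mask)
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((10, 10, 3), 128, dtype=np.uint8)
out = linear_illumination(img, angle_deg=0.0, intensity=0.2)
print(int(out[0, 0, 0]), int(out[0, -1, 0]))  # 102 153
```

'gaussian' mode would swap the ramp for a 2D Gaussian bump around a sampled center, and 'darken' would use only the sub-1 half of the mask.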

PhotoMetricDistort class

PhotoMetricDistort(
    brightness_range: tuple[float, float] = (0.875, 1.125),
    contrast_range: tuple[float, float] = (0.5, 1.5),
    saturation_range: tuple[float, float] = (0.5, 1.5),
    hue_range: tuple[float, float] = (-0.05, 0.05),
    distort_p: float = 0.5,
    p: float = 0.5
)

SSD-style photometric distortion: brightness, contrast, saturation, hue, channel shuffle; each with probability distort_p. For detection training. Applies brightness, contrast, saturation, and hue adjustments independently with probability `distort_p` each. Contrast is applied either before or after the HSV-space adjustments (randomly chosen). Optionally permutes channels with probability `distort_p`. This mirrors the `RandomPhotometricDistort` transform from torchvision but uses our existing `adjust_*_torchvision` functional primitives.

Parameters

  • brightness_range (tuple[float, float], default: (0.875, 1.125)): Multiplicative factor range for brightness. The factor is drawn uniformly from this range; values must be non-negative.
  • contrast_range (tuple[float, float], default: (0.5, 1.5)): Multiplicative factor range for contrast. The factor is drawn uniformly from this range; values must be non-negative.
  • saturation_range (tuple[float, float], default: (0.5, 1.5)): Multiplicative factor range for saturation. The factor is drawn uniformly from this range; values must be non-negative.
  • hue_range (tuple[float, float], default: (-0.05, 0.05)): Additive factor range for hue. The factor is drawn uniformly from this range; values must be in [-0.5, 0.5].
  • distort_p (float, default: 0.5): Probability of applying each individual distortion (brightness, contrast, saturation, hue, channel permutation).
  • p (float, default: 0.5): Probability of applying the overall transform.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[20, 30]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
...     A.PhotoMetricDistort(
...         brightness_range=(0.875, 1.125),
...         contrast_range=(0.5, 1.5),
...         saturation_range=(0.5, 1.5),
...         hue_range=(-0.05, 0.05),
...         distort_p=0.5,
...         p=1.0,
...     )
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['bbox_labels']),
...    keypoint_params=A.KeypointParams(format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
...     image=image,
...     mask=mask,
...     bboxes=bboxes,
...     bbox_labels=bbox_labels,
...     keypoints=keypoints,
...     keypoint_labels=keypoint_labels,
... )
>>> transformed_image = result['image']

Notes

  • Each of the five distortions (brightness, contrast, saturation, hue, channel shuffle) is applied independently with probability distort_p.
  • Contrast is randomly applied either before or after the saturation/hue adjustment.
  • For single-channel images, saturation and hue adjustments have no effect.

References

  • SSD: https://arxiv.org/abs/1512.02325
  • torchvision RandomPhotometricDistort: https://pytorch.org/vision/stable/generated/torchvision.transforms.v2.RandomPhotometricDistort.html
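The control flow (independent coin-flips per op, contrast placed before or after the HSV-space block at random, optional channel permutation) can be sketched directly. The ops below are simplified numpy stand-ins with hue omitted for brevity, and `photometric_distort` is a hypothetical helper, not albumentations' kernels:

```python
import numpy as np

def photometric_distort(img: np.ndarray, rng: np.random.Generator,
                        distort_p: float = 0.5) -> np.ndarray:
    """Order-of-operations sketch of SSD-style photometric distortion."""
    x = img.astype(np.float64)

    def brightness(x):
        return x * rng.uniform(0.875, 1.125)

    def contrast(x):
        return (x - x.mean()) * rng.uniform(0.5, 1.5) + x.mean()

    def saturation(x):
        gray = x @ np.array([0.299, 0.587, 0.114])
        return (x - gray[..., None]) * rng.uniform(0.5, 1.5) + gray[..., None]

    contrast_first = rng.random() < 0.5   # contrast before or after saturation
    ops = [brightness, contrast, saturation] if contrast_first else [brightness, saturation, contrast]
    for op in ops:
        if rng.random() < distort_p:      # each distortion fires independently
            x = op(x)
    if rng.random() < distort_p:          # random channel permutation
        x = x[..., rng.permutation(3)]
    return np.clip(x, 0, 255).astype(np.uint8)

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
out = photometric_distort(img, rng)
```

Setting distort_p=0 makes every coin-flip fail, so the image passes through unchanged; distort_p=1 applies all five distortions every time.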

PlanckianJitter class

PlanckianJitter(
    mode: 'blackbody' | 'cied' = 'blackbody',
    temperature_limit: tuple[int, int] | None = None,
    sampling_method: 'uniform' | 'gaussian' = 'uniform',
    p: float = 0.5
)

Simulate color temperature variation via Planckian locus jitter. mode and magnitude control the shift. Good for robustness to different light sources. This transform adjusts the color of an image to mimic the effect of different color temperatures of light sources, based on Planck's law of black body radiation. It can simulate the appearance of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts. PlanckianJitter vs. ColorJitter: PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases: 1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world color temperature changes. ColorJitter applies arbitrary color adjustments. 2. Natural effects: This transform produces color shifts that correspond to natural lighting variations, making it ideal for outdoor scene simulation or color constancy problems. 3. Single parameter: Color changes are controlled by a single, physically meaningful parameter (color temperature), unlike ColorJitter's multiple abstract parameters. 4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural light, whereas ColorJitter can make independent channel adjustments. When to use PlanckianJitter: - Simulating different times of day or lighting conditions in outdoor scenes - Augmenting data for computer vision tasks that need to be robust to natural lighting changes - Preparing synthetic data to better match real-world lighting variations - Color constancy research or applications - When you need physically plausible color variations rather than arbitrary color changes The logic behind PlanckianJitter: As the color temperature increases: 1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting. 2. Mid-range temperatures (around 5500K) correspond to daylight. 3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade. 
This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.
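This warm-to-cool progression can be illustrated with a toy numpy sketch. Note that `planckian_cast` and its 0.3 gain factor are hypothetical, a crude linear approximation for intuition only, not the transform's actual Planckian-locus computation:

```python
import numpy as np

def planckian_cast(img_uint8, temperature_k):
    # Hypothetical illustration: warm light (< 6500K) boosts red and
    # suppresses blue; cool light does the opposite. The gain is a crude
    # linear ramp around approximate daylight (6500K).
    shift = (6500.0 - temperature_k) / 6500.0   # > 0 for warm, < 0 for cool
    gains = np.array([1.0 + 0.3 * shift, 1.0, 1.0 - 0.3 * shift], dtype=np.float32)
    out = img_uint8.astype(np.float32) * gains
    return np.clip(out, 0, 255).astype(np.uint8)

gray = np.full((2, 2, 3), 128, dtype=np.uint8)
warm = planckian_cast(gray, 3000)    # reddish cast
cool = planckian_cast(gray, 12000)   # bluish cast
```

A neutral gray input comes out red-tinted at 3000K and blue-tinted at 12000K, mirroring the temperature progression described above.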

Parameters

NameTypeDefaultDescription
mode
One of:
  • 'blackbody'
  • 'cied'
blackbodyThe mode of the transformation. - "blackbody": Simulates blackbody radiation color changes. - "cied": Uses the CIE D illuminant series for color temperature simulation. Default: "blackbody"
temperature_limit
One of:
  • tuple[int, int]
  • None
-The range of color temperatures (in Kelvin) to sample from. - For "blackbody" mode: Should be within [3000K, 15000K]. Default: (3000, 15000) - For "cied" mode: Should be within [4000K, 15000K]. Default: (4000, 15000) If None, the default ranges will be used based on the selected mode. Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.
sampling_method
One of:
  • 'uniform'
  • 'gaussian'
uniformMethod to sample the temperature. - "uniform": Samples uniformly across the specified range. - "gaussian": Samples from a Gaussian distribution centered at 6500K (approximate daylight). Default: "uniform"
pfloat0.5Probability of applying the transform. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> transform = A.PlanckianJitter(mode="blackbody",
...                               temperature_limit=(3000, 9000),
...                               sampling_method="uniform",
...                               p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result["image"]

Notes

- The transform preserves the overall brightness of the image while shifting its color.
- The "blackbody" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.
- The "cied" mode is based on standard illuminants and may provide more realistic daylight variations.
- The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.
- Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated across channels, maintaining the natural appearance of the scene under different lighting conditions.

References

  • Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law
  • CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant
  • Color temperature: https://en.wikipedia.org/wiki/Color_temperature
  • Implementation inspired by: https://github.com/TheZino/PlanckianJitter

PlasmaBrightnessContrastclass

PlasmaBrightnessContrast(
    brightness_range: tuple[float, float] = (-0.3, 0.3),
    contrast_range: tuple[float, float] = (-0.3, 0.3),
    plasma_size: int = 256,
    roughness: float = 3.0,
    p: float = 0.5
)

Plasma fractal (Diamond-Square) pattern varies brightness and contrast spatially. brightness_range and contrast_range control strength. Organic, non-uniform look. Uses the Diamond-Square algorithm to generate organic-looking fractal patterns that create spatially varying brightness and contrast adjustments.

Parameters

NameTypeDefaultDescription
brightness_rangetuple[float, float](-0.3, 0.3)Range for brightness adjustment strength. Values between -1 and 1: - Positive values increase brightness - Negative values decrease brightness - 0 means no brightness change Default: (-0.3, 0.3)
contrast_rangetuple[float, float](-0.3, 0.3)Range for contrast adjustment strength. Values between -1 and 1: - Positive values increase contrast - Negative values decrease contrast - 0 means no contrast change Default: (-0.3, 0.3)
plasma_sizeint256Size of the initial plasma pattern grid. Larger values create more detailed patterns but are slower to compute. The pattern will be resized to match the input image dimensions. Default: 256
roughnessfloat3.0Controls how quickly the noise amplitude increases at each iteration. Must be greater than 0: - Low values (< 1.0): Smoother, more gradual pattern - Medium values (~2.0): Natural-looking pattern - High values (> 3.0): Very rough, noisy pattern Default: 3.0
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import albumentations as A
>>> import numpy as np

# Default parameters
>>> transform = A.PlasmaBrightnessContrast(p=1.0)

# Custom adjustments
>>> transform = A.PlasmaBrightnessContrast(
...     brightness_range=(-0.5, 0.5),
...     contrast_range=(-0.3, 0.3),
...     plasma_size=512,    # More detailed pattern
...     roughness=0.7,      # Smoother transitions
...     p=1.0
... )

Notes

- Works with any number of channels (grayscale, RGB, multispectral)
- The same plasma pattern is applied to all channels
- Operations are performed in float32 precision
- Final values are clipped to valid range [0, max_value]
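The Diamond-Square generation referred to above can be sketched as follows. This is a simplified toy version: `diamond_square` is a hypothetical helper, and the library's kernels, amplitude schedule, and resizing to the image dimensions differ.

```python
import numpy as np

def diamond_square(n_steps=5, roughness=2.0, seed=0):
    """Toy Diamond-Square plasma pattern on a (2**n_steps + 1) square grid."""
    rng = np.random.default_rng(seed)
    size = 2 ** n_steps + 1
    grid = np.zeros((size, size), dtype=np.float32)
    grid[0, 0], grid[0, -1], grid[-1, 0], grid[-1, -1] = rng.random(4)
    step, amp = size - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: each square's center = mean of its 4 corners + noise.
        for y in range(0, size - 1, step):
            for x in range(0, size - 1, step):
                avg = (grid[y, x] + grid[y, x + step]
                       + grid[y + step, x] + grid[y + step, x + step]) / 4.0
                grid[y + half, x + half] = avg + rng.uniform(-amp, amp)
        # Square step: each edge midpoint = mean of in-bounds neighbors + noise.
        for y in range(0, size, half):
            for x in range((y + half) % step, size, step):
                neighbors = [grid[ny, nx]
                             for ny, nx in ((y - half, x), (y + half, x),
                                            (y, x - half), (y, x + half))
                             if 0 <= ny < size and 0 <= nx < size]
                grid[y, x] = np.mean(neighbors) + rng.uniform(-amp, amp)
        step, amp = half, amp / roughness  # simplified amplitude schedule
    grid -= grid.min()
    return grid / grid.max() if grid.max() > 0 else grid

pattern = diamond_square()  # (33, 33) pattern normalized to [0, 1]
```

A pattern like this, resized to the image and mapped to brightness/contrast gains, produces the organic spatial variation this transform applies.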

References

  • Fournier, Fussell, and Carpenter, "Computer rendering of stochastic models," Communications of the ACM, 1982 (paper introducing the Diamond-Square algorithm)
  • Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

PlasmaShadowclass

PlasmaShadow(
    shadow_intensity_range: tuple[float, float] = (0.3, 0.7),
    plasma_size: int = 256,
    roughness: float = 3.0,
    p: float = 0.5
)

Plasma fractal (Diamond-Square) shadow: organic darkening. shadow_intensity_range and roughness control the effect. Good for natural shading and lighting variation. Creates organic-looking shadows using a plasma fractal noise pattern. The shadow intensity varies smoothly across the image, creating natural-looking darkening effects that can simulate shadows, shading, or lighting variations.

Parameters

NameTypeDefaultDescription
shadow_intensity_rangetuple[float, float](0.3, 0.7)Range for shadow intensity. Values between 0 and 1: - 0 means no shadow (original image) - 1 means maximum darkening (black) - Values between create partial shadows Default: (0.3, 0.7)
plasma_sizeint256Size of the initial plasma pattern grid. Larger values create more detailed patterns but are slower to compute. The pattern will be resized to match the input image dimensions. Default: 256
roughnessfloat3.0Controls how quickly the noise amplitude increases at each iteration. Must be greater than 0: - Low values (< 1.0): Smoother, more gradual shadows - Medium values (~2.0): Natural-looking shadows - High values (> 3.0): Very rough, noisy shadows Default: 3.0
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import albumentations as A
>>> import numpy as np

# Default parameters for natural shadows
>>> transform = A.PlasmaShadow(p=1.0)

# Subtle, smooth shadows
>>> transform = A.PlasmaShadow(
...     shadow_intensity_range=(0.1, 0.3),
...     roughness=0.7,
...     p=1.0
... )

# Dramatic, detailed shadows
>>> transform = A.PlasmaShadow(
...     shadow_intensity_range=(0.5, 0.9),
...     roughness=0.3,
...     p=1.0
... )

Notes

- The transform darkens the image using a plasma pattern
- Works with any number of channels (grayscale, RGB, multispectral)
- Shadow pattern is generated using the Diamond-Square algorithm with specific kernels
- The same shadow pattern is applied to all channels
- Final values are clipped to valid range [0, max_value]

References

  • Fournier, Fussell, and Carpenter, "Computer rendering of stochastic models," Communications of the ACM, 1982 (paper introducing the Diamond-Square algorithm)
  • Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

Posterizeclass

Posterize(
    num_bits: int | tuple[int, int] | list[tuple[int, int]] = 4,
    p: float = 0.5
)

Reduce bits per color channel (e.g. 8→4). num_bits controls strength; lower gives stronger posterization. Simulates low-bit-depth or compression. This transform applies color posterization, a technique that reduces the number of distinct colors used in an image. It works by lowering the number of bits used to represent each color channel, effectively creating a "poster-like" effect with fewer color gradations.

Parameters

NameTypeDefaultDescription
num_bits
One of:
  • int
  • tuple[int, int]
  • list[tuple[int, int]]
4Defines the number of bits to keep for each color channel. Can be specified in several ways: - Single int: Same number of bits for all channels. Range: [1, 7]. - tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7]. - list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits]. - list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)]. Default: 4
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

# Posterize all channels to 3 bits
>>> transform = A.Posterize(num_bits=3, p=1.0)
>>> posterized_image = transform(image=image)["image"]

# Randomly posterize between 2 and 5 bits
>>> transform = A.Posterize(num_bits=(2, 5), p=1.0)
>>> posterized_image = transform(image=image)["image"]

# Different bits for each channel
>>> transform = A.Posterize(num_bits=[3, 5, 2], p=1.0)
>>> posterized_image = transform(image=image)["image"]

# Range of bits for each channel
>>> transform = A.Posterize(num_bits=[(1, 3), (3, 5), (2, 4)], p=1.0)
>>> posterized_image = transform(image=image)["image"]

Notes

- The effect becomes more pronounced as the number of bits is reduced.
- This transform can create interesting artistic effects or be used for image compression simulation.
- Posterization is particularly useful for:
  * Creating stylized or retro-looking images
  * Reducing the color palette for specific artistic effects
  * Simulating the look of older or lower-quality digital images
  * Data augmentation in scenarios where color depth might vary
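For uint8 images, keeping the top n bits is a simple bitwise mask; a minimal sketch (`posterize_uint8` is an illustrative helper, not the library function):

```python
import numpy as np

def posterize_uint8(img, bits):
    # Keep only the top `bits` bits of each 8-bit channel value,
    # zeroing the low (8 - bits) bits.
    mask = np.uint8(256 - 2 ** (8 - bits))
    return img & mask

img = np.array([[200, 77, 255]], dtype=np.uint8)
out = posterize_uint8(img, bits=3)  # mask 0b11100000
```

With 3 bits, 200 (0b11001000) becomes 192, 77 becomes 64, and 255 becomes 224, leaving only 8 distinct levels per channel.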

References

  • Color Quantization: https://en.wikipedia.org/wiki/Color_quantization
  • Posterization: https://en.wikipedia.org/wiki/Posterization

RGBShiftclass

RGBShift(
    r_shift_limit: tuple[float, float] | float = (-20, 20),
    g_shift_limit: tuple[float, float] | float = (-20, 20),
    b_shift_limit: tuple[float, float] | float = (-20, 20),
    p: float = 0.5
)

Shift R, G, B with separate ranges. Specialized AdditiveNoise with constant uniform shifts. Params: r_shift_limit, g_shift_limit, b_shift_limit. A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels. Each channel (R,G,B) can have its own shift range specified.

Parameters

NameTypeDefaultDescription
r_shift_limit
One of:
  • tuple[float, float]
  • float
(-20, 20)Range for shifting the red channel. Options: - If tuple (min, max): Sample shift value from this range - If int: Sample shift value from (-r_shift_limit, r_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
g_shift_limit
One of:
  • tuple[float, float]
  • float
(-20, 20)Range for shifting the green channel. Options: - If tuple (min, max): Sample shift value from this range - If int: Sample shift value from (-g_shift_limit, g_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
b_shift_limit
One of:
  • tuple[float, float]
  • float
(-20, 20)Range for shifting the blue channel. Options: - If tuple (min, max): Sample shift value from this range - If int: Sample shift value from (-b_shift_limit, b_shift_limit) - For uint8 images: Values represent absolute shifts in [0, 255] - For float images: Values represent relative shifts in [0, 1] Default: (-20, 20)
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A

# Shift RGB channels of uint8 image
>>> transform = A.RGBShift(
...     r_shift_limit=30,  # Will sample red shift from [-30, 30]
...     g_shift_limit=(-20, 20),  # Will sample green shift from [-20, 20]
...     b_shift_limit=(-10, 10),  # Will sample blue shift from [-10, 10]
...     p=1.0
... )
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> shifted = transform(image=image)["image"]

# Same effect using AdditiveNoise
>>> transform = A.AdditiveNoise(
...     noise_type="uniform",
...     spatial_mode="constant",  # One value per channel
...     noise_params={
...         "ranges": [(-30/255, 30/255), (-20/255, 20/255), (-10/255, 10/255)]
...     },
...     p=1.0
... )

Notes

- Values are shifted independently for each channel
- For uint8 images:
  * Input ranges like (-20, 20) represent pixel value shifts
  * A shift of 20 means adding 20 to that channel
  * Final values are clipped to [0, 255]
- For float32 images:
  * Input ranges like (-0.1, 0.1) represent relative shifts
  * A shift of 0.1 means adding 0.1 to that channel
  * Final values are clipped to [0, 1]
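The per-channel shift-and-clip behavior for uint8 images can be sketched directly in numpy (illustrative, not the library's implementation):

```python
import numpy as np

# One sampled shift per channel, applied with clipping to [0, 255].
img = np.full((2, 2, 3), 250, dtype=np.uint8)
shifts = np.array([20, 0, -30], dtype=np.int16)  # e.g. sampled r/g/b shifts
shifted = np.clip(img.astype(np.int16) + shifts, 0, 255).astype(np.uint8)
# red saturates at 255, green is unchanged, blue drops to 220
```

Casting to a wider signed type before adding avoids uint8 wraparound, which is why the clip happens in int16.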

RandomBrightnessContrastclass

RandomBrightnessContrast(
    brightness_limit: tuple[float, float] | float = (-0.2, 0.2),
    contrast_limit: tuple[float, float] | float = (-0.2, 0.2),
    brightness_by_max: bool = True,
    ensure_safe_range: bool = False,
    p: float = 0.5
)

Randomly adjust brightness and contrast with separate ranges. Simple and fast; good baseline color augmentation for classification and detection. This transform adjusts the brightness and contrast of an image simultaneously, allowing for a wide range of lighting and contrast variations. It's particularly useful for data augmentation in computer vision tasks, helping models become more robust to different lighting conditions.

Parameters

NameTypeDefaultDescription
brightness_limit
One of:
  • tuple[float, float]
  • float
(-0.2, 0.2)Factor range for changing brightness. If a single float value is provided, the range will be (-brightness_limit, brightness_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum brightness, and -1.0 means minimum brightness. Default: (-0.2, 0.2).
contrast_limit
One of:
  • tuple[float, float]
  • float
(-0.2, 0.2)Factor range for changing contrast. If a single float value is provided, the range will be (-contrast_limit, contrast_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast. Default: (-0.2, 0.2).
brightness_by_maxboolTrueIf True, adjusts brightness by scaling pixel values up to the maximum value of the image's dtype. If False, uses the mean pixel value for adjustment. Default: True.
ensure_safe_rangeboolFalseIf True, adjusts alpha and beta to prevent overflow/underflow. This ensures output values stay within the valid range for the image dtype without clipping. Default: False.
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

# Default usage
>>> transform = A.RandomBrightnessContrast(p=1.0)
>>> augmented_image = transform(image=image)["image"]

# Custom brightness and contrast limits
>>> transform = A.RandomBrightnessContrast(
...     brightness_limit=0.3,
...     contrast_limit=0.3,
...     p=1.0
... )
>>> augmented_image = transform(image=image)["image"]

# Adjust brightness based on mean value
>>> transform = A.RandomBrightnessContrast(
...     brightness_limit=0.2,
...     contrast_limit=0.2,
...     brightness_by_max=False,
...     p=1.0
... )
>>> augmented_image = transform(image=image)["image"]

Notes

- The order of operations is: contrast adjustment, then brightness adjustment.
- For uint8 images, the output is clipped to the [0, 255] range.
- For float32 images, the output is clipped to the [0, 1] range.
- The `brightness_by_max` parameter affects how brightness is adjusted:
  * If True, brightness adjustment is more pronounced and can lead to more saturated results.
  * If False, brightness adjustment is more subtle and preserves the overall lighting better.
- This transform is useful for:
  * Simulating different lighting conditions
  * Enhancing low-light or overexposed images
  * Data augmentation to improve model robustness
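The documented order of operations (contrast as a multiplicative factor, then brightness as an additive offset) can be sketched as follows. `brightness_contrast` is a hypothetical reimplementation for intuition; the library's exact factor computation may differ.

```python
import numpy as np

def brightness_contrast(img, brightness, contrast, brightness_by_max=True):
    # Contrast scales pixel values; brightness adds an offset whose scale
    # depends on brightness_by_max (dtype max vs. image mean).
    alpha = 1.0 + contrast
    beta = brightness * (255.0 if brightness_by_max else float(img.mean()))
    out = img.astype(np.float32) * alpha + beta
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((2, 2, 3), 100, dtype=np.uint8)
out = brightness_contrast(img, brightness=0.2, contrast=0.1)
# 100 * 1.1 + 0.2 * 255 = 161
```

With `brightness_by_max=False`, beta would be `0.2 * 100 = 20` for this constant image, giving the subtler adjustment the Notes describe.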

References

  • Brightness: https://en.wikipedia.org/wiki/Brightness
  • Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)

RandomGammaclass

RandomGamma(
    gamma_limit: tuple[float, float] | float = (80, 120),
    p: float = 0.5
)

Apply random gamma correction (power-law on intensity). gamma_limit controls range. Common for exposure and display variation. Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance or tristimulus values in imaging systems. This transform can adjust the brightness of an image while preserving the relative differences between darker and lighter areas, making it useful for simulating different lighting conditions or correcting for display characteristics.

Parameters

NameTypeDefaultDescription
gamma_limit
One of:
  • tuple[float, float]
  • float
(80, 120)If gamma_limit is a single float value, the range will be (1, gamma_limit). If it's a tuple of two floats, they will serve as the lower and upper bounds for gamma adjustment. Values are in terms of percentage change, e.g., (80, 120) means the gamma will be between 80% and 120% of the original. Default: (80, 120).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

# Default usage
>>> transform = A.RandomGamma(p=1.0)
>>> augmented_image = transform(image=image)["image"]

# Custom gamma range
>>> transform = A.RandomGamma(gamma_limit=(50, 150), p=1.0)
>>> augmented_image = transform(image=image)["image"]

# Applying with other transforms
>>> transform = A.Compose([
...     A.RandomGamma(gamma_limit=(80, 120), p=0.5),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> augmented_image = transform(image=image)["image"]

Notes

- The gamma correction is applied using the formula: output = input^gamma
- Gamma values > 1 will make the image darker, while values < 1 will make it brighter
- This transform is particularly useful for:
  * Simulating different lighting conditions
  * Correcting for non-linear display characteristics
  * Enhancing contrast in certain regions of the image
  * Data augmentation in computer vision tasks
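The percentage convention for gamma_limit and the formula output = input^gamma can be sketched for uint8 images (the `apply_gamma` helper is illustrative):

```python
import numpy as np

def apply_gamma(img, gamma_percent):
    # gamma_limit values are percentages: 150 -> gamma = 1.5 (darker),
    # 50 -> gamma = 0.5 (brighter). The power law is applied on [0, 1].
    gamma = gamma_percent / 100.0
    x = img.astype(np.float32) / 255.0
    return np.clip((x ** gamma) * 255.0, 0, 255).astype(np.uint8)

img = np.full((2, 2), 128, dtype=np.uint8)
darker = apply_gamma(img, 150)    # gamma > 1 darkens mid-tones
brighter = apply_gamma(img, 50)   # gamma < 1 brightens mid-tones
```

Black (0) and white (255) are fixed points of the power law; only intermediate tones move.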

References

  • Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction
  • Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm

RandomToneCurveclass

RandomToneCurve(
    scale: float = 0.1,
    per_channel: bool = False,
    p: float = 0.5
)

Randomly warp the tone curve to change contrast and tonal distribution. scale controls strength. Good for exposure variation. This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast in a non-linear manner. It can be applied to the entire image or to each channel separately.

Parameters

NameTypeDefaultDescription
scalefloat0.1Standard deviation of the normal distribution used to sample random distances to move two control points that modify the image's curve. Values should be in range [0, 1]. Higher values will result in more dramatic changes to the image. Default: 0.1
per_channelboolFalseIf True, the tone curve will be applied to each channel of the input image separately, which can lead to color distortion. If False, the same curve is applied to all channels, preserving the original color relationships. Default: False
pfloat0.5Probability of applying the transform. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Apply a random tone curve to all channels together
>>> transform = A.RandomToneCurve(scale=0.1, per_channel=False, p=1.0)
>>> augmented_image = transform(image=image)['image']

# Apply random tone curves to each channel separately
>>> transform = A.RandomToneCurve(scale=0.2, per_channel=True, p=1.0)
>>> augmented_image = transform(image=image)['image']

Notes

- This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.
- The S-curve is defined by moving two control points of a quadratic Bézier curve.
- When per_channel is False, the same curve is applied to all channels, maintaining color balance.
- When per_channel is True, different curves are applied to each channel, which can create color shifts.
- This transform can be used to adjust image contrast and brightness in a more natural way than linear transforms.
- The effect can range from subtle contrast adjustments to more dramatic "vintage" or "faded" looks.
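A minimal sketch of an S-shaped Bézier-style tone curve driven by two control values. This is hypothetical: the transform samples its control-point offsets from a normal distribution with standard deviation `scale`, and its exact curve parameterization may differ.

```python
import numpy as np

def tone_curve(img, low_y=0.25, high_y=0.75):
    # Map normalized intensity t through a Bezier-style curve anchored at
    # (0, 0) and (1, 1), shaped by two control values low_y and high_y.
    t = img.astype(np.float32) / 255.0
    curve = (3 * (1 - t) ** 2 * t * low_y
             + 3 * (1 - t) * t ** 2 * high_y
             + t ** 3)
    return np.clip(curve * 255.0, 0, 255).astype(np.uint8)

img = np.arange(0, 256, dtype=np.uint8).reshape(16, 16)
out = tone_curve(img)
```

Because the endpoints are fixed at (0, 0) and (1, 1), pure black and white are preserved while the mid-tone mapping bends, which is what reshapes the histogram.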

References

  • "What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance": https://arxiv.org/abs/1912.06960
  • Bézier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves
  • Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping

Solarizeclass

Solarize(
    threshold_range: tuple[float, float] = (0.5, 0.5),
    p: float = 0.5
)

Invert pixel values above a threshold. threshold_range controls cutoff. Strong highlight inversion; useful for data augmentation. This transform applies a solarization effect to the input image. Solarization is a phenomenon in photography in which the image recorded on a negative or on a photographic print is wholly or partially reversed in tone. Dark areas appear light or light areas appear dark. In this implementation, all pixel values above a threshold are inverted.

Parameters

NameTypeDefaultDescription
threshold_rangetuple[float, float](0.5, 0.5)Range for solarizing threshold as a fraction of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the maximum value of the image type (255 for uint8 images or 1.0 for float images). Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>>
# Solarize uint8 image with fixed threshold at 50% of max value (127.5)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)
>>> solarized_image = transform(image=image)['image']
>>>
# Solarize uint8 image with random threshold between 40-60% of max value (102-153)
>>> transform = A.Solarize(threshold_range=(0.4, 0.6), p=1.0)
>>> solarized_image = transform(image=image)['image']
>>>
# Solarize float32 image at 50% of max value (0.5)
>>> image = np.random.rand(100, 100, 3).astype(np.float32)
>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)
>>> solarized_image = transform(image=image)['image']

Notes

- For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value
- For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value
- The threshold is applied to each channel independently
- The threshold is calculated in two steps:
  1. Sample a value from threshold_range
  2. Multiply by the image's maximum value:
     * For uint8: threshold = sampled_value * 255
     * For float32: threshold = sampled_value * 1.0
- This transform can create interesting artistic effects or be used for data augmentation
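The two-step threshold computation and the inversion rule can be sketched as follows (`solarize` is an illustrative helper that inverts values at or above the threshold):

```python
import numpy as np

def solarize(img, threshold_fraction=0.5):
    # Step 1 of the transform samples threshold_fraction from threshold_range;
    # step 2 scales it by the dtype's maximum value.
    max_val = 255 if img.dtype == np.uint8 else 1.0
    threshold = threshold_fraction * max_val
    return np.where(img >= threshold, max_val - img, img).astype(img.dtype)

img = np.array([[100, 200]], dtype=np.uint8)
out = solarize(img)  # 100 is below 127.5 and stays; 200 becomes 255 - 200 = 55
```

The same helper works on float32 input, where the threshold and inversion use 1.0 as the maximum value.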

ToGrayclass

ToGray(
    num_output_channels: int = 3,
    method: 'weighted_average' | 'from_lab' | 'desaturation' | 'average' | 'max' | 'pca' = weighted_average,
    p: float = 0.5
)

Convert to grayscale (weighted by channel weights). Optionally replicate to keep shape. Useful for grayscale training or channel reduction. This transform first converts a color image to a single-channel grayscale image using various methods, then replicates the grayscale channel if num_output_channels is greater than 1.

Parameters

NameTypeDefaultDescription
num_output_channelsint3The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.
method
One of:
  • 'weighted_average'
  • 'from_lab'
  • 'desaturation'
  • 'average'
  • 'max'
  • 'pca'
weighted_averageThe method used for grayscale conversion: - "weighted_average": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception. - "from_lab": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results. - "desaturation": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well. - "average": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results. - "max": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results. - "pca": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.
pfloat0.5Probability of applying the transform. Default: 0.5.

Returns

  • np.ndarray: Grayscale image with the specified number of channels.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create a sample color image with distinct RGB values
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> # Red square in top-left
>>> image[10:40, 10:40, 0] = 200
>>> # Green square in top-right
>>> image[10:40, 60:90, 1] = 200
>>> # Blue square in bottom-left
>>> image[60:90, 10:40, 2] = 200
>>> # Yellow square in bottom-right (Red + Green)
>>> image[60:90, 60:90, 0] = 200
>>> image[60:90, 60:90, 1] = 200
>>>
>>> # Example 1: Default conversion (weighted average, 3 channels)
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> # Output has 3 duplicate channels with values based on RGB perception weights
>>> # R=0.299, G=0.587, B=0.114
>>> assert gray_image.shape == (100, 100, 3)
>>> assert np.allclose(gray_image[:, :, 0], gray_image[:, :, 1])
>>> assert np.allclose(gray_image[:, :, 1], gray_image[:, :, 2])
>>>
>>> # Example 2: Single-channel output
>>> transform = A.ToGray(num_output_channels=1, p=1.0)
>>> result = transform(image=image)
>>> gray_image = result['image']
>>> assert gray_image.shape == (100, 100, 1)
>>>
>>> # Example 3: Using different conversion methods
>>> # "desaturation" method (min+max)/2
>>> transform_desaturate = A.ToGray(
...     method="desaturation",
...     p=1.0
... )
>>> result = transform_desaturate(image=image)
>>> gray_desaturate = result['image']
>>>
>>> # "from_lab" method (using L channel from LAB colorspace)
>>> transform_lab = A.ToGray(
...     method="from_lab",
...     p=1.0
... )
>>> result = transform_lab(image=image)
>>> gray_lab = result['image']
>>>
>>> # "average" method (simple average of channels)
>>> transform_avg = A.ToGray(
...     method="average",
...     p=1.0
... )
>>> result = transform_avg(image=image)
>>> gray_avg = result['image']
>>>
>>> # "max" method (takes max value across channels)
>>> transform_max = A.ToGray(
...     method="max",
...     p=1.0
... )
>>> result = transform_max(image=image)
>>> gray_max = result['image']
>>>
>>> # Example 4: Using grayscale in an augmentation pipeline
>>> pipeline = A.Compose([
...     A.ToGray(p=0.5),           # 50% chance of grayscale conversion
...     A.RandomBrightnessContrast(p=1.0)  # Always apply brightness/contrast
... ])
>>> result = pipeline(image=image)
>>> augmented_image = result['image']  # May be grayscale or color
>>>
>>> # Example 5: Converting float32 image
>>> float_image = image.astype(np.float32) / 255.0  # Range [0, 1]
>>> transform = A.ToGray(p=1.0)
>>> result = transform(image=float_image)
>>> gray_float_image = result['image']
>>> assert gray_float_image.dtype == np.float32
>>> assert gray_float_image.max() <= 1.0

Notes

- The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
- "weighted_average" and "from_lab" are typically used in image processing and computer vision applications where accurate representation of human perception is important.
- "desaturation" and "average" are often used in simple image manipulation tools or when computational speed is a priority.
- The "max" method can be useful in scenarios where preserving bright features is important, such as in some medical imaging applications.
- "pca" might be used in advanced image analysis tasks or when dealing with hyperspectral images.
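The simpler conversion formulas named above can be computed side by side on one pixel for comparison (illustrative; "from_lab" and "pca" are omitted since they require a colorspace conversion and a decomposition, respectively):

```python
import numpy as np

rgb = np.array([[[200, 100, 50]]], dtype=np.float32)  # one orange pixel

# "weighted_average": perceptual weights 0.299R + 0.587G + 0.114B
weighted = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
# "desaturation": midpoint of the channel-wise max and min
desaturation = (rgb.max(axis=-1) + rgb.min(axis=-1)) / 2.0
# "average": simple mean of all channels
average = rgb.mean(axis=-1)
# "max": brightest channel wins
maximum = rgb.max(axis=-1)
```

The four methods give noticeably different grays for the same pixel (about 124, 125, 117, and 200 here), which is why the choice of `method` matters.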

ToRGBclass

ToRGB(
    num_output_channels: int = 3,
    p: float = 1.0
)

Convert grayscale image to RGB by replicating the single channel to three. No color information added; use when a model expects 3-channel input.

Parameters

NameTypeDefaultDescription
num_output_channelsint3The number of channels in the output image. Default: 3.
pfloat1.0Probability of applying the transform. Default: 1.0.

Examples

>>> import numpy as np
>>> import albumentations as A
>>>
>>> # Convert a grayscale image to RGB
>>> transform = A.Compose([A.ToRGB(p=1.0)])
>>> grayscale_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
>>> rgb_image = transform(image=grayscale_image)['image']
>>> assert rgb_image.shape == (100, 100, 3)

Notes

- For single-channel (grayscale) images, the channel is replicated to create an RGB image.
- If the input is already a 3-channel RGB image, it is returned unchanged.
- This transform does not change the data type of the image (e.g., uint8 remains uint8).
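The channel replication described above amounts to a single numpy repeat (an illustrative sketch of the documented behavior, not the library's code):

```python
import numpy as np

gray = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
# Replicate the single channel along a new last axis to get 3 channels.
rgb = np.repeat(gray[..., np.newaxis], 3, axis=-1)
```

All three output channels are identical, so no color information is added; the image merely satisfies a 3-channel input requirement.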

ToSepiaclass

ToSepia(
    p: float = 0.5
)

Apply sepia (brownish vintage) filter via fixed color matrix. Good for style or temporal variation in datasets. This transform converts a color image to a sepia tone, giving it a warm, brownish tint that is reminiscent of old photographs. The sepia effect is achieved by applying a specific color transformation matrix to the RGB channels of the input image. For grayscale images, the transform is a no-op and returns the original image.

Parameters

NameTypeDefaultDescription
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>>
# Apply sepia effect to a uint8 RGB image
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ToSepia(p=1.0)
>>> sepia_image = transform(image=image)['image']
>>> assert sepia_image.shape == image.shape
>>> assert sepia_image.dtype == np.uint8
>>>
# Apply sepia effect to a float32 RGB image
>>> image = np.random.rand(100, 100, 3).astype(np.float32)
>>> transform = A.ToSepia(p=1.0)
>>> sepia_image = transform(image=image)['image']
>>> assert sepia_image.shape == image.shape
>>> assert sepia_image.dtype == np.float32
>>> assert 0 <= sepia_image.min() <= sepia_image.max() <= 1.0
>>>
# No effect on grayscale images
>>> gray_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
>>> transform = A.ToSepia(p=1.0)
>>> result = transform(image=gray_image)['image']
>>> assert np.array_equal(result, gray_image)

Notes

- The sepia effect only works with RGB images (3 channels). For grayscale images, the original image is returned unchanged since the sepia transformation would have no visible effect when R=G=B.
- The sepia effect is created using a fixed color transformation matrix:
  [[0.393, 0.769, 0.189],
   [0.349, 0.686, 0.168],
   [0.272, 0.534, 0.131]]
- The output image will have the same data type as the input image.
- For float32 images, ensure the input values are in the range [0, 1].
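Applying the fixed matrix directly shows why bright pixels clip: its first two rows sum to more than 1. A sketch (the `to_sepia` helper is illustrative, not the library function):

```python
import numpy as np

SEPIA_MATRIX = np.array([[0.393, 0.769, 0.189],
                         [0.349, 0.686, 0.168],
                         [0.272, 0.534, 0.131]], dtype=np.float32)

def to_sepia(img_uint8):
    # Each output channel is a weighted mix of the input R, G, B channels,
    # followed by clipping back to the uint8 range.
    out = img_uint8.astype(np.float32) @ SEPIA_MATRIX.T
    return np.clip(out, 0, 255).astype(np.uint8)

white = np.full((1, 1, 3), 255, dtype=np.uint8)
sepia_white = to_sepia(white)  # R and G rows sum above 1 and clip at 255
```

Pure white maps to a warm off-white: the red and green channels saturate at 255 while the blue channel (row sum 0.937) lands around 238.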

Vignettingclass

Vignetting(
    intensity_range: tuple[float, float] = (0.2, 0.5),
    center_range: tuple[float, float] = (0.3, 0.7),
    p: float = 0.5
)

Darken corners with a radial (elliptical) gradient. Simulates lens vignetting or natural light falloff. Use for lens realism or stylistic darkening. Center of the image stays bright; corners and edges are darkened. Center position can be jittered for variety.

Parameters

NameTypeDefaultDescription
intensity_rangetuple[float, float](0.2, 0.5)Darkening at corners: 0 = no effect, 1 = black. Default: (0.2, 0.5).
center_rangetuple[float, float](0.3, 0.7)Range for vignette center as fraction of width/height. (0.5, 0.5) = image center. Default: (0.3, 0.7).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> transform = A.Vignetting(intensity_range=(0.2, 0.5), p=1.0)
>>> result = transform(image=image)["image"]

Notes

- Elliptical gradient centered at a random point (within center_range).
- Quadratic falloff from center to edges.
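The elliptical gradient with quadratic falloff can be sketched as follows. The `vignette` helper and its mask formula are illustrative, not the transform's exact implementation; working in width/height-normalized coordinates is what makes the circular falloff elliptical in pixel space.

```python
import numpy as np

def vignette(img, intensity=0.4, cx=0.5, cy=0.5):
    # Quadratic radial falloff from a (cx, cy) center given as fractions
    # of width/height; mask is 1.0 at the center, 1 - intensity at the
    # farthest corner.
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    r2 = ((xx / w) - cx) ** 2 + ((yy / h) - cy) ** 2
    mask = 1.0 - intensity * (r2 / r2.max())
    return np.clip(img.astype(np.float32) * mask[..., None], 0, 255).astype(np.uint8)

img = np.full((100, 100, 3), 200, dtype=np.uint8)
out = vignette(img)  # center keeps 200, corners are darkened
```

Sampling `cx`/`cy` from center_range, as the transform does, moves the bright spot away from the exact image center for variety.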