albumentations.augmentations.pixel.transforms


Reduce colors via dithering: ordered Bayer, error diffusion, or random. Key parameters: n_colors, method. Good for retro look or limited-color output.

Ditheringclass

Dithering(
    method: 'random' | 'ordered' | 'error_diffusion' = 'error_diffusion',
    n_colors: int = 2,
    color_mode: 'grayscale' | 'per_channel' = 'grayscale',
    error_diffusion_algorithm: 'floyd_steinberg' | 'jarvis' | 'stucki' | 'atkinson' | 'burkes' | 'sierra' | 'sierra_2row' | 'sierra_lite' = 'floyd_steinberg',
    bayer_matrix_size: 2 | 4 | 8 | 16 = 4,
    serpentine: bool = False,
    noise_range: tuple[float, float] = (-0.5, 0.5),
    p: float = 0.5
)

Reduce colors via dithering: ordered Bayer, error diffusion, or random. Key parameters: n_colors, method. Good for retro look or limited-color output. Dithering is like creating a newspaper photo - it uses patterns of dots to create the illusion of more colors than are actually present. When you have a limited color palette (like only black and white), dithering arranges these limited colors in patterns that trick your eye into seeing intermediate shades. Think of it like pointillist paintings - up close you see individual dots, but from a distance they blend together to create smooth gradients and subtle color variations. This transform works with any number of channels - it processes each channel independently, whether you have a standard RGB image (3 channels), RGBA with transparency (4 channels), multispectral satellite imagery (dozens of channels), or even single-channel grayscale images.

Parameters

NameTypeDefaultDescription
method
One of:
  • 'random'
  • 'ordered'
  • 'error_diffusion'
error_diffusionWhich dithering algorithm to use. Each has different characteristics: - "random": Adds random noise before quantization. Creates a grainy, film-like texture. Good for artistic effects or simulating old photographs. - "ordered": Uses a repeating pattern (Bayer matrix) to decide which pixels to darken. Creates distinctive crosshatch patterns. Fast and predictable. Common in old computer graphics and newspaper printing. - "error_diffusion": Most sophisticated method. When a pixel is made darker or lighter than it should be, the "error" is spread to neighboring pixels. Creates the most natural-looking results. Like using a fine brush. Default: "error_diffusion"
n_colorsint2How many different color levels to keep per channel. Must be between 2 and 256. - 2 = only black and white (or min/max values for each channel) - 4 = 4 levels of gray (or 4 levels per color channel) - 16 = 16 shades, creating a retro computer graphics look - 256 = full range, no reduction (but patterns still visible from dithering process) Lower values create more dramatic effects. Default: 2
color_mode
One of:
  • 'grayscale'
  • 'per_channel'
grayscaleHow to handle color channels: - "per_channel": Each color channel (R, G, B, etc.) is dithered separately. Maintains color relationships but each channel gets its own pattern. Works with any number of channels. - "grayscale": First converts the image to grayscale (using standard luminance weights), then applies dithering, then expands back to the original number of channels. All color information is lost, but the dithering pattern is consistent across channels. Default: "grayscale"
error_diffusion_algorithm
One of:
  • 'floyd_steinberg'
  • 'jarvis'
  • 'stucki'
  • 'atkinson'
  • 'burkes'
  • 'sierra'
  • 'sierra_2row'
  • 'sierra_lite'
floyd_steinbergUsed only in "error_diffusion" method. Which specific algorithm: - "floyd_steinberg": The classic, invented in 1976. Spreads error to 4 neighbors. Good balance of quality and speed. Industry standard. - "jarvis": Jarvis-Judice-Ninke algorithm. Spreads error to 12 neighbors. Higher quality but 3x slower than Floyd-Steinberg. - "stucki": Similar to Jarvis but with different weights. Also 12 neighbors. - "atkinson": Created by Bill Atkinson for original Macintosh. Only spreads 75% of error, creating lighter images with more contrast. - "burkes": Spreads to 7 neighbors. Faster than Jarvis, better than Floyd-Steinberg. - "sierra": Spreads to 10 neighbors. Good quality, moderate speed. - "sierra_2row": Simplified Sierra using only 2 rows. Faster. - "sierra_lite": Minimal Sierra using only 3 neighbors. Very fast. Default: "floyd_steinberg"
bayer_matrix_size
One of:
  • 2
  • 4
  • 8
  • 16
4Used only in "ordered" method. The size of the repeating pattern (2, 4, 8, or 16). - 2x2: Very visible checkerboard pattern - 4x4: Standard, good balance - 8x8: Finer pattern, less visible - 16x16: Very fine pattern, almost noise-like Default: 4
serpentineboolFalseUsed only in "error_diffusion" method. Whether to process rows in alternating directions (left-to-right, then right-to-left). This can reduce visible "worm" artifacts that sometimes appear as diagonal lines. Slightly slower. Default: False
noise_rangetuple[float, float](-0.5, 0.5)Used only in "random" method. How much random noise to add before quantization. Larger range = more variation in the dithering pattern. Range: (-1.0, 1.0). Default: (-0.5, 0.5)
pfloat0.5Probability of applying this transform. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Black and white dithering with Floyd-Steinberg
>>> transform = A.Compose([
...     A.Dithering(
...         method="error_diffusion",
...         n_colors=2,
...         error_diffusion_algorithm="floyd_steinberg",
...         color_mode="grayscale",
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Black and white dithered image
>>>
>>> # Ordered dithering with 16 colors per channel
>>> transform = A.Compose([
...     A.Dithering(
...         method="ordered",
...         n_colors=16,
...         bayer_matrix_size=8,
...         color_mode="per_channel",
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Reduced color depth with Bayer pattern
>>>
>>> # Random dithering
>>> transform = A.Compose([
...     A.Dithering(
...         method="random",
...         n_colors=4,
...         noise_range=(-0.3, 0.3),
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Noisy dithered appearance
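For intuition, the error-diffusion idea is easy to sketch in numpy. Below is a toy black-and-white Floyd-Steinberg pass using the classic 7/16, 3/16, 5/16, 1/16 neighbor weights; it is a simplified illustration, not the library's implementation:

```python
import numpy as np

def floyd_steinberg_bw(gray: np.ndarray) -> np.ndarray:
    """Toy Floyd-Steinberg dithering to 2 levels (0 and 255).

    `gray` is a 2D uint8 array; returns a 2D uint8 array of 0/255.
    """
    img = gray.astype(np.float64)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0
            img[y, x] = new
            err = old - new
            # Spread the quantization error to unvisited neighbors:
            # right 7/16, below-left 3/16, below 5/16, below-right 1/16
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return img.clip(0, 255).astype(np.uint8)

# A horizontal gradient dithers to a pattern whose local density
# tracks the original brightness
gray = np.tile(np.linspace(0, 255, 64, dtype=np.uint8), (16, 1))
bw = floyd_steinberg_bw(gray)
```

Even though the output contains only two values, its mean stays close to the input's mean - the diffused error is what preserves the overall tone.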

References

  • Wikipedia: https://en.wikipedia.org/wiki/Dither
  • Floyd-Steinberg dithering: https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dithering
  • Ordered dithering: https://en.wikipedia.org/wiki/Ordered_dithering
  • Error diffusion dithering: https://en.wikipedia.org/wiki/Error_diffusion

Embossclass

Emboss(
    alpha: tuple[float, float] = (0.2, 0.5),
    strength: tuple[float, float] = (0.2, 0.7),
    p: float = 0.5
)

Apply emboss effect (directional highlight and shadow). alpha and strength control intensity. Pseudo-3D look; for texture or style augmentation. This transform creates an emboss effect by highlighting edges and creating a 3D-like texture in the image. It works by applying a specific convolution kernel to the image that emphasizes differences in adjacent pixel values.

Parameters

NameTypeDefaultDescription
alphatuple[float, float](0.2, 0.5)Range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Values should be in the range [0, 1]. Alpha will be randomly selected from this range for each image. Default: (0.2, 0.5)
strengthtuple[float, float](0.2, 0.7)Range to choose the strength of the embossing effect. Higher values create a more pronounced 3D effect. Values should be non-negative. Strength will be randomly selected from this range for each image. Default: (0.2, 0.7)
pfloat0.5Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)
>>> result = transform(image=image)
>>> embossed_image = result['image']

Notes

- The emboss effect is created using a 3x3 convolution kernel.
- The 'alpha' parameter controls the blend between the original image and the embossed version. A higher alpha value will result in a more pronounced emboss effect.
- The 'strength' parameter affects the intensity of the embossing. Higher strength values will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.
- This transform can be useful for creating artistic effects or for data augmentation in tasks where edge information is important.
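The alpha-blended kernel construction can be sketched as below. The effect-kernel weights follow the common imgaug-style emboss and are an assumption for illustration, not necessarily the exact kernel this transform builds:

```python
import numpy as np

def emboss_kernel(alpha: float, strength: float) -> np.ndarray:
    """Blend an identity (no-op) kernel with a directional emboss kernel.

    Weights are imgaug-style (an assumption, not albumentations' exact kernel).
    """
    identity = np.zeros((3, 3))
    identity[1, 1] = 1.0
    # Negative weights on the top-left, positive on the bottom-right:
    # this asymmetry is what creates the highlight/shadow direction.
    effect = np.array([
        [-1 - strength, -strength, 0.0],
        [-strength, 1.0, strength],
        [0.0, strength, 1 + strength],
    ])
    return (1 - alpha) * identity + alpha * effect
```

A useful property of this construction: the effect kernel's weights sum to 1 for any strength, so the blended kernel also sums to 1 and overall brightness is preserved.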

References

  • Image Embossing: https://en.wikipedia.org/wiki/Image_embossing
  • Application of Emboss Filtering in Image Processing: https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing

Halftoneclass

Halftone(
    dot_size_range: tuple[int, int] = (4, 10),
    blend_range: tuple[float, float] = (0.0, 0.5),
    p: float = 0.5
)

Halftone dot pattern (printing-style). Continuous tones become dots of varying size. Use for vintage or print-aesthetic augmentation. Simulates halftone printing: a grid of cells, each drawn as a filled circle whose size is proportional to mean luminance in that cell. Larger dots = brighter, smaller = darker. Optional blend with the original image controls strength.

Parameters

NameTypeDefaultDescription
dot_size_rangetuple[int, int](4, 10)Range for grid cell size in pixels. Larger = coarser pattern. Default: (4, 10).
blend_rangetuple[float, float](0.0, 0.5)Blend with original: 0 = pure halftone, 1 = original. Default: (0.0, 0.5).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> transform = A.Halftone(dot_size_range=(4, 8), blend_range=(0.0, 0.3), p=1.0)
>>> result = transform(image=image)["image"]

Notes

- Mean luminance per grid cell drives dot radius; cell color comes from the original image.
- Dot size is proportional to luminance (bright → large dot, dark → small dot).
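The cell-to-dot mapping described in the Notes can be sketched on a grayscale image. This is a toy version of the idea (black-and-white output, square grid), not the library's implementation:

```python
import numpy as np

def halftone_gray(gray: np.ndarray, cell: int = 8) -> np.ndarray:
    """Toy halftone: per cell x cell block, draw a centered disc whose
    radius grows with the block's mean luminance (bright -> big dot)."""
    h, w = gray.shape
    out = np.zeros_like(gray)
    # Distance of each in-cell pixel from the cell center
    yy, xx = np.mgrid[0:cell, 0:cell]
    center = (cell - 1) / 2.0
    dist = np.sqrt((yy - center) ** 2 + (xx - center) ** 2)
    max_r = cell / 2.0
    for by in range(0, h - cell + 1, cell):
        for bx in range(0, w - cell + 1, cell):
            mean = gray[by:by + cell, bx:bx + cell].mean() / 255.0
            r = max_r * np.sqrt(mean)  # sqrt so dot *area* tracks luminance
            out[by:by + cell, bx:bx + cell] = np.where(dist <= r, 255, 0)
    return out
```

A fully dark block produces no dot at all, while a fully bright block produces a disc that nearly fills the cell - the varying dot sizes are what simulate intermediate tones from a distance.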

InvertImgclass

InvertImg(
    p: float = 0.5
)

Invert the input image by subtracting pixel values from the maximum value of the image type, i.e., 255 for uint8 and 1.0 for float32.

Parameters

NameTypeDefaultDescription
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create a sample image with different elements
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> cv2.circle(image, (30, 30), 20, (255, 255, 255), -1)  # White circle
>>> cv2.rectangle(image, (60, 60), (90, 90), (128, 128, 128), -1)  # Gray rectangle
>>>
>>> # Apply InvertImg transform
>>> transform = A.InvertImg(p=1.0)
>>> result = transform(image=image)
>>> inverted_image = result['image']
>>>
>>> # Result:
>>> # - Black background becomes white (0 → 255)
>>> # - White circle becomes black (255 → 0)
>>> # - Gray rectangle is inverted (128 → 127)
>>> # The same approach works for float32 images (0-1 range) and grayscale images
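The inversion rule described above reduces to a single subtraction; a minimal check in numpy:

```python
import numpy as np

# uint8 rule: maximum value (255) minus each pixel
image = np.array([[0, 128, 255]], dtype=np.uint8)
inverted = 255 - image

# float32 images in [0, 1] invert the same way, with 1.0 as the maximum
float_img = np.array([[0.0, 0.25, 1.0]], dtype=np.float32)
float_inverted = 1.0 - float_img
```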

LensFlareclass

LensFlare(
    flare_roi: tuple[float, float, float, float] = (0, 0, 1, 0.5),
    num_ghosts_range: tuple[int, int] = (3, 7),
    intensity_range: tuple[float, float] = (0.3, 0.7),
    num_rays_range: tuple[int, int] = (4, 8),
    bloom_range: tuple[float, float] = (0.01, 0.05),
    p: float = 0.5
)

Add lens flare: starburst rays and ghost reflections from a bright source. Use for outdoor or backlit robustness and optical-artifact simulation. A flare center is chosen in a configurable region; starburst rays and mirrored ghost circles are drawn toward the image center. Strength and blur are configurable.

Parameters

NameTypeDefaultDescription
flare_roituple[float, float, float, float](0, 0, 1, 0.5)Region of interest for flare source placement as (x_min, y_min, x_max, y_max) in normalized [0, 1] coords. Default: (0, 0, 1, 0.5).
num_ghosts_rangetuple[int, int](3, 7)Range for number of ghost reflections. Default: (3, 7).
intensity_rangetuple[float, float](0.3, 0.7)Range for overall flare brightness. Default: (0.3, 0.7).
num_rays_rangetuple[int, int](4, 8)Range for number of starburst rays. Default: (4, 8).
bloom_rangetuple[float, float](0.01, 0.05)Range for bloom blur radius as fraction of image diagonal. Default: (0.01, 0.05).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.LensFlare(intensity_range=(0.3, 0.6), p=1.0)
>>> result = transform(image=image)["image"]

Notes

- Ghost reflections lie along the line from flare source to image center.
- Size decreases and color shifts with distance from the source.
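The ghost-placement geometry in the Notes can be illustrated with a small helper. The even spacing and the 0.3-1.7 fraction range here are assumptions for the sketch, not the library's actual sampling:

```python
import numpy as np

def ghost_centers(source, center, n):
    """Place n ghost reflections along the source->center line,
    continuing past the center (a common lens-flare layout)."""
    source = np.asarray(source, dtype=float)
    center = np.asarray(center, dtype=float)
    # Fractions of the way from the source toward (and beyond) the center
    ts = np.linspace(0.3, 1.7, n)
    return [tuple(source + t * (center - source)) for t in ts]
```

All ghosts are collinear with the source and the image center, which is why real lens flares appear as a string of circles crossing the frame.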

Normalizeclass

Normalize(
    mean: tuple[float, ...] | float | None = (0.485, 0.456, 0.406),
    std: tuple[float, ...] | float | None = (0.229, 0.224, 0.225),
    max_pixel_value: float | None = 255.0,
    normalization: 'standard' | 'image' | 'image_per_channel' | 'min_max' | 'min_max_per_channel' = 'standard',
    p: float = 1.0
)

Applies various normalization techniques to an image. The specific normalization technique can be selected with the `normalization` parameter. Standard normalization is applied using the formula: `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`. Other normalization techniques adjust the image based on global or per-channel statistics, or scale pixel values to a specified range.

Parameters

NameTypeDefaultDescription
mean
One of:
  • tuple[float, ...]
  • float
  • None
(0.485, 0.456, 0.406)Mean values for standard normalization. For "standard" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).
std
One of:
  • tuple[float, ...]
  • float
  • None
(0.229, 0.224, 0.225)Standard deviation values for standard normalization. For "standard" normalization, the default values are ImageNet standard deviation values: (0.229, 0.224, 0.225).
max_pixel_value
One of:
  • float
  • None
255.0Maximum possible pixel value, used for scaling in standard normalization. Defaults to 255.0.
normalization
One of:
  • 'standard'
  • 'image'
  • 'image_per_channel'
  • 'min_max'
  • 'min_max_per_channel'
standardSpecifies the normalization technique to apply. Defaults to "standard". - "standard": Applies the formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`. The default mean and std are based on ImageNet. You can use mean and std values of (0.5, 0.5, 0.5) for inception normalization. And mean values of (0, 0, 0) and std values of (1, 1, 1) for YOLO. - "image": Normalizes the whole image based on its global mean and standard deviation. - "image_per_channel": Normalizes the image per channel based on each channel's mean and standard deviation. - "min_max": Scales the image pixel values to a [0, 1] range based on the global minimum and maximum pixel values. - "min_max_per_channel": Scales each channel of the image pixel values to a [0, 1] range based on the per-channel minimum and maximum pixel values.
pfloat1.0Probability of applying the transform. Defaults to 1.0.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Standard ImageNet normalization
>>> transform = A.Normalize(
...     mean=(0.485, 0.456, 0.406),
...     std=(0.229, 0.224, 0.225),
...     max_pixel_value=255.0,
...     p=1.0
... )
>>> normalized_image = transform(image=image)["image"]
>>>
>>> # Min-max normalization
>>> transform_minmax = A.Normalize(normalization="min_max", p=1.0)
>>> normalized_image_minmax = transform_minmax(image=image)["image"]

Notes

- For "standard" normalization, `mean`, `std`, and `max_pixel_value` must be provided.
- For other normalization types, these parameters are ignored.
- For inception normalization, use mean values of (0.5, 0.5, 0.5).
- For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).
- This transform is often used as a final step in image preprocessing pipelines to prepare images for neural network input.
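The "standard" formula from the description can be written out directly in numpy; it is algebraically the same as the familiar two-step "scale to [0, 1], then standardize" recipe:

```python
import numpy as np

image = np.random.randint(0, 256, (4, 4, 3)).astype(np.float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
max_pixel_value = 255.0

# The "standard" formula as stated in the description
normalized = (image - mean * max_pixel_value) / (std * max_pixel_value)

# Equivalent two-step view: scale to [0, 1], then standardize per channel
also = (image / max_pixel_value - mean) / std
```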

References

  • ImageNet mean and std: https://pytorch.org/vision/stable/models.html
  • Inception preprocessing: https://keras.io/api/applications/inceptionv3/

RingingOvershootclass

RingingOvershoot(
    blur_limit: tuple[int, int] | int = (7, 15),
    cutoff: tuple[float, float] = (0.7853981633974483, 1.5707963267948966),
    p: float = 0.5
)

Create ringing or overshoot artifacts via 2D sinc convolution. blur_limit and cutoff control strength. Simulates sharpening or compression artifacts. This transform simulates the ringing artifacts that can occur in digital image processing, particularly after sharpening or edge enhancement operations. It creates oscillations or overshoots near sharp transitions in the image.

Parameters

NameTypeDefaultDescription
blur_limit
One of:
  • tuple[int, int]
  • int
(7, 15)Maximum kernel size for the sinc filter. Must be an odd number in the range [3, inf). If a single int is provided, the kernel size will be randomly chosen from the range (3, blur_limit). If a tuple (min, max) is provided, the kernel size will be randomly chosen from the range (min, max). Default: (7, 15).
cutofftuple[float, float](0.7853981633974483, 1.5707963267948966)Range to choose the cutoff frequency in radians. Values should be in the range (0, π). A lower cutoff frequency will result in more pronounced ringing effects. Default: (π/4, π/2).
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

>>> # Apply ringing effect with default parameters
>>> transform = A.RingingOvershoot(p=1.0)
>>> ringing_image = transform(image=image)['image']
>>>
>>> # Apply ringing effect with custom parameters
>>> transform = A.RingingOvershoot(
...     blur_limit=(9, 17),
...     cutoff=(np.pi/6, np.pi/3),
...     p=1.0
... )
>>> ringing_image = transform(image=image)['image']

Notes

- Ringing artifacts are oscillations of the image intensity function in the neighborhood of sharp transitions, such as edges or object boundaries.
- This transform uses a 2D sinc filter (also known as a 2D cardinal sine function) to introduce these artifacts.
- The severity of the ringing effect is controlled by both the kernel size (blur_limit) and the cutoff frequency.
- Larger kernel sizes and lower cutoff frequencies will generally produce more noticeable ringing effects.
- This transform can be useful for:
  * Simulating imperfections in image processing or transmission systems
  * Testing the robustness of computer vision models to ringing artifacts
  * Creating artistic effects that emphasize edges and transitions in images
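The mechanism is easiest to see in one dimension. Below, a truncated ideal low-pass (sinc) kernel stands in for the transform's 2D circular sinc filter - a deliberate simplification. Convolving it with a step edge produces the characteristic over- and undershoot:

```python
import numpy as np

def sinc_kernel_1d(size: int, cutoff: float) -> np.ndarray:
    """Truncated ideal low-pass kernel; cutoff is in radians (0, pi).

    np.sinc(x) is sin(pi*x)/(pi*x), so h[n] = sin(cutoff*n)/(pi*n)."""
    n = np.arange(size) - (size - 1) / 2
    k = (cutoff / np.pi) * np.sinc(n * cutoff / np.pi)
    return k / k.sum()  # normalize so flat regions are unchanged

# Sharp truncation of the sinc leaves negative side lobes, so the
# step response oscillates (Gibbs phenomenon) instead of rising cleanly.
edge = np.concatenate([np.zeros(20), np.ones(20)])
ringing = np.convolve(edge, sinc_kernel_1d(15, np.pi / 3), mode="same")
```

The overshoot above 1 and undershoot below 0 near the transition are exactly the "ringing" this transform adds around image edges.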

References

  • Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts
  • Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter
  • Digital Image Processing: Rafael C. Gonzalez and Richard E. Woods, 4th Edition

Sharpenclass

Sharpen(
    alpha: tuple[float, float] = (0.2, 0.5),
    lightness: tuple[float, float] = (0.5, 1.0),
    method: 'kernel' | 'gaussian' = 'kernel',
    kernel_size: int = 5,
    sigma: float = 1.0,
    p: float = 0.5
)

Sharpen the image via kernel or Gaussian unsharp method. alpha and lightness control strength. Enhances edges; useful for document or detail-sensitive tasks. Implements two different approaches to image sharpening: 1. Traditional kernel-based method using Laplacian operator 2. Gaussian interpolation method (similar to Kornia's approach)

Parameters

NameTypeDefaultDescription
alphatuple[float, float](0.2, 0.5)Range for the visibility of sharpening effect. At 0, only the original image is visible, at 1.0 only its processed version is visible. Values should be in the range [0, 1]. Used in both methods. Default: (0.2, 0.5).
lightnesstuple[float, float](0.5, 1.0)Range for the lightness of the sharpened image. Only used in 'kernel' method. Larger values create higher contrast. Values should be greater than 0. Default: (0.5, 1.0).
method
One of:
  • 'kernel'
  • 'gaussian'
kernelSharpening algorithm to use: - 'kernel': Traditional kernel-based sharpening using Laplacian operator - 'gaussian': Interpolation between Gaussian blurred and original image Default: 'kernel'
kernel_sizeint5Size of the Gaussian blur kernel for 'gaussian' method. Must be odd. Default: 5
sigmafloat1.0Standard deviation for Gaussian kernel in 'gaussian' method. Default: 1.0
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import albumentations as A
>>> import numpy as np

>>> # Traditional kernel sharpening
>>> transform = A.Sharpen(
...     alpha=(0.2, 0.5),
...     lightness=(0.5, 1.0),
...     method='kernel',
...     p=1.0
... )
>>>
>>> # Gaussian interpolation sharpening
>>> transform = A.Sharpen(
...     alpha=(0.5, 1.0),
...     method='gaussian',
...     kernel_size=5,
...     sigma=1.0,
...     p=1.0
... )

Notes

- Kernel sizes must be odd to maintain spatial alignment.
- Methods produce different visual results:
  * Kernel method: more pronounced edges, possible artifacts
  * Gaussian method: more natural look, limited to original sharpness
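The 'kernel' method's blend of a no-op kernel with a Laplacian-style kernel can be sketched as follows; the effect-kernel weights are the common imgaug-style construction and are an assumption, not necessarily the exact weights this transform uses:

```python
import numpy as np

def sharpen_kernel(alpha: float, lightness: float) -> np.ndarray:
    """Blend an identity kernel with a Laplacian-style sharpening kernel.

    imgaug-style weights (an assumption about the exact construction)."""
    identity = np.zeros((3, 3))
    identity[1, 1] = 1.0
    # All-negative surround with a strong center: amplifies differences
    # between a pixel and its neighborhood (i.e., edges).
    effect = np.array([
        [-1.0, -1.0, -1.0],
        [-1.0, 8.0 + lightness, -1.0],
        [-1.0, -1.0, -1.0],
    ])
    return (1 - alpha) * identity + alpha * effect
```

The blended kernel's sum is `(1 - alpha) + alpha * lightness`, so with lightness = 1 overall brightness is preserved, while lightness above or below 1 brightens or darkens the sharpened regions.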

References

  • R. C. Gonzalez and R. E. Woods, "Digital Image Processing" (4th Edition), Chapter 3: Intensity Transformations and Spatial Filtering
  • J. C. Russ, "The Image Processing Handbook" (7th Edition), Chapter 4: Image Enhancement
  • T. Acharya and A. K. Ray, "Image Processing: Principles and Applications", Chapter 5: Image Enhancement
  • Unsharp masking: https://en.wikipedia.org/wiki/Unsharp_masking
  • Laplacian operator: https://en.wikipedia.org/wiki/Laplace_operator
  • Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur

Superpixelsclass

Superpixels(
    p_replace: tuple[float, float] | float = (0, 0.1),
    n_segments: tuple[int, int] | int = (100, 100),
    max_size: int | None = 128,
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    p: float = 0.5
)

Replace image with superpixel segmentation (SLIC). p_replace, n_segments, max_size control fraction and segment count. Reduces fine texture.

Parameters

NameTypeDefaultDescription
p_replace
One of:
  • tuple[float, float]
  • float
(0, 0.1)Defines for any segment the probability that the pixels within that segment are replaced by their average color (otherwise, the pixels are not changed). * A probability of `0.0` would mean that the pixels in no segment are replaced by their average color (the image is not changed at all). * A probability of `0.5` would mean that around half of all segments are replaced by their average color. * A probability of `1.0` would mean that all segments are replaced by their average color (resulting in a voronoi image). Behavior based on chosen data types for this parameter: * If a `float`, then that `float` will always be used. * If `tuple` `(a, b)`, then a random probability will be sampled from the interval `[a, b]` per image. Default: (0, 0.1)
n_segments
One of:
  • tuple[int, int]
  • int
(100, 100)Rough target number of how many superpixels to generate. The algorithm may deviate from this number. Lower values will lead to coarser superpixels. Higher values are computationally more intensive and will hence lead to a slowdown. If tuple `(a, b)`, then a value from the discrete interval `[a..b]` will be sampled per image. Default: (100, 100)
max_size
One of:
  • int
  • None
128Maximum image size at which the augmentation is performed. If the width or height of an image exceeds this value, it will be downscaled before the augmentation so that the longest side matches `max_size`. This is done to speed up the process. The final output image has the same size as the input image. Note that in case `p_replace` is below `1.0`, the down-/upscaling will affect the not-replaced pixels too. Use `None` to apply no down-/upscaling. Default: 128
interpolation
One of:
  • 0
  • 6
  • 1
  • 2
  • 3
  • 4
  • 5
1Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT, cv2.INTER_NEAREST_EXACT. Default: cv2.INTER_LINEAR.
pfloat0.5Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Apply superpixels with default parameters
>>> transform = A.Superpixels(p=1.0)
>>> augmented_image = transform(image=image)['image']
>>>
>>> # Apply superpixels with custom parameters
>>> transform = A.Superpixels(
...     p_replace=(0.5, 0.7),
...     n_segments=(50, 100),
...     max_size=None,
...     interpolation=cv2.INTER_NEAREST,
...     p=1.0
... )
>>> augmented_image = transform(image=image)['image']

Notes

- This transform can significantly change the visual appearance of the image.
- The transform makes use of a superpixel algorithm, which tends to be slow. If performance is a concern, consider using `max_size` to limit the image size.
- The effect of this transform can vary greatly depending on the `p_replace` and `n_segments` parameters.
- When `p_replace` is high, the image can become highly abstracted, resembling a voronoi diagram.
- The transform preserves the original image type (uint8 or float32).
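The replacement half of the transform - choosing segments with probability `p_replace` and flattening each chosen segment to its mean color - can be sketched given a precomputed label map (the SLIC segmentation itself is omitted here):

```python
import numpy as np

def replace_segments(image, labels, p_replace, rng=None):
    """Replace a random subset of segments by their mean color.

    `labels` is an (H, W) integer segmentation map, e.g. from SLIC."""
    rng = np.random.default_rng(rng)
    out = image.astype(np.float64).copy()
    for seg in np.unique(labels):
        if rng.random() < p_replace:
            mask = labels == seg
            out[mask] = out[mask].mean(axis=0)  # flatten to mean color
    return out.astype(image.dtype)
```

With `p_replace=1.0` every segment becomes a flat patch of its average color, which is the voronoi-like look the Notes describe.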

UnsharpMaskclass

UnsharpMask(
    blur_limit: tuple[int, int] | int = (3, 7),
    sigma_limit: tuple[float, float] | float = 0.0,
    alpha: tuple[float, float] | float = (0.2, 0.5),
    threshold: int = 10,
    p: float = 0.5
)

Sharpen via unsharp masking: blur, subtract, add back. blur_limit, sigma_limit, alpha control strength. Luminance unchanged; edges enhanced. Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased sharpness. This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask which is combined with the original image to enhance edges and fine details.

Parameters

NameTypeDefaultDescription
blur_limit
One of:
  • tuple[int, int]
  • int
(3, 7)Maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`. If a single value is provided, `blur_limit` will be in the range (0, blur_limit). Default: (3, 7).
sigma_limit
One of:
  • tuple[float, float]
  • float
0.0Gaussian kernel standard deviation. Must be greater than or equal to 0. If a single value is provided, `sigma_limit` will be in the range (0, sigma_limit). If set to 0, sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. Default: 0.
alpha
One of:
  • tuple[float, float]
  • float
(0.2, 0.5)range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).
thresholdint10Value to limit sharpening only for areas with high pixel difference between original image and it's smoothed version. Higher threshold means less sharpening on flat areas. Must be in range [0, 255]. Default: 10.
pfloat0.5probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Apply UnsharpMask with default parameters
>>> transform = A.UnsharpMask(p=1.0)
>>> sharpened_image = transform(image=image)['image']
>>>
>>> # Apply UnsharpMask with custom parameters
>>> transform = A.UnsharpMask(
...     blur_limit=(3, 7),
...     sigma_limit=(0.1, 0.5),
...     alpha=(0.2, 0.7),
...     threshold=15,
...     p=1.0
... )
>>> sharpened_image = transform(image=image)['image']

Notes

- The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian blurred version.
- The final image is computed as: output = I + M if |I - G| > threshold, else I.
- Higher alpha values increase the strength of the sharpening effect.
- Higher threshold values limit the sharpening effect to areas with more significant edges or details.
- The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.
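The formula in the Notes can be sketched directly on a 1D signal, with a 3-tap box blur standing in for the Gaussian (a simplification for readability, not the library's implementation):

```python
import numpy as np

def unsharp_1d(signal, alpha=0.5, threshold=10):
    """Unsharp masking per the Notes: M = (I - G) * alpha,
    output = I + M where |I - G| > threshold, else I."""
    blurred = np.convolve(signal, np.ones(3) / 3, mode="same")
    mask = (signal - blurred) * alpha
    diff = np.abs(signal - blurred)
    out = np.where(diff > threshold, signal + mask, signal)
    return np.clip(out, 0, 255)

flat = np.full(10, 100.0)                               # no detail
edge = np.concatenate([np.zeros(10), np.full(10, 200.0)])  # one step edge
sharpened = unsharp_1d(edge)
```

Flat regions fall below the threshold and pass through untouched, while pixels adjacent to the step are pushed further from the local average - that contrast boost at edges is the perceived "sharpening".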

References

  • Unsharp Masking: https://en.wikipedia.org/wiki/Unsharp_masking