albumentations.augmentations.pixel.transforms

Members

class Dithering
class Emboss
class Halftone
class InvertImg
class LensFlare
class Normalize
class RingingOvershoot
class Sharpen
class Superpixels
class UnsharpMask

Dithering class

Dithering(
    method: 'random' | 'ordered' | 'error_diffusion' = error_diffusion,
    n_colors: int = 2,
    color_mode: 'grayscale' | 'per_channel' = grayscale,
    error_diffusion_algorithm: 'floyd_steinberg' | 'jarvis' | 'stucki' | 'atkinson' | 'burkes' | 'sierra' | 'sierra_2row' | 'sierra_lite' = floyd_steinberg,
    bayer_matrix_size: 2 | 4 | 8 | 16 = 4,
    serpentine: bool = False,
    noise_range: tuple[float, float] = (-0.5, 0.5),
    p: float = 0.5
)

Reduce the number of colors via dithering: ordered (Bayer), error diffusion, or random. The main controls are n_colors and method. Good for a retro look or limited-color output. Dithering is like creating a newspaper photo: it uses patterns of dots to create the illusion of more colors than are actually present. When you have a limited color palette (like only black and white), dithering arranges these limited colors in patterns that trick your eye into seeing intermediate shades. Think of pointillist paintings: up close you see individual dots, but from a distance they blend together to create smooth gradients and subtle color variations. This transform works with any number of channels: single-channel grayscale, standard RGB (3 channels), RGBA with transparency (4 channels), or multispectral satellite imagery (dozens of channels). With color_mode="per_channel" each channel is dithered independently; with the default color_mode="grayscale" the image is first converted to grayscale, dithered, and then expanded back to the original number of channels.

Parameters

Name	Type	Default	Description
method	One of: 'random' 'ordered' 'error_diffusion'	error_diffusion	Which dithering algorithm to use. Each has different characteristics: - "random": Adds random noise before quantization. Creates a grainy, film-like texture. Good for artistic effects or simulating old photographs. - "ordered": Uses a repeating pattern (Bayer matrix) to decide which pixels to darken. Creates distinctive crosshatch patterns. Fast and predictable. Common in old computer graphics and newspaper printing. - "error_diffusion": Most sophisticated method. When a pixel is made darker or lighter than it should be, the "error" is spread to neighboring pixels. Creates the most natural-looking results. Like using a fine brush. Default: "error_diffusion"
n_colors	int	2	How many different color levels to keep per channel. Must be between 2 and 256. - 2 = only black and white (or min/max values for each channel) - 4 = 4 levels of gray (or 4 levels per color channel) - 16 = 16 shades, creating a retro computer graphics look - 256 = full range, no reduction (but patterns still visible from dithering process) Lower values create more dramatic effects. Default: 2
color_mode	One of: 'grayscale' 'per_channel'	grayscale	How to handle color channels: - "per_channel": Each color channel (R, G, B, etc.) is dithered separately. Maintains color relationships but each channel gets its own pattern. Works with any number of channels. - "grayscale": First converts the image to grayscale (using standard luminance weights), then applies dithering, then expands back to the original number of channels. All color information is lost, but the dithering pattern is consistent across channels. Default: "grayscale"
error_diffusion_algorithm	One of: 'floyd_steinberg' 'jarvis' 'stucki' 'atkinson' 'burkes' 'sierra' 'sierra_2row' 'sierra_lite'	floyd_steinberg	Used only in "error_diffusion" method. Which specific algorithm: - "floyd_steinberg": The classic, invented in 1976. Spreads error to 4 neighbors. Good balance of quality and speed. Industry standard. - "jarvis": Jarvis-Judice-Ninke algorithm. Spreads error to 12 neighbors. Higher quality but 3x slower than Floyd-Steinberg. - "stucki": Similar to Jarvis but with different weights. Also 12 neighbors. - "atkinson": Created by Bill Atkinson for original Macintosh. Only spreads 75% of error, creating lighter images with more contrast. - "burkes": Spreads to 7 neighbors. Faster than Jarvis, better than Floyd-Steinberg. - "sierra": Spreads to 10 neighbors. Good quality, moderate speed. - "sierra_2row": Simplified Sierra using only 2 rows. Faster. - "sierra_lite": Minimal Sierra using only 3 neighbors. Very fast. Default: "floyd_steinberg"
bayer_matrix_size	One of: 2 4 8 16	4	Used only in "ordered" method. The size of the repeating pattern (2, 4, 8, or 16). - 2x2: Very visible checkerboard pattern - 4x4: Standard, good balance - 8x8: Finer pattern, less visible - 16x16: Very fine pattern, almost noise-like Default: 4
serpentine	bool	False	Used only in "error_diffusion" method. Whether to process rows in alternating directions (left-to-right, then right-to-left). This can reduce visible "worm" artifacts that sometimes appear as diagonal lines. Slightly slower. Default: False
noise_range	tuple[float, float]	(-0.5, 0.5)	Used only in "random" method. How much random noise to add before quantization. Larger range = more variation in the dithering pattern. Range: (-1.0, 1.0). Default: (-0.5, 0.5)
p	float	0.5	Probability of applying this transform. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> # Prepare sample data
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Black and white dithering with Floyd-Steinberg
>>> transform = A.Compose([
...     A.Dithering(
...         method="error_diffusion",
...         n_colors=2,
...         error_diffusion_algorithm="floyd_steinberg",
...         color_mode="grayscale",
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Black and white dithered image
>>>
>>> # Ordered dithering with 16 colors per channel
>>> transform = A.Compose([
...     A.Dithering(
...         method="ordered",
...         n_colors=16,
...         bayer_matrix_size=8,
...         color_mode="per_channel",
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Reduced color depth with Bayer pattern
>>>
>>> # Random dithering
>>> transform = A.Compose([
...     A.Dithering(
...         method="random",
...         n_colors=4,
...         noise_range=(-0.3, 0.3),
...         p=1.0
...     )
... ])
>>> transformed = transform(image=image)
>>> dithered_image = transformed['image']  # Noisy dithered appearance
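
To make the "error_diffusion" description more concrete, here is a minimal NumPy sketch of Floyd-Steinberg dithering using the standard 7/16, 3/16, 5/16, 1/16 weights. It is an illustration only, not the library's implementation: the function name `floyd_steinberg`, the float [0, 1] input range, and the evenly spaced quantization levels are assumptions made for this example.

```python
import numpy as np


def floyd_steinberg(gray: np.ndarray, n_colors: int = 2) -> np.ndarray:
    """Dither a 2D float image in [0, 1] down to n_colors evenly spaced levels.

    Hypothetical helper for illustration; not part of albumentations.
    """
    work = gray.astype(np.float64).copy()
    height, width = work.shape
    steps = n_colors - 1
    for y in range(height):
        for x in range(width):
            old = work[y, x]
            # Quantize to the nearest allowed level.
            new = np.clip(np.round(old * steps), 0, steps) / steps
            work[y, x] = new
            err = old - new
            # Spread the quantization error to the 4 unvisited neighbors.
            if x + 1 < width:
                work[y, x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1, x - 1] += err * 3 / 16
                work[y + 1, x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1, x + 1] += err * 1 / 16
    return work


# A horizontal gradient reduced to pure black and white.
gradient = np.tile(np.linspace(0.0, 1.0, 64), (32, 1))
dithered = floyd_steinberg(gradient, n_colors=2)
```

Because the error is pushed to neighbors rather than discarded, the average brightness of `dithered` stays close to that of `gradient` even though every pixel is either 0 or 1.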

References

- Wikipedia: https://en.wikipedia.org/wiki/Dither
- Floyd-Steinberg dithering: https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dithering
- Ordered dithering: https://en.wikipedia.org/wiki/Ordered_dithering
- Error diffusion dithering: https://en.wikipedia.org/wiki/Error_diffusion

Emboss class

Emboss(
    alpha: tuple[float, float] = (0.2, 0.5),
    strength: tuple[float, float] = (0.2, 0.7),
    p: float = 0.5
)

Apply an emboss effect (directional highlight and shadow). The strength parameter controls intensity. Produces a pseudo-3D look, useful for texture or style augmentation. The transform works by applying a convolution kernel that emphasizes differences in adjacent pixel values, which highlights edges and creates a 3D-like texture in the image.

Parameters

Name	Type	Default	Description
alpha	tuple[float, float]	(0.2, 0.5)	Range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Values should be in the range [0, 1]. Alpha will be randomly selected from this range for each image. Default: (0.2, 0.5)
strength	tuple[float, float]	(0.2, 0.7)	Range to choose the strength of the embossing effect. Higher values create a more pronounced 3D effect. Values should be non-negative. Strength will be randomly selected from this range for each image. Default: (0.2, 0.7)
p	float	0.5	Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=1.0)  # p=1.0 so the effect is always applied
>>> result = transform(image=image)
>>> embossed_image = result['image']

Notes

- The emboss effect is created using a 3x3 convolution kernel.
- The 'alpha' parameter controls the blend between the original image and the embossed version. A higher alpha value will result in a more pronounced emboss effect.
- The 'strength' parameter affects the intensity of the embossing. Higher strength values will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.
- This transform can be useful for creating artistic effects or for data augmentation in tasks where edge information is important.
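
The interplay of alpha and strength in a 3x3 emboss kernel can be sketched as follows. The kernel weights below are one commonly used emboss formulation and are an assumption for illustration; the exact kernel used by the library may differ. The helpers `emboss_kernel` and `filter3x3` are hypothetical names.

```python
import numpy as np


def emboss_kernel(alpha: float, strength: float) -> np.ndarray:
    """Blend an identity kernel with a directional emboss kernel (assumed weights)."""
    identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float64)
    effect = np.array(
        [
            [-1 - strength, -strength, 0],
            [-strength, 1, strength],
            [0, strength, 1 + strength],
        ],
        dtype=np.float64,
    )
    # alpha = 0 keeps the original image, alpha = 1 is the full emboss effect.
    return (1 - alpha) * identity + alpha * effect


def filter3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply a 3x3 kernel (correlation) to a 2D image with reflected borders."""
    padded = np.pad(image, ((1, 1), (1, 1)), mode="reflect")
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out


flat = np.full((8, 8), 100.0)
edge = np.zeros((8, 8))
edge[:, 4:] = 255.0

kernel = emboss_kernel(alpha=0.5, strength=0.5)
embossed_flat = filter3x3(flat, kernel)  # flat regions are left unchanged
embossed_edge = filter3x3(edge, kernel)  # values change only near the edge
```

In this formulation the kernel weights always sum to 1, so uniform areas keep their brightness and only intensity differences between neighboring pixels are amplified.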

References

- Image Embossing: https://en.wikipedia.org/wiki/Image_embossing
- Application of Emboss Filtering in Image Processing: https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing

Halftone class

Halftone(
    dot_size_range: tuple[int, int] = (4, 10),
    blend_range: tuple[float, float] = (0.0, 0.5),
    p: float = 0.5
)

Halftone dot pattern (printing-style). Continuous tones become dots of varying size. Use for vintage or print-aesthetic augmentation. Simulates halftone printing: a grid of cells, each drawn as a filled circle whose size is proportional to mean luminance in that cell. Larger dots = brighter, smaller = darker. Optional blend with the original image controls strength.

Parameters

Name	Type	Default	Description
dot_size_range	tuple[int, int]	(4, 10)	Range for grid cell size in pixels. Larger = coarser pattern. Default: (4, 10).
blend_range	tuple[float, float]	(0.0, 0.5)	Blend with original: 0 = pure halftone, 1 = original. Default: (0.0, 0.5).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> transform = A.Halftone(dot_size_range=(4, 8), blend_range=(0.0, 0.3), p=1.0)
>>> result = transform(image=image)["image"]

Notes

- Mean luminance per grid cell drives dot radius; cell color is taken from the original image.
- Dot size is proportional to luminance (bright → large dot, dark → small dot).
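
The relationship between cell luminance and dot radius can be sketched in NumPy as below. This is a simplified grayscale illustration under stated assumptions: the function name `halftone_gray`, the black background, the maximum radius of half a cell, and image dimensions that are multiples of the cell size are all choices made for this example, not details confirmed by the library.

```python
import numpy as np


def halftone_gray(gray: np.ndarray, cell: int = 8) -> np.ndarray:
    """Draw one filled dot per grid cell; dot radius grows with mean luminance.

    Expects a 2D float image in [0, 1] whose sides are multiples of `cell`.
    Hypothetical helper for illustration only.
    """
    height, width = gray.shape
    out = np.zeros((height, width), dtype=np.float64)
    # Distance of every pixel in a cell from the cell center.
    yy, xx = np.mgrid[0:cell, 0:cell]
    center = (cell - 1) / 2
    dist = np.hypot(yy - center, xx - center)
    max_radius = cell / 2
    for y0 in range(0, height, cell):
        for x0 in range(0, width, cell):
            luminance = gray[y0:y0 + cell, x0:x0 + cell].mean()
            radius = luminance * max_radius  # bright -> large dot
            out[y0:y0 + cell, x0:x0 + cell] = np.where(dist <= radius, luminance, 0.0)
    return out


# Left half dark, right half bright.
image = np.zeros((8, 16))
image[:, :8] = 0.3
image[:, 8:] = 0.9
dots = halftone_gray(image, cell=8)
dark_dot_pixels = int(np.count_nonzero(dots[:, :8]))
bright_dot_pixels = int(np.count_nonzero(dots[:, 8:]))
```

Comparing `dark_dot_pixels` with `bright_dot_pixels` shows the brighter cell producing the larger dot, matching the note above.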

InvertImg class

InvertImg(
    p: float = 0.5
)

Invert the input image by subtracting pixel values from the maximum value of the image dtype: 255 for uint8 and 1.0 for float32.

Parameters

Name	Type	Default	Description
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>>
>>> # Create a sample image with different elements
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> cv2.circle(image, (30, 30), 20, (255, 255, 255), -1)  # White circle
>>> cv2.rectangle(image, (60, 60), (90, 90), (128, 128, 128), -1)  # Gray rectangle
>>>
>>> # Apply InvertImg transform
>>> transform = A.InvertImg(p=1.0)
>>> result = transform(image=image)
>>> inverted_image = result['image']
>>>
>>> # Result:
>>> # - Black background becomes white (0 → 255)
>>> # - White circle becomes black (255 → 0)
>>> # - Gray rectangle is inverted (128 → 127)
>>> # The same approach works for float32 images (0-1 range) and grayscale images
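
The dtype-dependent subtraction described above amounts to the following NumPy sketch. The function name `invert` is hypothetical, and only the two dtypes mentioned in the description (uint8 and float32) are handled.

```python
import numpy as np


def invert(image: np.ndarray) -> np.ndarray:
    """Subtract pixel values from the dtype maximum (255 for uint8, 1.0 otherwise)."""
    if image.dtype == np.uint8:
        return 255 - image
    return 1.0 - image


pixels_uint8 = np.array([0, 128, 255], dtype=np.uint8)
pixels_float = np.array([0.0, 0.25, 1.0], dtype=np.float32)

inverted_uint8 = invert(pixels_uint8)
inverted_float = invert(pixels_float)
```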

LensFlare class

LensFlare(
    flare_roi: tuple[float, float, float, float] = (0, 0, 1, 0.5),
    num_ghosts_range: tuple[int, int] = (3, 7),
    intensity_range: tuple[float, float] = (0.3, 0.7),
    num_rays_range: tuple[int, int] = (4, 8),
    bloom_range: tuple[float, float] = (0.01, 0.05),
    p: float = 0.5
)

Add lens flare: starburst rays and ghost reflections from a bright source. Use for outdoor or backlit robustness and optical-artifact simulation. A flare center is chosen in a configurable region; starburst rays and mirrored ghost circles are drawn toward the image center. Strength and blur are configurable.

Parameters

Name	Type	Default	Description
flare_roi	tuple[float, float, float, float]	(0, 0, 1, 0.5)	Region of interest for flare source placement as (x_min, y_min, x_max, y_max) in normalized [0, 1] coords. Default: (0, 0, 1, 0.5).
num_ghosts_range	tuple[int, int]	(3, 7)	Range for number of ghost reflections. Default: (3, 7).
intensity_range	tuple[float, float]	(0.3, 0.7)	Range for overall flare brightness. Default: (0.3, 0.7).
num_rays_range	tuple[int, int]	(4, 8)	Range for number of starburst rays. Default: (4, 8).
bloom_range	tuple[float, float]	(0.01, 0.05)	Range for bloom blur radius as fraction of image diagonal. Default: (0.01, 0.05).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.LensFlare(intensity_range=(0.3, 0.6), p=1.0)
>>> result = transform(image=image)["image"]

Notes

- Ghost reflections lie along the line from flare source to image center.
- Size decreases and color shifts with distance from the source.

Normalize class

Normalize(
    mean: tuple[float, ...] | float | None = (0.485, 0.456, 0.406),
    std: tuple[float, ...] | float | None = (0.229, 0.224, 0.225),
    max_pixel_value: float | None = 255.0,
    normalization: 'standard' | 'image' | 'image_per_channel' | 'min_max' | 'min_max_per_channel' = standard,
    p: float = 1.0
)

Applies various normalization techniques to an image. The specific normalization technique can be selected with the `normalization` parameter. Standard normalization is applied using the formula: `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`. Other normalization techniques adjust the image based on global or per-channel statistics, or scale pixel values to a specified range.

Parameters

Name	Type	Default	Description
mean	One of: tuple[float, ...] float None	(0.485, 0.456, 0.406)	Mean values for standard normalization. For "standard" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).
std	One of: tuple[float, ...] float None	(0.229, 0.224, 0.225)	Standard deviation values for standard normalization. For "standard" normalization, the default values are the ImageNet standard deviations: (0.229, 0.224, 0.225).
max_pixel_value	One of: float None	255.0	Maximum possible pixel value, used for scaling in standard normalization. Defaults to 255.0.
normalization	One of: 'standard' 'image' 'image_per_channel' 'min_max' 'min_max_per_channel'	standard	Specifies the normalization technique to apply. Defaults to "standard". - "standard": Applies the formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`. The default mean and std are based on ImageNet. Use mean and std values of (0.5, 0.5, 0.5) for inception normalization, and mean values of (0, 0, 0) with std values of (1, 1, 1) for YOLO. - "image": Normalizes the whole image based on its global mean and standard deviation. - "image_per_channel": Normalizes the image per channel based on each channel's mean and standard deviation. - "min_max": Scales the image pixel values to a [0, 1] range based on the global minimum and maximum pixel values. - "min_max_per_channel": Scales each channel of the image pixel values to a [0, 1] range based on the per-channel minimum and maximum pixel values.
p	float	1.0	Probability of applying the transform. Defaults to 1.0.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Standard ImageNet normalization
>>> transform = A.Normalize(
...     mean=(0.485, 0.456, 0.406),
...     std=(0.229, 0.224, 0.225),
...     max_pixel_value=255.0,
...     p=1.0
... )
>>> normalized_image = transform(image=image)["image"]
>>>
>>> # Min-max normalization
>>> transform_minmax = A.Normalize(normalization="min_max", p=1.0)
>>> normalized_image_minmax = transform_minmax(image=image)["image"]

Notes

- For "standard" normalization, `mean`, `std`, and `max_pixel_value` must be provided.
- For other normalization types, these parameters are ignored.
- For inception normalization, use mean and std values of (0.5, 0.5, 0.5).
- For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).
- This transform is often used as a final step in image preprocessing pipelines to prepare images for neural network input.
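
As a rough illustration of the formulas, the "standard" and "min_max" modes can be written in NumPy as below. The function names are hypothetical, and the small `eps` added to the min-max denominator is an assumption made here to avoid division by zero on constant images; the library's exact handling may differ.

```python
import numpy as np


def normalize_standard(img, mean, std, max_pixel_value=255.0):
    """(img - mean * max_pixel_value) / (std * max_pixel_value), per channel."""
    mean = np.asarray(mean, dtype=np.float32) * max_pixel_value
    std = np.asarray(std, dtype=np.float32) * max_pixel_value
    return (img.astype(np.float32) - mean) / std


def normalize_min_max(img, eps=1e-4):
    """Scale to roughly [0, 1] using the global minimum and maximum (eps is assumed)."""
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + eps)


imagenet_mean = (0.485, 0.456, 0.406)
imagenet_std = (0.229, 0.224, 0.225)

white = np.full((2, 2, 3), 255, dtype=np.uint8)
standardized = normalize_standard(white, imagenet_mean, imagenet_std)

ramp = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
scaled = normalize_min_max(ramp)
```

For a pure white uint8 image, each channel of `standardized` equals `(1 - mean) / std` for that channel, which is a quick way to sanity-check the scaling by `max_pixel_value`.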

References

- ImageNet mean and std: https://pytorch.org/vision/stable/models.html
- Inception preprocessing: https://keras.io/api/applications/inceptionv3/

RingingOvershoot class

RingingOvershoot(
    blur_limit: tuple[int, int] | int = (7, 15),
    cutoff: tuple[float, float] = (0.7853981633974483, 1.5707963267948966),
    p: float = 0.5
)

Create ringing or overshoot artifacts via 2D sinc convolution. blur_limit and cutoff control strength. Simulates sharpening or compression artifacts. This transform simulates the ringing artifacts that can occur in digital image processing, particularly after sharpening or edge enhancement operations. It creates oscillations or overshoots near sharp transitions in the image.

Parameters

Name	Type	Default	Description
blur_limit	One of: tuple[int, int] int	(7, 15)	Maximum kernel size for the sinc filter. Must be an odd number in the range [3, inf). If a single int is provided, the kernel size will be randomly chosen from the range (3, blur_limit). If a tuple (min, max) is provided, the kernel size will be randomly chosen from the range (min, max). Default: (7, 15).
cutoff	tuple[float, float]	(0.7853981633974483, 1.5707963267948966)	Range to choose the cutoff frequency in radians. Values should be in the range (0, π). A lower cutoff frequency will result in more pronounced ringing effects. Default: (π/4, π/2).
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Apply ringing effect with default parameters
>>> transform = A.RingingOvershoot(p=1.0)
>>> ringing_image = transform(image=image)['image']
>>>
>>> # Apply ringing effect with custom parameters
>>> transform = A.RingingOvershoot(
...     blur_limit=(9, 17),
...     cutoff=(np.pi/6, np.pi/3),
...     p=1.0
... )
>>> ringing_image = transform(image=image)['image']

Notes

- Ringing artifacts are oscillations of the image intensity function in the neighborhood of sharp transitions, such as edges or object boundaries.
- This transform uses a 2D sinc filter (also known as a 2D cardinal sine function) to introduce these artifacts.
- The severity of the ringing effect is controlled by both the kernel size (blur_limit) and the cutoff frequency.
- Larger kernel sizes and lower cutoff frequencies will generally produce more noticeable ringing effects.
- This transform can be useful for:
  * Simulating imperfections in image processing or transmission systems
  * Testing the robustness of computer vision models to ringing artifacts
  * Creating artistic effects that emphasize edges and transitions in images
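
To show why a truncated sinc filter produces overshoot, the sketch below builds a low-pass sinc kernel from a kernel size and a cutoff in radians, then filters a step edge. It uses a separable product of two 1D sinc kernels as a simplification; this construction and the function names are assumptions for illustration, and the library's actual 2D kernel may be built differently.

```python
import numpy as np


def sinc_kernel_1d(ksize: int, cutoff: float) -> np.ndarray:
    """Truncated ideal low-pass filter with cutoff in radians per sample."""
    n = np.arange(ksize) - (ksize - 1) / 2
    # np.sinc(x) is sin(pi x) / (pi x), so this equals sin(cutoff * n) / (pi * n).
    kernel = (cutoff / np.pi) * np.sinc(cutoff * n / np.pi)
    return kernel / kernel.sum()  # normalize so flat regions keep their value


def sinc_kernel_2d(ksize: int, cutoff: float) -> np.ndarray:
    """Separable 2D kernel built from the 1D kernel (a simplifying assumption)."""
    k = sinc_kernel_1d(ksize, cutoff)
    return np.outer(k, k)


kernel = sinc_kernel_2d(ksize=15, cutoff=np.pi / 4)

# Filtering a step edge: the negative side lobes make the output dip below 0
# before the edge and rise above 1 after it.
step = np.concatenate([np.zeros(32), np.ones(32)])
ringing = np.convolve(step, sinc_kernel_1d(15, np.pi / 4), mode="same")
```

The negative lobes of the kernel are what cause the oscillation: near the edge, the filtered signal briefly exceeds the original range on both sides of the transition.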

References

- Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts
- Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter
- Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 4th Edition

Sharpen class

Sharpen(
    alpha: tuple[float, float] = (0.2, 0.5),
    lightness: tuple[float, float] = (0.5, 1.0),
    method: 'kernel' | 'gaussian' = kernel,
    kernel_size: int = 5,
    sigma: float = 1.0,
    p: float = 0.5
)

Sharpen the image via kernel or Gaussian unsharp method. alpha and lightness control strength. Enhances edges; useful for document or detail-sensitive tasks. Implements two different approaches to image sharpening: 1. Traditional kernel-based method using Laplacian operator 2. Gaussian interpolation method (similar to Kornia's approach)

Parameters

Name	Type	Default	Description
alpha	tuple[float, float]	(0.2, 0.5)	Range for the visibility of sharpening effect. At 0, only the original image is visible, at 1.0 only its processed version is visible. Values should be in the range [0, 1]. Used in both methods. Default: (0.2, 0.5).
lightness	tuple[float, float]	(0.5, 1.0)	Range for the lightness of the sharpened image. Only used in 'kernel' method. Larger values create higher contrast. Values should be greater than 0. Default: (0.5, 1.0).
method	One of: 'kernel' 'gaussian'	kernel	Sharpening algorithm to use: - 'kernel': Traditional kernel-based sharpening using Laplacian operator - 'gaussian': Interpolation between Gaussian blurred and original image Default: 'kernel'
kernel_size	int	5	Size of the Gaussian blur kernel for 'gaussian' method. Must be odd. Default: 5
sigma	float	1.0	Standard deviation for Gaussian kernel in 'gaussian' method. Default: 1.0
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

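>>> import albumentations as A
>>> import numpy as np
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Traditional kernel sharpening
>>> transform = A.Sharpen(
...     alpha=(0.2, 0.5),
...     lightness=(0.5, 1.0),
...     method='kernel',
...     p=1.0
... )
>>> sharpened_kernel = transform(image=image)['image']
>>>
>>> # Gaussian interpolation sharpening
>>> transform = A.Sharpen(
...     alpha=(0.5, 1.0),
...     method='gaussian',
...     kernel_size=5,
...     sigma=1.0,
...     p=1.0
... )
>>> sharpened_gaussian = transform(image=image)['image']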

Notes

- Kernel sizes must be odd to maintain spatial alignment.
- Methods produce different visual results:
  * Kernel method: More pronounced edges, possible artifacts
  * Gaussian method: More natural look, limited to original sharpness
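
For the 'kernel' method, the roles of alpha and lightness can be illustrated with a Laplacian-style 3x3 kernel blended with the identity. The specific weights below are a common formulation and are assumed here for illustration; they are not confirmed as the library's exact kernel. `sharpen_kernel` and `filter3x3` are hypothetical helper names.

```python
import numpy as np


def sharpen_kernel(alpha: float, lightness: float) -> np.ndarray:
    """Blend an identity kernel with a Laplacian-style sharpening kernel (assumed weights)."""
    identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float64)
    effect = np.array(
        [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]],
        dtype=np.float64,
    )
    return (1 - alpha) * identity + alpha * effect


def filter3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply a 3x3 kernel to a 2D image with reflected borders."""
    padded = np.pad(image, ((1, 1), (1, 1)), mode="reflect")
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out


flat = np.full((6, 6), 100.0)
edge = np.full((6, 6), 50.0)
edge[:, 3:] = 200.0

# With lightness = 1.0 the kernel sums to 1, so flat regions are preserved.
neutral = filter3x3(flat, sharpen_kernel(alpha=0.5, lightness=1.0))
# With lightness < 1.0 the kernel sums to less than 1, darkening flat regions.
darker = filter3x3(flat, sharpen_kernel(alpha=0.5, lightness=0.5))
# Around an edge, the dark side gets darker and the bright side brighter.
sharpened = filter3x3(edge, sharpen_kernel(alpha=0.5, lightness=1.0))
```

In this formulation the kernel weights sum to `(1 - alpha) + alpha * lightness`, which is why lightness acts as a brightness and contrast control on top of the edge enhancement.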

References

- R. C. Gonzalez and R. E. Woods, "Digital Image Processing (4th Edition)," Chapter 3: Intensity Transformations and Spatial Filtering.
- J. C. Russ, "The Image Processing Handbook (7th Edition)," Chapter 4: Image Enhancement.
- T. Acharya and A. K. Ray, "Image Processing: Principles and Applications," Chapter 5: Image Enhancement.
- Unsharp masking: https://en.wikipedia.org/wiki/Unsharp_masking
- Laplacian operator: https://en.wikipedia.org/wiki/Laplace_operator
- Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur

Superpixels class

Superpixels(
    p_replace: tuple[float, float] | float = (0, 0.1),
    n_segments: tuple[int, int] | int = (100, 100),
    max_size: int | None = 128,
    interpolation: 0 | 6 | 1 | 2 | 3 | 4 | 5 = 1,
    p: float = 0.5
)

Replace image with superpixel segmentation (SLIC). p_replace, n_segments, max_size control fraction and segment count. Reduces fine texture.

Parameters

Name	Type	Default	Description
p_replace	One of: tuple[float, float] float	(0, 0.1)	Defines for any segment the probability that the pixels within that segment are replaced by their average color (otherwise, the pixels are not changed). * A probability of `0.0` means that no segment is replaced by its average color (the image is not changed at all). * A probability of `0.5` means that around half of all segments are replaced by their average color. * A probability of `1.0` means that all segments are replaced by their average color (resulting in a Voronoi image). Behavior based on chosen data types for this parameter: * If a `float`, then that `float` will always be used. * If `tuple` `(a, b)`, then a random probability will be sampled from the interval `[a, b]` per image. Default: (0, 0.1)
n_segments	One of: tuple[int, int] int	(100, 100)	Rough target number of superpixels to generate. The algorithm may deviate from this number. Lower values lead to coarser superpixels. Higher values are computationally more intensive and therefore slower. If tuple `(a, b)`, then a value from the discrete interval `[a..b]` will be sampled per image. Default: (100, 100)
max_size	One of: int None	128	Maximum image size at which the augmentation is performed. If the width or height of an image exceeds this value, it will be downscaled before the augmentation so that the longest side matches `max_size`. This is done to speed up the process. The final output image has the same size as the input image. Note that in case `p_replace` is below `1.0`, the down-/upscaling will affect the not-replaced pixels too. Use `None` to apply no down-/upscaling. Default: 128
interpolation	One of: 0 6 1 2 3 4 5	1	Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Apply superpixels with default parameters
>>> transform = A.Superpixels(p=1.0)
>>> augmented_image = transform(image=image)['image']
>>>
>>> # Apply superpixels with custom parameters
>>> transform = A.Superpixels(
...     p_replace=(0.5, 0.7),
...     n_segments=(50, 100),
...     max_size=None,
...     interpolation=cv2.INTER_NEAREST,
...     p=1.0
... )
>>> augmented_image = transform(image=image)['image']

Notes

- This transform can significantly change the visual appearance of the image.
- The transform makes use of a superpixel algorithm, which tends to be slow. If performance is a concern, consider using `max_size` to limit the image size.
- The effect of this transform can vary greatly depending on the `p_replace` and `n_segments` parameters.
- When `p_replace` is high, the image can become highly abstracted, resembling a Voronoi diagram.
- The transform preserves the original image type (uint8 or float32).
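
The per-segment replacement controlled by `p_replace` can be sketched as follows. To keep the example dependency-free, a regular grid of labels stands in for a real SLIC segmentation; that substitution, along with the helper names `grid_labels` and `replace_segments`, is an assumption for illustration only.

```python
import numpy as np


def grid_labels(height: int, width: int, cell: int) -> np.ndarray:
    """Stand-in for a superpixel segmentation: label pixels by square grid cell."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = (width + cell - 1) // cell
    return rows[:, None] * n_cols + cols[None, :]


def replace_segments(image, labels, p_replace, rng):
    """Replace each segment by its mean color with probability p_replace."""
    out = image.astype(np.float64).copy()
    for label in np.unique(labels):
        if rng.random() < p_replace:
            mask = labels == label
            out[mask] = image[mask].mean(axis=0)  # per-channel average
    return out.astype(image.dtype)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, (8, 8, 3), dtype=np.uint8)
labels = grid_labels(8, 8, cell=4)

unchanged = replace_segments(image, labels, p_replace=0.0, rng=rng)
flattened = replace_segments(image, labels, p_replace=1.0, rng=rng)
```

With `p_replace=0.0` the image comes back untouched, while `p_replace=1.0` turns every segment into a single flat color, which is the fully abstracted case described in the notes.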

UnsharpMask class

UnsharpMask(
    blur_limit: tuple[int, int] | int = (3, 7),
    sigma_limit: tuple[float, float] | float = 0.0,
    alpha: tuple[float, float] | float = (0.2, 0.5),
    threshold: int = 10,
    p: float = 0.5
)

Sharpen via unsharp masking: blur, subtract, add back. blur_limit, sigma_limit, alpha control strength. Luminance unchanged; edges enhanced. Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased sharpness. This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask which is combined with the original image to enhance edges and fine details.

Parameters

Name	Type	Default	Description
blur_limit	One of: tuple[int, int] int	(3, 7)	Maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in the range [0, inf). If set to 0, it will be computed from sigma as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`. If a single value is given, the kernel size will be sampled from the range (0, blur_limit). Default: (3, 7).
sigma_limit	One of: tuple[float, float] float	0.0	Gaussian kernel standard deviation. Must be greater than or equal to 0. If a single value is given, sigma will be sampled from the range (0, sigma_limit). If set to 0, sigma will be computed as `sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8`. Default: 0.
alpha	One of: tuple[float, float] float	(0.2, 0.5)	Range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).
threshold	int	10	Value to limit sharpening only to areas with a high pixel difference between the original image and its smoothed version. Higher threshold means less sharpening on flat areas. Must be in the range [0, 255]. Default: 10.
p	float	0.5	Probability of applying the transform. Default: 0.5.

Examples

>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Apply UnsharpMask with default parameters
>>> transform = A.UnsharpMask(p=1.0)
>>> sharpened_image = transform(image=image)['image']
>>>
>>> # Apply UnsharpMask with custom parameters
>>> transform = A.UnsharpMask(
...     blur_limit=(3, 7),
...     sigma_limit=(0.1, 0.5),
...     alpha=(0.2, 0.7),
...     threshold=15,
...     p=1.0
... )
>>> sharpened_image = transform(image=image)['image']

Notes

- The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian blurred version.
- The final image is computed as: output = I + M if |I - G| > threshold, else I.
- Higher alpha values increase the strength of the sharpening effect.
- Higher threshold values limit the sharpening effect to areas with more significant edges or details.
- The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.
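
The mask-and-threshold rule from the notes can be sketched in NumPy for a single-channel uint8 image. This is a minimal illustration of the stated formula, not the library's code: the helper names, the reflect padding, and the plain separable Gaussian blur are assumptions made here.

```python
import numpy as np


def gaussian_blur(image: np.ndarray, ksize: int, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2D float image with reflected borders."""
    x = np.arange(ksize) - (ksize - 1) / 2
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(image, ksize // 2, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def unsharp_mask(image, ksize=5, sigma=1.0, alpha=0.5, threshold=10):
    """output = I + (I - G) * alpha where |I - G| > threshold, else I."""
    original = image.astype(np.float64)
    residual = original - gaussian_blur(original, ksize, sigma)  # I - G
    sharpened = original + alpha * residual                      # I + M
    out = np.where(np.abs(residual) > threshold, sharpened, original)
    return np.clip(out, 0, 255).astype(np.uint8)


# A vertical step edge: dark on the left, bright on the right.
edge = np.full((16, 16), 50, dtype=np.uint8)
edge[:, 8:] = 200

sharpened_edge = unsharp_mask(edge, alpha=0.5, threshold=10)
untouched = unsharp_mask(edge, alpha=0.5, threshold=255)
```

Near the step, the dark side is pushed darker and the bright side brighter, while pixels far from the edge (where `|I - G|` is below the threshold) stay exactly as they were. Raising the threshold to 255 disables the effect entirely.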

References

- Unsharp Masking: https://en.wikipedia.org/wiki/Unsharp_masking