Transform Library Comparison Guide

This guide helps you find equivalent transforms between Albumentations and other popular libraries (torchvision and Kornia).

Key Differences

Compared to TorchVision

  • Albumentations operates on numpy arrays (TorchVision uses PyTorch tensors)
  • More parameters for fine-tuning transformations
  • Built-in support for mask augmentation
  • Better handling of bounding boxes and keypoints

Compared to Kornia

  • CPU-based numpy operations (Kornia uses GPU tensors)
  • More comprehensive support for detection/segmentation
  • Generally better CPU performance
  • Simpler API for common tasks

Common Transform Mappings

Basic Geometric Transforms

TorchVision Transform | Albumentations Equivalent | Notes
Resize | Resize / LongestMaxSize
- TorchVision's Resize combines two Albumentations behaviors:
  1. When given (h,w): equivalent to Albumentations Resize
  2. When given single int + max_size: similar to LongestMaxSize
- Albumentations allows separate interpolation method for masks
- TorchVision has antialias parameter, Albumentations doesn't
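
To make the single-int case concrete, the following is a minimal sketch of how the output size comes out when TorchVision's Resize receives one int plus max_size: the shorter side is matched to size, and the longer side is capped at max_size. The helper name `torchvision_resize_output_size` is hypothetical, and the arithmetic is an illustration of the documented behavior rather than the library's own code.

```python
from typing import Optional, Tuple


def torchvision_resize_output_size(
    height: int, width: int, size: int, max_size: Optional[int] = None
) -> Tuple[int, int]:
    """Hypothetical helper: approximate (height, width) after Resize(size=int)."""
    # Identify the shorter and longer side of the input.
    short, long = (width, height) if width <= height else (height, width)

    # The shorter side is resized to `size`; the longer side keeps the aspect ratio.
    new_short = size
    new_long = int(size * long / short)

    # If the longer side overshoots max_size, shrink both so it equals max_size.
    if max_size is not None and new_long > max_size:
        new_short = int(max_size * new_short / new_long)
        new_long = max_size

    # Map back to (height, width) order.
    if width <= height:
        return new_long, new_short
    return new_short, new_long


print(torchvision_resize_output_size(480, 640, size=256))
print(torchvision_resize_output_size(480, 640, size=256, max_size=300))
```

Under these assumptions, the uncapped case should behave like A.SmallestMaxSize(max_size=size), and once the cap is hit the result should match A.LongestMaxSize(max_size=max_size), up to rounding.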
ScaleJitter | OneOf + multiple Resize
- Can be approximated in Albumentations using OneOf container with multiple Resize transforms
- Example:
  transforms = A.OneOf([
    A.Resize(height=int(target_h * scale), width=int(target_w * scale))
    for scale in np.linspace(0.1, 2.0, num=20)
  ])
- Not exactly the same as continuous random scaling, but provides similar functionality
RandomShortestSize | OneOf + SmallestMaxSize
- Can be approximated in Albumentations using:
  transforms = A.OneOf([
    A.SmallestMaxSize(max_size=size)  # max_size here is the target for the shortest side
    for size in [480, 512, 544, 576, 608]
  ])
- Randomly selects size for shortest side while maintaining aspect ratio
- TorchVision's optional max_size parameter limits the longest side; SmallestMaxSize does not apply such a cap, so add a separate step if one is needed
- TorchVision has antialias parameter, Albumentations doesn't
RandomResize | OneOf + Resize
- TorchVision: randomly selects single size S between min_size and max_size, sets both width and height to S
- No direct equivalent in Albumentations (RandomScale preserves aspect ratio)
- Can be approximated using:
  transforms = A.OneOf([
    A.Resize(size, size)
    for size in range(min_size, max_size + 1, step)
  ])
RandomCrop | RandomCrop
- Both perform random cropping with similar core functionality
- Key differences:
  1. TorchVision accepts single int for square crop, Albumentations requires both height and width
  2. Padding options differ:
    - TorchVision: supports padding parameter for pre-padding
    - Albumentations: offers pad_position parameter ('center', 'top_left', etc.)
  3. Fill value handling:
    - TorchVision: supports dict mapping for different types
    - Albumentations: separate fill and fill_mask parameters
  4. Padding modes:
    - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'
    - Albumentations: uses OpenCV border modes
RandomResizedCrop | RandomResizedCrop
- Nearly identical functionality and parameters
- Key differences:
  1. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple
  2. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333))
  3. Albumentations adds:
    - Separate mask_interpolation parameter
    - Probability parameter p
RandomIoUCrop | RandomSizedBBoxSafeCrop
- Both ensure safe cropping with respect to bounding boxes
- Key differences:
  1. TorchVision:
    - Implements exact SSD paper approach
    - Uses IoU-based sampling strategy
    - Requires explicit sanitization of boxes after crop
  2. Albumentations:
    - Simpler approach ensuring bbox safety
    - Directly specifies target size
    - Automatically handles bbox cleanup
- For exact SSD-style cropping, might need custom implementation in Albumentations
CenterCrop | CenterCrop
- Both crop the center part of the input
- Key differences:
  1. Size specification:
    - TorchVision: accepts single int for square crop or (height, width) tuple
    - Albumentations: requires separate height and width parameters
  2. Padding behavior:
    - TorchVision: always pads with 0 if image is smaller
    - Albumentations: optional padding with pad_if_needed
  3. Albumentations adds:
    - Configurable padding mode and position
    - Separate fill values for image and mask
    - Probability parameter p
RandomHorizontalFlip | HorizontalFlip
- Identical functionality
- Both have default probability p=0.5
- Only naming difference: TorchVision includes "Random" in name
RandomVerticalFlip | VerticalFlip
- Identical functionality
- Both have default probability p=0.5
- Only naming difference: TorchVision includes "Random" in name
Pad | Pad
- Similar core padding functionality
- Both support:
  - Single int for all sides
  - (pad_x, pad_y) for symmetric padding
  - (left, top, right, bottom) for per-side padding
- Key differences:
  1. Padding modes:
    - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'
    - Albumentations: uses OpenCV border modes
  2. Fill value handling:
    - TorchVision: supports dict mapping for different types
    - Albumentations: separate fill and fill_mask parameters
  3. Albumentations adds:
    - Probability parameter p
RandomZoomOut | RandomScale + PadIfNeeded
- No direct equivalent in Albumentations
- Can be approximated by combining:
  A.Compose([
    A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), # scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0)
    A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, fill=fill)
  ])
- Key differences:
  1. TorchVision implements specific SSD paper approach
  2. Albumentations requires composition of two transforms
RandomRotation | Rotate
- Similar core rotation functionality but with different parameters
- Key differences:
  1. Angle specification:
    - TorchVision: degrees parameter (-degrees, +degrees) or (min, max)
    - Albumentations: limit parameter (-limit, +limit) or (min, max)
  2. Output size control:
    - TorchVision: expand=True/False
    - Albumentations: crop_border=True/False
  3. Additional Albumentations features:
    - Separate mask interpolation
    - Bbox rotation methods ('largest_box' or 'ellipse')
    - More border modes
    - Probability parameter p
  4. Center specification:
    - TorchVision: supports custom center point
    - Albumentations: always uses image center
RandomAffine | Affine
- Both support core affine operations (translation, rotation, scale, shear)
- Key differences:
  1. Parameter specification:
    - TorchVision: single parameters for each transform
    - Albumentations: more flexible with dict options for x/y axes
  2. Scale handling:
    - Albumentations adds keep_ratio and balanced_scale
    - Albumentations supports independent x/y scaling
  3. Translation:
    - TorchVision: fraction only
    - Albumentations: both percent and pixels
  4. Additional Albumentations features:
    - fit_output to adjust image plane
    - Separate mask interpolation
    - More border modes
    - Bbox rotation methods
    - Probability parameter p
RandomPerspective | Perspective
- Both apply random perspective transformations
- Key differences:
  1. Distortion control:
    - TorchVision: single distortion_scale (0 to 1)
    - Albumentations: scale tuple for corner movement range
  2. Output handling:
    - Albumentations adds keep_size and fit_output options
    - Can control whether to maintain original size
  3. Additional Albumentations features:
    - Separate mask interpolation
    - More border modes
    - Better control over output size and fitting
ElasticTransform | ElasticTransform
- Similar core functionality: both apply elastic deformations to images
- Key differences:
  1. Parameter names share the same roles, but the typical scales differ:
    - TorchVision: alpha (displacement magnitude), sigma (smoothness)
    - Albumentations: alpha (displacement scaling factor), sigma (Gaussian smoothing of the displacement field)
  2. Default values reflect this difference in scale:
    - TorchVision: alpha=50.0, sigma=5.0
    - Albumentations: alpha=1.0, sigma=50.0
- Note on implementation:
  - Albumentations follows Simard et al. 2003 paper more closely:
    - σ should be ~0.05 * image_size
    - α should be proportional to σ
- Additional Albumentations features:
  - approximate mode
  - same_dxdy option
  - Choice of noise distribution
  - Separate mask interpolation
ColorJitter | ColorJitter
- Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue
- Key similarities:
  1. Same parameter names and meanings
  2. Same value ranges (e.g., hue should be in [-0.5, 0.5])
  3. Random order of transformations
- Key differences:
  1. Default values:
    - TorchVision: all None by default
    - Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation
  2. Implementation:
    - TorchVision: uses Pillow
    - Albumentations: uses OpenCV (may produce slightly different results)
  3. Additional in Albumentations:
    - Explicit probability parameter p
    - Value saturation instead of uint8 overflow
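
Because the defaults differ (TorchVision applies no jitter when a parameter is None, while Albumentations falls back to non-trivial ranges), it can help to spell out every range explicitly when porting a pipeline. The sketch below uses a hypothetical helper, `explicit_color_jitter_kwargs`, that expands TorchVision-style arguments into explicit (min, max) tuples; passing the result to A.ColorJitter is an assumption about how you would use it, not something the helper does.

```python
from typing import Dict, Optional, Tuple, Union

Range = Tuple[float, float]
JitterArg = Optional[Union[float, Range]]


def _factor_range(value: JitterArg) -> Range:
    """Expand a multiplicative jitter argument; None means 'leave unchanged'."""
    if value is None:
        return (1.0, 1.0)
    if isinstance(value, (int, float)):
        # TorchVision samples the factor from [max(0, 1 - v), 1 + v].
        return (max(0.0, 1.0 - value), 1.0 + value)
    return (float(value[0]), float(value[1]))


def _hue_range(value: JitterArg) -> Range:
    """Expand a hue argument; None means 'no hue shift'."""
    if value is None:
        return (0.0, 0.0)
    if isinstance(value, (int, float)):
        # TorchVision samples the hue shift from [-v, v].
        return (-float(value), float(value))
    return (float(value[0]), float(value[1]))


def explicit_color_jitter_kwargs(
    brightness: JitterArg = None,
    contrast: JitterArg = None,
    saturation: JitterArg = None,
    hue: JitterArg = None,
) -> Dict[str, Range]:
    """Hypothetical helper: TorchVision-style arguments as explicit ranges."""
    return {
        "brightness": _factor_range(brightness),
        "contrast": _factor_range(contrast),
        "saturation": _factor_range(saturation),
        "hue": _hue_range(hue),
    }


print(explicit_color_jitter_kwargs(brightness=0.5, hue=0.1))
```

The returned dictionary could then be unpacked into the Albumentations transform (for example `A.ColorJitter(**kwargs, p=1.0)`), which avoids silently picking up the (0.8, 1.2) defaults for parameters that were disabled in the TorchVision pipeline.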
RandomChannelPermutation | ChannelShuffle
- Both randomly permute image channels
- Key similarities:
  1. Same core functionality
  2. Work on multi-channel images (typically RGB)
- Key differences:
  1. Naming convention only
  2. Albumentations adds:
    - Probability parameter p
RandomPhotometricDistort | RandomOrder + ColorJitter + ChannelShuffle
- TorchVision's transform is from SSD paper, combines:
  1. Color jittering (brightness, contrast, saturation, hue)
  2. Random channel permutation
- Can be replicated in Albumentations using:
  A.RandomOrder([
    A.ColorJitter(brightness=(0.875, 1.125),
                contrast=(0.5, 1.5),
                saturation=(0.5, 1.5),
                hue=(-0.05, 0.05),
                p=0.5),
    A.ChannelShuffle(p=0.5)
  ])
Grayscale | ToGray
- Similar core functionality: convert RGB to grayscale
- Key differences:
  1. Output channels:
    - TorchVision: only 1 or 3 channels
    - Albumentations: supports any number of output channels
  2. Conversion methods:
    - TorchVision: single method (weighted RGB)
    - Albumentations: multiple methods via method parameter:
      • weighted_average (default, same as TorchVision)
      • from_lab, desaturation, average, max, pca
  3. Additional in Albumentations:
    - Probability parameter p
    - More flexible channel handling
RGB | ToRGB
- Similar core functionality: convert to RGB format
- Key differences:
  1. Input handling:
    - TorchVision: accepts 1 or 3 channel inputs
    - Albumentations: only accepts single-channel inputs
  2. Output channels:
    - TorchVision: always 3 channels
    - Albumentations: configurable via num_output_channels
  3. Behavior:
    - TorchVision: converts to RGB if not already RGB
    - Albumentations: strictly grayscale to RGB conversion
  4. Additional in Albumentations:
    - Probability parameter p
RandomGrayscale | ToGray
- Similar core functionality: convert to grayscale with probability
- Key differences:
  1. Default probability:
    - TorchVision: p=0.1
    - Albumentations: p=0.5
  2. Output handling:
    - TorchVision: always preserves input channels
    - Albumentations: configurable output channels
  3. Conversion methods:
    - TorchVision: single method
    - Albumentations: multiple methods with different channel support:
      • weighted_average, from_lab: 3-channel only
      • desaturation, average, max, pca: any number of channels
  4. Channel requirements:
    - TorchVision: works with 1 or 3 channels
    - Albumentations: depends on method chosen
GaussianBlur | GaussianBlur
- Similar core functionality: apply Gaussian blur with random kernel size
- Key similarities:
  1. Both support random kernel sizes
  2. Both support random sigma values
- Key differences:
  1. Parameter specification:
    - TorchVision: kernel_size (exact size), sigma (range)
    - Albumentations: blur_limit (size range), sigma_limit (range)
  2. Kernel size constraints:
    - TorchVision: must specify exact size
    - Albumentations: can specify range (3, 7) or auto-compute
  3. Additional in Albumentations:
    - Probability parameter p
    - Auto-computation of kernel size from sigma
GaussianNoise | GaussNoise
- Similar core functionality: add Gaussian noise to images
- Key similarities:
  1. Both support mean and standard deviation parameters
- Key differences:
  1. Parameter ranges:
    - TorchVision: fixed values for mean and sigma
    - Albumentations: ranges for both (std_range, mean_range)
  2. Value handling:
    - TorchVision: expects float [0,1], has clip option
    - Albumentations: auto-scales based on dtype
  3. Additional in Albumentations:
    - Per-channel noise option
    - Noise scale factor for performance
    - Probability parameter p
RandomInvert | InvertImg
- Similar core functionality: invert image colors
- Key similarities:
  1. Both invert pixel values
  2. Both have default probability of 0.5
- Key differences:
  1. Value handling:
    - TorchVision: works with [0,1] float tensors
    - Albumentations: auto-handles uint8 (255) and float32 (1.0)
RandomPosterize | Posterize
- Similar core functionality: reduce color bits
- Key similarities:
  1. Both posterize images with probability p=0.5
- Key differences:
  1. Bits specification:
    - TorchVision: single fixed value [0-8]
    - Albumentations: flexible options with [1-7] (recommended):
      • Single value for all channels
      • Range (min_bits, max_bits)
      • Per-channel values [r,g,b]
      • Per-channel ranges [(r_min,r_max), ...]
  2. Practical range:
    - TorchVision: includes 0 (black) and 8 (unchanged)
    - Albumentations: recommended [1-7] for actual posterization
RandomSolarize | Solarize
- Similar core functionality: invert pixels above threshold
- Key similarities:
  1. Both have default probability p=0.5
  2. Both invert values above threshold
- Key differences:
  1. Threshold specification:
    - TorchVision: single fixed threshold value
    - Albumentations: range via threshold_range
  2. Value handling:
    - TorchVision: works with raw threshold values
    - Albumentations: uses normalized [0,1] range:
      • uint8: multiplied by 255
      • float32: multiplied by 1.0
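
Since the two libraries express the threshold on different scales, a small conversion is usually needed when porting. The sketch below assumes a raw TorchVision threshold (for example 128 on a uint8 image) and turns it into a degenerate normalized range suitable for threshold_range; the helper name `solarize_threshold_range` is hypothetical.

```python
from typing import Tuple


def solarize_threshold_range(
    threshold: float, max_value: float = 255.0
) -> Tuple[float, float]:
    """Hypothetical helper: raw threshold -> normalized (min, max) in [0, 1]."""
    if not 0 <= threshold <= max_value:
        raise ValueError("threshold must lie in [0, max_value]")
    normalized = threshold / max_value
    # A fixed TorchVision threshold becomes a range whose two ends coincide.
    return (normalized, normalized)


print(solarize_threshold_range(128))                 # uint8-style threshold
print(solarize_threshold_range(0.5, max_value=1.0))  # float-style threshold
```

To recover randomness on the Albumentations side, widen the tuple instead of returning identical endpoints.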
RandomAdjustSharpness | Sharpen
- Similar core functionality: adjust image sharpness
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - TorchVision: single sharpness_factor
      • 0: blurred
      • 1: original
      • 2: doubled sharpness
    - Albumentations: more controls:
      • alpha: effect visibility [0,1]
      • lightness: contrast control
  2. Method options:
    - TorchVision: single method
    - Albumentations: two methods:
      • 'kernel': Laplacian operator
      • 'gaussian': blur interpolation
RandomAutocontrast | AutoContrast
- Same core functionality with identical parameters (p=0.5)
RandomEqualize | Equalize
- Similar core functionality: histogram equalization
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Additional Albumentations features:
    - Choice of algorithm (cv/pil methods)
    - Per-channel or luminance-based equalization
    - Optional masking support
Normalize | Normalize
- Similar core functionality: normalize image values
- Key similarities:
  1. Both support mean/std normalization
- Key differences:
  1. Normalization options:
    - TorchVision: only (input - mean) / std
    - Albumentations: multiple methods:
      • standard (same as TorchVision)
      • image (global stats)
      • image_per_channel
      • min_max
      • min_max_per_channel
  2. Additional in Albumentations:
    - max_pixel_value parameter
    - Probability parameter p
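
The role of max_pixel_value is easiest to see on a single pixel. The sketch below compares the TorchVision formula, applied to a value already scaled to [0, 1], with the Albumentations "standard" formula, applied to the raw uint8 value. Both function names are hypothetical and work on plain floats rather than images.

```python
def torchvision_normalize(pixel_01: float, mean: float, std: float) -> float:
    """(input - mean) / std on a pixel already scaled to [0, 1]."""
    return (pixel_01 - mean) / std


def albumentations_standard_normalize(
    pixel: float, mean: float, std: float, max_pixel_value: float = 255.0
) -> float:
    """(pixel - mean * max_pixel_value) / (std * max_pixel_value) on a raw pixel."""
    return (pixel - mean * max_pixel_value) / (std * max_pixel_value)


raw_pixel = 128.0
mean, std = 0.485, 0.229  # commonly used ImageNet statistics for one channel

tv = torchvision_normalize(raw_pixel / 255.0, mean, std)
alb = albumentations_standard_normalize(raw_pixel, mean, std)
print(tv, alb)  # the two values agree up to floating-point error
```

In other words, the same mean and std values can be reused across both libraries as long as max_pixel_value matches the scale of the raw input.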
RandomErasing | Erasing
- Similar core functionality: randomly erase image regions
- Key similarities:
  1. Both have default probability p=0.5
  2. Same default scale=(0.02, 0.33)
  3. Same default ratio=(0.3, 3.3)
- Key differences:
  1. Fill value options:
    - TorchVision: number/tuple or 'random'
    - Albumentations: additional options:
      • random_uniform
      • inpaint_telea
      • inpaint_ns
  2. Additional in Albumentations:
    - Mask fill value option
    - Support for masks, bboxes, keypoints
JPEG | ImageCompression
- Similar core functionality: apply JPEG compression
- Key similarities:
  1. Both use quality range 1-100
  2. Both support quality ranges
- Key differences:
  1. Compression types:
    - TorchVision: JPEG only
    - Albumentations: JPEG and WebP
  2. Additional in Albumentations:
    - Probability parameter p
    - Default quality range (99, 100)

Kornia to Albumentations

Kornia | Albumentations | Notes
ColorJitter | ColorJitter
- Similar core functionality: randomly adjust brightness, contrast, saturation, and hue
- Key similarities:
  1. Both support same parameters (brightness, contrast, saturation, hue)
  2. Both allow float or tuple ranges for parameters
- Key differences:
  1. Default values:
    - Albumentations: (0.8, 1.2) for brightness/contrast/saturation
    - Kornia: 0.0 for all parameters
  2. Default probability:
    - Albumentations: p=0.5
    - Kornia: p=1.0
  3. Note: Kornia recommends using ColorJiggle instead as it follows color theory better
RandomAutoContrast | AutoContrast
- Similar core functionality: enhance image contrast automatically
- Key similarities:
  1. Both stretch intensity range to use full range
  2. Both preserve relative intensities
- Key differences:
  1. Default probability:
    - Albumentations: p=0.5
    - Kornia: p=1.0
  2. Additional in Kornia:
    - clip_output parameter to control value clipping
RandomBoxBlur | Blur
- Similar core functionality: apply box/average blur to images
- Key similarities:
  1. Both have default probability p=0.5
  2. Both apply box/average blur filter
- Key differences:
  1. Kernel size specification:
    - Albumentations: blur_limit parameter for range (e.g., (3, 7))
    - Kornia: fixed kernel_size tuple (default (3, 3))
  2. Additional in Kornia:
    - border_type parameter ('reflect', 'replicate', 'circular')
    - normalized parameter for L1 norm control
RandomBrightness | RandomBrightnessContrast
- Different scope:
  - Kornia: brightness only
  - Albumentations: combines brightness and contrast
- Key differences:
  1. Parameter specification:
    - Kornia: brightness tuple (default: (1.0, 1.0))
    - Albumentations: brightness_limit (default: (-0.2, 0.2))
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - brightness_by_max parameter for adjustment method
    - ensure_safe_range to prevent overflow/underflow
    - Combined contrast control
  4. Additional in Kornia:
    - clip_output parameter
RandomChannelDropout | ChannelDropout
- Similar core functionality: randomly drop image channels
- Key similarities:
  1. Both have default probability p=0.5
  2. Both allow specifying fill value for dropped channels
- Key differences:
  1. Channel drop specification:
    - Kornia: fixed num_drop_channels (default: 1)
    - Albumentations: flexible channel_drop_range tuple (default: (1, 1))
  2. Error handling:
    - Albumentations: explicit checks for single-channel images and invalid ranges
    - Kornia: simpler parameter validation
RandomChannelShuffle | ChannelShuffle
- Identical core functionality: randomly shuffle image channels
RandomClahe | CLAHE
- Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization
- Key similarities:
  1. Both have default probability p=0.5
  2. Both allow configuring grid size and clip limit
- Key differences:
  1. Parameter defaults:
    - Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8)
    - Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8)
  2. Additional in Kornia:
    - slow_and_differentiable parameter for implementation choice
RandomContrast | RandomBrightnessContrast
- Different scope:
  - Kornia: contrast only
  - Albumentations: combines brightness and contrast
- Key differences:
  1. Parameter specification:
    - Kornia: contrast tuple (default: (1.0, 1.0))
    - Albumentations: contrast_limit (default: (-0.2, 0.2))
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - ensure_safe_range to prevent overflow/underflow
    - Combined brightness control
  4. Additional in Kornia:
    - clip_output parameter
RandomEqualize | Equalize
- Similar core functionality: apply histogram equalization
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Additional in Albumentations:
    - mode parameter to choose between 'cv' and 'pil' methods
    - by_channels parameter for per-channel or luminance-based equalization
    - mask parameter to selectively apply equalization
    - mask_params for dynamic mask generation
RandomGamma | RandomGamma
- Similar core functionality: apply random gamma correction
- Key differences:
  1. Parameter specification:
    - Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples
    - Albumentations: single gamma_limit (80, 120) as percentage range
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - eps parameter to prevent numerical errors
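
Since Albumentations expresses gamma as a percentage range while Kornia uses the exponent directly, porting a range is a matter of scaling by 100. The sketch below uses a hypothetical helper, `kornia_gamma_to_gamma_limit`; it covers the gamma tuple only and makes no attempt to translate Kornia's gain, which has no counterpart in the parameters listed above.

```python
from typing import Tuple


def kornia_gamma_to_gamma_limit(gamma: Tuple[float, float]) -> Tuple[float, float]:
    """Hypothetical helper: gamma exponents (e.g. 0.8..1.2) -> percent range (80..120)."""
    low, high = gamma
    if low <= 0 or low > high:
        raise ValueError("expected 0 < low <= high")
    # Round to hide floating-point noise such as 110.00000000000001.
    return (round(low * 100.0, 6), round(high * 100.0, 6))


print(kornia_gamma_to_gamma_limit((0.8, 1.2)))
```

Remember to also set p explicitly when porting, because the default probabilities differ between the two libraries.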
RandomGaussianBlur | GaussianBlur
- Similar core functionality: apply Gaussian blur with random parameters
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support kernel size and sigma parameters
- Key differences:
  1. Parameter specification:
    - Kornia: requires explicit kernel_size and sigma range
    - Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0)
  2. Additional in Kornia:
    - border_type parameter for padding mode
    - separable parameter for 1D convolution optimization
RandomGaussianIllumination | Illumination
- Similar core functionality: apply illumination effects
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support controlling effect intensity and position
- Key differences:
  1. Scope:
    - Kornia: Gaussian illumination patterns only
    - Albumentations: Multiple modes (linear, corner, gaussian)
  2. Parameter ranges:
    - Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0)
    - Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0)
  3. Additional in Albumentations:
    - mode parameter for different effect types
    - effect_type for brighten/darken control
    - angle_range for linear gradients
  4. Additional in Kornia:
    - sign parameter for effect direction
RandomGaussianNoise | GaussNoise
- Similar core functionality: add Gaussian noise to images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: fixed mean (default: 0.0) and std (default: 1.0)
    - Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0)
  2. Additional in Albumentations:
    - per_channel parameter for independent channel noise
    - noise_scale_factor for performance optimization
    - Automatic value scaling based on image dtype
RandomGrayscale | ToGray
- Similar core functionality: convert images to grayscale
- Key differences:
  1. Default probability:
    - Kornia: p=0.1
    - Albumentations: p=0.5
  2. Conversion options:
    - Kornia: customizable rgb_weights for channel mixing
    - Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca)
  3. Output control:
    - Kornia: always 3-channel output
    - Albumentations: configurable num_output_channels
RandomHue | ColorJitter (hue parameter)
- Similar core functionality: adjust image hue
- Key differences:
  1. Scope:
    - Kornia: hue-only transform
    - Albumentations: part of ColorJitter with brightness, contrast, and saturation
  2. Default values:
    - Kornia: hue=(0.0, 0.0), p=1.0
    - Albumentations: hue=(-0.5, 0.5), p=0.5
RandomInvert | InvertImg
- Similar core functionality: invert image values
- Key differences:
  1. Maximum value handling:
    - Kornia: configurable via max_val parameter (default: 1.0)
    - Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32)
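
The difference in maximum value handling comes down to which constant the pixel is subtracted from. The sketch below illustrates the dtype-based rule described for Albumentations on single values; the lookup table and the function name `invert_value` are hypothetical and only cover the two dtypes mentioned above.

```python
# Maximum value per dtype, as described for the Albumentations side.
MAX_VALUE_BY_DTYPE = {"uint8": 255, "float32": 1.0}


def invert_value(value: float, dtype: str) -> float:
    """Hypothetical helper: invert one pixel value as max_value - value."""
    max_value = MAX_VALUE_BY_DTYPE[dtype]
    return max_value - value


print(invert_value(200, "uint8"))     # uint8 images are inverted against 255
print(invert_value(0.25, "float32"))  # float32 images are inverted against 1.0
```

With Kornia, the same subtraction is driven by max_val, so uint8-range tensors need max_val set explicitly rather than relying on the default of 1.0.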
RandomJPEG | ImageCompression
- Similar core functionality: apply image compression
- Key differences:
  1. Compression options:
    - Kornia: JPEG only
    - Albumentations: supports both JPEG and WebP
  2. Quality specification:
    - Kornia: jpeg_quality (default: 50.0)
    - Albumentations: quality_range (default: (99, 100))
  3. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
RandomLinearCornerIllumination | Illumination (corner mode)
- Similar core functionality: apply corner illumination effects
- Key differences:
  1. Scope:
    - Kornia: corner illumination only
    - Albumentations: part of general Illumination transform with multiple modes
  2. Parameter specification:
    - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)
    - Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both)
  3. Additional in Albumentations:
    - Multiple illumination modes (linear, corner, gaussian)
    - More control over effect parameters
RandomLinearIllumination | Illumination (linear mode)
- Similar core functionality: apply linear illumination effects
- Key differences:
  1. Scope:
    - Kornia: linear illumination only
    - Albumentations: part of general Illumination transform with multiple modes
  2. Parameter specification:
    - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)
    - Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360)
  3. Additional in Albumentations:
    - Multiple illumination modes (linear, corner, gaussian)
    - Explicit angle control for gradient direction
RandomMedianBlur | MedianBlur
- Similar core functionality: apply median blur filter
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Kernel size specification:
    - Kornia: fixed kernel_size tuple (default: (3, 3))
    - Albumentations: range via blur_limit (default: (3, 7))
  2. Kernel constraints:
    - Albumentations: enforces odd kernel sizes
RandomMotionBlur | MotionBlur
- Similar core functionality: apply directional motion blur
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support angle and direction control
- Key differences:
  1. Kernel size specification:
    - Kornia: kernel_size as int or tuple
    - Albumentations: blur_limit (default: (3, 7))
  2. Angle control:
    - Kornia: angle parameter with symmetric range (-angle, angle)
    - Albumentations: angle_range (default: (0, 360))
  3. Additional in Albumentations:
    - allow_shifted parameter for kernel position control
RandomPlanckianJitter | PlanckianJitter
- Similar core functionality: apply physics-based color temperature variations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support 'blackbody' and 'cied' modes
- Key differences:
  1. Temperature control:
    - Kornia: select_from parameter for discrete jitter selection
    - Albumentations: temperature_limit for continuous range
  2. Additional in Albumentations:
    - sampling_method parameter ('uniform' or 'gaussian')
    - More detailed control over temperature ranges
    - Better documentation of physics-based effects
RandomPlasmaBrightness | PlasmaBrightnessContrast
- Similar core functionality: apply fractal-based brightness adjustments
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0)
    - Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0)
  2. Additional in Albumentations:
    - Combined brightness and contrast adjustment
    - plasma_size parameter for pattern detail control
    - More detailed mathematical formulation and documentation
RandomPlasmaContrast | PlasmaBrightnessContrast
- Similar core functionality: apply fractal-based contrast adjustments
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7) only
    - Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256)
  2. Scope:
    - Kornia: contrast-only adjustment
    - Albumentations: combined brightness and contrast adjustment
  3. Additional in Albumentations:
    - More detailed mathematical formulation
    - Pattern size control via plasma_size
RandomPlasmaShadow | PlasmaShadow
- Similar core functionality: apply fractal-based shadow effects
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0)
    - Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0)
  2. Additional in Albumentations:
    - Pattern size control via plasma_size
    - More intuitive intensity range (0 to 1)
    - More detailed mathematical formulation and documentation
RandomPosterize | Posterize
- Similar core functionality: reduce color bits in image
- Key similarities:
  1. Both have default probability p=0.5
  2. Both operate on color bit reduction
- Key differences:
  1. Bit specification:
    - Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple
    - Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats:
      * Single int for all channels
      * Tuple for random range
      * List for per-channel specification
      * List of tuples for per-channel ranges
  2. Additional in Albumentations:
    - More flexible channel-wise control
    - More detailed documentation and mathematical background
RandomRain | RandomRain
- Similar core functionality: add rain effects to images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Rain parameter specification:
    - Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5)
    - Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1)
  2. Additional in Albumentations:
    - drop_color customization
    - blur_value for atmospheric effect
    - brightness_coefficient for lighting adjustment
    - rain_type presets (drizzle, heavy, torrential)
  3. Approach:
    - Kornia: Direct drop placement
    - Albumentations: More realistic simulation with slant, blur, and brightness effects
RandomRGBShift | AdditiveNoise
- Similar core functionality: add noise/shifts to image channels
- Key similarities:
  1. Both have default probability p=0.5
  2. Both can affect individual channels
- Key differences:
  1. Approach:
    - Kornia: Simple RGB channel shifts with individual limits
    - Albumentations: More sophisticated noise generation with multiple distributions
  2. Parameter specification:
    - Kornia: r_shift_limit, g_shift_limit, b_shift_limit (all default: 0.5)
    - Albumentations: Flexible noise configuration with:
      * Multiple noise types (uniform, gaussian, laplace, beta)
      * Different spatial modes (constant, per_pixel, shared)
      * Customizable distribution parameters
  3. Additional in Albumentations:
    - Performance optimization options
    - More detailed control over noise distribution
    - Spatial application modes
RandomSaltAndPepperNoise | SaltAndPepper
- Similar core functionality: apply salt and pepper noise to images
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use same default parameters:
    - amount (0.01, 0.06)
    - salt_vs_pepper (0.4, 0.6)
- Key differences:
  1. Parameter flexibility:
    - Kornia: Supports single float or tuple for parameters
    - Albumentations: Requires tuples for ranges
  2. Documentation:
    - Albumentations provides:
      * Detailed mathematical formulation
      * Clear examples for different noise levels
      * Implementation notes and edge cases
      * References to academic sources
RandomSaturation | ColorJitter | - Different scope and functionality
- Key differences:
  1. Scope:
    - Kornia: Saturation-only adjustment
    - Albumentations: Combined brightness, contrast, saturation, and hue adjustment
  2. Default parameters:
    - Kornia: saturation (1.0, 1.0), p=1.0
    - Albumentations: saturation (0.8, 1.2), p=0.5
  3. Implementation:
    - Kornia: Aligns with PIL/TorchVision implementation
    - Albumentations: Uses OpenCV with noted differences in HSV conversion
  4. Additional in Albumentations:
    - Brightness adjustment
    - Contrast adjustment
    - Hue adjustment
    - Random order of transformations
RandomSharpness | Sharpen | - Similar core functionality: sharpen images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: Single sharpness parameter (default: 0.5)
    - Albumentations: More detailed control with:
      * alpha (0.2, 0.5) for effect visibility
      * lightness (0.5, 1.0) for contrast
      * method choice ('kernel' or 'gaussian')
      * kernel_size and sigma for gaussian method
  2. Implementation methods:
    - Kornia: Single approach
    - Albumentations: Two methods:
      * Kernel-based using Laplacian operator
      * Gaussian interpolation
  3. Documentation:
    - Albumentations provides detailed mathematical formulation and references
RandomSnow | RandomSnow | - Similar core functionality: add snow effects to images
- Key differences:
  1. Parameter specification:
    - Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0
    - Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5
  2. Implementation methods:
    - Kornia: Single approach
    - Albumentations: Two methods:
      * "bleach": Simple pixel value thresholding
      * "texture": Advanced snow texture simulation
  3. Additional in Albumentations:
    - Detailed snow simulation with:
      * HSV color space manipulation
      * Gaussian noise for texture
      * Depth effect simulation
      * Sparkle effects
  4. Documentation:
    - Albumentations provides detailed mathematical formulation and implementation notes
RandomSolarize | Solarize | - Similar core functionality: invert pixel values above threshold
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: Two parameters:
      * thresholds (default: 0.1) for threshold range
      * additions (default: 0.1) for value adjustment
    - Albumentations: Single parameter:
      * threshold_range (default: (0.5, 0.5))
  2. Threshold handling:
    - Kornia: Generates from (0.5 - x, 0.5 + x) for float input
    - Albumentations: Direct range specification, scaled by image type max value
  3. Documentation:
    - Albumentations provides:
      * Detailed examples for both uint8 and float32 images
      * Clear mathematical formulation
      * Image type-specific behavior explanation
CenterCrop | CenterCrop | - Similar core functionality: crop center of image
- Key similarities:
  1. Both have default probability p=1.0
- Key differences:
  1. Size specification:
    - Kornia: Single size parameter (int or tuple)
    - Albumentations: Separate height and width parameters
  2. Additional features:
    - Kornia:
      * align_corners for interpolation
      * resample mode selection
      * cropping_mode ('slice' or 'resample')
    - Albumentations:
      * pad_if_needed for handling small images
      * border_mode for padding method
      * fill and fill_mask for padding values
      * pad_position options
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Supports images, masks, bboxes, and keypoints
PadTo | PadIfNeeded | - Can achieve the same core functionality
- Key similarities:
  1. Both have default probability p=1.0
  2. Can pad to exact size:
    - Kornia: size=(height, width)
    - Albumentations: min_height=height, min_width=width
- Key differences:
  1. Parameter naming:
    - Kornia: Single size tuple
    - Albumentations: Separate dimension parameters
  2. Additional features:
    - Kornia:
      * Simple pad_mode selection
      * Single pad_value
    - Albumentations:
      * Flexible position options
      * Separate fill and fill_mask
      * Optional divisibility padding
      * Multiple target support
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomAffine | Affine | - Similar core functionality: apply affine transformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support rotation, translation, scaling, and shear
- Key differences:
  1. Parameter specification:
    - Kornia:
      * degrees for rotation
      * translate as fraction
      * scale as tuple
      * shear in degrees
    - Albumentations:
      * More flexible parameter formats
      * Supports both percent and pixel translation
      * Dictionary format for independent axis control
  2. Additional features in Albumentations:
    * fit_output for automatic size adjustment
    * keep_ratio for aspect ratio preservation
    * rotate_method options
    * balanced_scale for even scale distribution
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomCrop | RandomCrop | - Similar core functionality: randomly crop image patches
- Key similarities:
  1. Both have default probability p=1.0
  2. Both support padding if needed
- Key differences:
  1. Size specification:
    - Kornia: Single size tuple (height, width)
    - Albumentations: Separate height and width parameters
  2. Padding options:
    - Kornia:
      * Flexible padding sizes (int, tuple[2], tuple[4])
      * Multiple padding modes (constant, reflect, replicate)
      * Single fill value
    - Albumentations:
      * Simpler padding interface
      * Separate fill values for image and mask
      * Flexible pad positioning
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomElasticTransform | ElasticTransform | - Similar core functionality: apply elastic deformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Gaussian smoothing for displacement fields
  3. Both support independent control of x/y deformations:
    - Kornia: via separate values in sigma/alpha tuples
    - Albumentations: via same_dxdy parameter
- Key differences:
  1. Parameter specification:
    - Kornia:
      * kernel_size tuple (63, 63)
      * sigma tuple (32.0, 32.0)
      * alpha tuple (1.0, 1.0)
    - Albumentations:
      * Single sigma (default: 50.0)
      * Single alpha (default: 1.0)
  2. Additional features:
    - Kornia:
      * Control over padding mode
    - Albumentations:
      * approximate mode for faster processing
      * Choice of noise distribution
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomErasing | Erasing | - Similar core functionality: randomly erase rectangular regions
- Key similarities:
  1. Both have default probability p=0.5
  2. Same default parameters:
    * scale (0.02, 0.33)
    * ratio (0.3, 3.3)
- Key differences:
  1. Fill value options:
    - Kornia: Simple numeric value (default: 0.0)
    - Albumentations: Rich fill options:
      * Numeric values
      * "random" per pixel
      * "random_uniform" per region
      * "inpaint_telea" method
      * "inpaint_ns" method
  2. Additional features in Albumentations:
    * Separate mask_fill value
    * Support for masks, bboxes, keypoints
    * Inpainting options for more natural-looking results
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomFisheye | OpticalDistortion | - Similar core functionality: apply optical/fisheye distortion
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support fisheye distortion
- Key differences:
  1. Parameter specification:
    - Kornia:
      * Separate center_x, center_y for distortion center
      * gamma for distortion strength
    - Albumentations:
      * Single distort_limit parameter
      * mode selection ('camera' or 'fisheye')
  2. Distortion models:
    - Kornia: Fisheye only
    - Albumentations:
      * Camera matrix model
      * Fisheye model
  3. Additional features in Albumentations:
    * Separate interpolation methods for image and mask
    * Support for masks, bboxes, keypoints
  4. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomHorizontalFlip | HorizontalFlip | - Similar core functionality: flip image horizontally
- Key similarities:
  1. Both have default probability p=0.5
  2. Simple operation with same visual result
- Key differences:
  1. Batch handling:
    - Kornia:
      * Additional p_batch parameter
      * same_on_batch option
    - Albumentations: No batch-specific parameters
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomPerspective | Perspective | - Similar core functionality: apply perspective transformation
- Key similarities:
  1. Both have default probability p=0.5
  2. Both transform image by moving corners
  3. Both support different interpolation methods:
    - Kornia: via resample (BILINEAR, NEAREST)
    - Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.)
- Key differences:
  1. Distortion control:
    - Kornia:
      * distortion_scale (0 to 1, default: 0.5)
      * sampling_method ('basic' or 'area_preserving')
    - Albumentations:
      * scale tuple for corner movement range
      * fit_output option for image capture
  2. Output handling:
    - Kornia:
      * align_corners parameter
      * keepdim for batch form
    - Albumentations:
      * keep_size for output dimensions
      * Border mode and fill options
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomResizedCrop | RandomResizedCrop | - Similar core functionality: crop random patches and resize
- Key similarities:
  1. Both have default probability p=1.0
  2. Same default parameters:
    * scale (0.08, 1.0)
    * ratio (~0.75, ~1.33)
  3. Both support different interpolation methods:
    - Kornia: via resample
    - Albumentations: via interpolation
- Key differences:
  1. Implementation options:
    - Kornia:
      * cropping_mode ('slice' or 'resample')
      * align_corners parameter
      * keepdim for batch form
    - Albumentations:
      * Separate mask interpolation
      * Fallback to center crop after 10 attempts
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomRotation90 | RandomRotate90 | - Similar core functionality: rotate image by 90 degrees
- Key similarities:
  1. Both have default probability p=0.5
  2. Both rotate in 90-degree increments
- Key differences:
  1. Rotation control:
    - Kornia:
      * times parameter to specify range of rotations
      * resample and align_corners for interpolation
    - Albumentations:
      * Simpler implementation (0-3 rotations)
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomRotation | Rotate | - Similar core functionality: rotate image by random angle
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support different interpolation methods
- Key differences:
  1. Angle specification:
    - Kornia: degrees parameter (if single value, range is (-degrees, +degrees))
    - Albumentations: limit parameter (default: (-90, 90))
  2. Additional features:
    - Kornia:
      * align_corners for interpolation
    - Albumentations:
      * Border mode options
      * Fill values for padding
      * rotate_method for bboxes
      * crop_border option
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomShear | Affine (shear parameter) | - Similar core functionality: apply shear transformation
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support different interpolation methods
  3. Both support independent x/y shear control
- Key differences:
  1. Parameter specification:
    - Kornia:
      * Dedicated shear transform
      * shear parameter supports float, tuple(2), or tuple(4)
      * Simple padding modes (zeros, border, reflection)
    - Albumentations:
      * Part of general Affine transform
      * shear supports number, tuple, or dict format
      * More border modes and fill options
  2. Additional features in Albumentations:
    * Separate mask interpolation
    * fit_output option
    * Combined with other affine transforms
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomThinPlateSpline | ThinPlateSpline | - Similar core functionality: apply smooth, non-rigid deformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use thin plate spline algorithm
  3. Both support interpolation options
- Key differences:
  1. Deformation control:
    - Kornia:
      * Single scale parameter (default: 0.2)
      * Fixed control point grid
    - Albumentations:
      * scale_range tuple for range of deformation
      * Configurable num_control_points
  2. Implementation details:
    - Kornia:
      * align_corners parameter
      * Binary mode choice (bilinear/nearest)
    - Albumentations:
      * OpenCV interpolation flags
      * More granular control over grid
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomVerticalFlip | VerticalFlip | - Similar core functionality: flip image vertically
- Key similarities:
  1. Both have default probability p=0.5
  2. Simple operation with same visual result
- Key differences:
  1. Implementation:
    - Kornia:
      * Additional p_batch parameter
    - Albumentations:
      * Simpler implementation
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
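
Many of the parameter differences in the Kornia table above are differences of shape rather than meaning. The sketch below collects a few of the conversions the table describes. All four helper names are hypothetical (neither library provides them), and the treatment of a bare int `size` as a square crop is an assumption.

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]


def rotation_limit(degrees: Union[Number, Tuple[Number, Number]]) -> Tuple[float, float]:
    """Kornia-style `degrees` -> explicit (low, high) range for a `limit` argument.

    A single value d is read as the symmetric range (-d, +d), as the table states.
    """
    if isinstance(degrees, (int, float)):
        magnitude = abs(float(degrees))
        return (-magnitude, magnitude)
    low, high = degrees
    return (float(low), float(high))


def solarize_threshold_range(thresholds: float) -> Tuple[float, float]:
    """Kornia float `thresholds=x` -> the (0.5 - x, 0.5 + x) range it samples from."""
    return (0.5 - thresholds, 0.5 + thresholds)


def crop_kwargs(size: Union[int, Tuple[int, int]]) -> Dict[str, int]:
    """Kornia `size` (int or (height, width)) -> separate height/width keywords.

    Assumption: a bare int means a square crop.
    """
    if isinstance(size, int):
        return {"height": size, "width": size}
    height, width = size
    return {"height": height, "width": width}


def pad_kwargs(size: Tuple[int, int]) -> Dict[str, int]:
    """Kornia `PadTo(size=(height, width))` -> minimum-dimension keywords."""
    height, width = size
    return {"min_height": height, "min_width": width}
```

The returned dictionaries are meant to be unpacked into the matching Albumentations constructor, for example `A.CenterCrop(**crop_kwargs((224, 224)))` or `A.PadIfNeeded(**pad_kwargs((256, 512)))`. Defaults such as `p` still differ between the libraries and need to be set explicitly.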

Key Differences

Compared to TorchVision

  • Albumentations operates on numpy arrays instead of PyTorch tensors
  • Albumentations typically provides more parameters for fine-tuning transformations
  • Most Albumentations transforms support both image and mask augmentation
  • Better support for bounding box and keypoint augmentation

Compared to Kornia

  • Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays
  • Albumentations provides more comprehensive support for object detection and segmentation tasks
  • Albumentations typically offers better performance for CPU-based augmentations

Performance Comparison

According to benchmarking results, Albumentations generally offers superior CPU performance compared to TorchVision and Kornia for most transforms. Here are some key highlights:

Common Transforms Performance (images/second, higher is better)

Transform | Albumentations | TorchVision | Kornia | Notes
HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia
VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia
RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia
Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both
ColorJitter | 628 | 46 | 55 | Albumentations is ~14x faster than TorchVision, ~11x faster than Kornia
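
The speedup factors in the Notes column are simply ratios of the throughput figures. A small sketch that recomputes them from the numbers in the table (the `THROUGHPUT` dictionary and the `speedups` helper are illustrative names, not part of any library):

```python
# Throughput in images/second, copied from the table above:
# (Albumentations, TorchVision, Kornia)
THROUGHPUT = {
    "HorizontalFlip": (8618, 914, 390),
    "VerticalFlip": (22847, 3198, 1212),
    "RandomResizedCrop": (2828, 511, 287),
    "Normalize": (1196, 519, 626),
    "ColorJitter": (628, 46, 55),
}


def speedups(albumentations: float, torchvision: float, kornia: float) -> tuple:
    """Return (vs TorchVision, vs Kornia) speedup factors, rounded to one decimal."""
    return (
        round(albumentations / torchvision, 1),
        round(albumentations / kornia, 1),
    )


SPEEDUPS = {name: speedups(*row) for name, row in THROUGHPUT.items()}

for name, (vs_torchvision, vs_kornia) in SPEEDUPS.items():
    print(f"{name}: {vs_torchvision}x vs TorchVision, {vs_kornia}x vs Kornia")
```

These ratios describe the benchmarked machine only. On other hardware the factors can shift even if the ordering stays the same.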

Key Performance Insights:

  • Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
  • Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
  • Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU

When to Choose Each Library:

  • Albumentations: Best choice for CPU-based preprocessing pipelines and when maximum performance is needed
  • Kornia: Consider when doing augmentation on GPU with existing PyTorch tensors
  • TorchVision: Good choice when deeply integrated into PyTorch ecosystem and GPU performance isn't critical

Note: Benchmarks performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.
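
Since results depend on hardware, it can be worth measuring throughput locally. The following is a minimal timing sketch using only the standard library. It is not the harness behind the numbers above: `images_per_second` is a hypothetical helper, and `flip_horizontal` is a pure-Python stand-in for a real transform so that the snippet runs without any augmentation library installed.

```python
import time
from typing import Any, Callable, Sequence


def images_per_second(
    transform: Callable[[Any], Any],
    images: Sequence[Any],
    repeats: int = 3,
) -> float:
    """Best-of-`repeats` throughput of `transform` over `images`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for image in images:
            transform(image)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very coarse timers.
    return len(images) / best if best > 0 else float("inf")


def flip_horizontal(image):
    """Stand-in transform: reverse each row of a nested-list image."""
    return [row[::-1] for row in image]


# 100 dummy 32x32 single-channel images.
dummy_images = [[[x for x in range(32)] for _ in range(32)] for _ in range(100)]
rate = images_per_second(flip_horizontal, dummy_images)
print(f"{rate:.0f} images/second")
```

To time a real pipeline, pass a callable that wraps it, for example `lambda img: pipeline(image=img)` for an Albumentations `Compose`. Using the same images and the same thread settings for every library keeps the comparison fair.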

Code Examples

TorchVision to Albumentations

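import albumentations as A
import torchvision.transforms as T

# TorchVision (Normalize expects a float tensor in [0, 1], e.g. after T.ToTensor())
tv_transforms = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(10),  # always applied; angle sampled from [-10, 10]
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])

# Albumentations equivalent (takes HWC numpy arrays)
albu_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10, p=1.0),  # Rotate defaults to p=0.5; p=1.0 matches RandomRotation
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])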

Kornia to Albumentations

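import albumentations as A
import kornia.augmentation as K

# Kornia (operates on batched float tensors of shape (B, C, H, W))
kornia_transforms = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=10),  # default p=0.5
    K.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
)

# Albumentations equivalent (takes HWC numpy arrays, one image per call)
albu_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10),  # default p=0.5, same as Kornia's RandomRotation
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])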
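
The same `mean` and `std` values work across libraries even though the inputs differ in range. TorchVision and Kornia normalize float tensors that are usually already in [0, 1], while Albumentations is typically fed uint8 images in 0..255. The single-pixel sketch below shows why the results agree. It assumes that `A.Normalize` uses its default `max_pixel_value` of 255 and scales `mean` and `std` by that value; both function names are illustrative only.

```python
import math

MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)
MAX_PIXEL_VALUE = 255.0  # assumption: default max_pixel_value of A.Normalize


def normalize_uint8_pixel(pixel):
    """Albumentations-style: normalize 0..255 values directly."""
    return tuple(
        (value - mean * MAX_PIXEL_VALUE) / (std * MAX_PIXEL_VALUE)
        for value, mean, std in zip(pixel, MEAN, STD)
    )


def normalize_unit_pixel(pixel):
    """TorchVision/Kornia-style: input is already scaled to [0, 1]."""
    return tuple((value - mean) / std for value, mean, std in zip(pixel, MEAN, STD))


rgb = (128, 64, 255)
from_uint8 = normalize_uint8_pixel(rgb)
from_unit = normalize_unit_pixel(tuple(v / MAX_PIXEL_VALUE for v in rgb))

# Both routes give the same normalized values, up to floating-point rounding.
assert all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(from_uint8, from_unit))
print(from_uint8)
```

If the images passed to Albumentations are already floats in [0, 1], the scaling factor needs to be set to 1.0 instead, otherwise the output will be compressed by a factor of 255.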

Additional Resources