Transform Library Comparison Guide

This guide helps you find equivalent transforms between Albumentations and other popular libraries (TorchVision and Kornia).

Key Differences

Compared to TorchVision

  • Albumentations operates on NumPy arrays (TorchVision operates on PIL images and PyTorch tensors)
  • More parameters for fine-tuning transformations
  • Built-in support for mask augmentation
  • Better handling of bounding boxes and keypoints

Compared to Kornia

  • CPU-based NumPy operations (Kornia operates on PyTorch tensors, which can run on GPU)
  • More comprehensive support for detection/segmentation
  • Generally better CPU performance
  • Simpler API for common tasks

Common Transform Mappings

Basic Geometric Transforms

TorchVision Transform | Albumentations Equivalent | Notes
Resize Resize / LongestMaxSize - TorchVision's Resize combines two Albumentations behaviors:
  1. When given (h,w): equivalent to Albumentations Resize
  2. When given single int + max_size: similar to LongestMaxSize
- Albumentations allows separate interpolation method for masks
- TorchVision has antialias parameter, Albumentations doesn't
ScaleJitter OneOf + multiple Resize - Can be approximated in Albumentations using OneOf container with multiple Resize transforms
- Example:
  transforms = A.OneOf([
    A.Resize(height=int(target_h * scale), width=int(target_w * scale))
    for scale in np.linspace(0.1, 2.0, num=20)
  ])
- Not exactly the same as continuous random scaling, but provides similar functionality
RandomShortestSize OneOf + SmallestMaxSize - Can be approximated in Albumentations using:
  transforms = A.OneOf([
    A.SmallestMaxSize(max_size=size)  # max_size sets the target length of the shortest side
    for size in [480, 512, 544, 576, 608]
  ])
- Randomly selects size for shortest side while maintaining aspect ratio
- TorchVision's optional max_size parameter limits the longest side; this approximation does not enforce that limit
- TorchVision has antialias parameter, Albumentations doesn't
RandomResize OneOf + Resize - TorchVision: randomly selects single size S between min_size and max_size, sets both width and height to S
- No direct equivalent in Albumentations (RandomScale preserves aspect ratio)
- Can be approximated using:
  transforms = A.OneOf([
    A.Resize(size, size)
    for size in range(min_size, max_size + 1, step)
  ])
RandomCrop RandomCrop - Both perform random cropping with similar core functionality
- Key differences:
  1. TorchVision accepts single int for square crop, Albumentations requires both height and width
  2. Padding options differ:
    - TorchVision: supports padding parameter for pre-padding
    - Albumentations: offers pad_position parameter ('center', 'top_left', etc.)
  3. Fill value handling:
    - TorchVision: supports dict mapping for different types
    - Albumentations: separate fill and fill_mask parameters
  4. Padding modes:
    - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'
    - Albumentations: uses OpenCV border modes
RandomResizedCrop RandomResizedCrop - Nearly identical functionality and parameters
- Key differences:
  1. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple
  2. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333))
  3. Albumentations adds:
    - Separate mask_interpolation parameter
    - Probability parameter p
RandomIoUCrop RandomSizedBBoxSafeCrop - Both ensure safe cropping with respect to bounding boxes
- Key differences:
  1. TorchVision:
    - Implements exact SSD paper approach
    - Uses IoU-based sampling strategy
    - Requires explicit sanitization of boxes after crop
  2. Albumentations:
    - Simpler approach ensuring bbox safety
    - Directly specifies target size
    - Automatically handles bbox cleanup
- For exact SSD-style cropping, might need custom implementation in Albumentations
CenterCrop CenterCrop - Both crop the center part of the input
- Key differences:
  1. Size specification:
    - TorchVision: accepts single int for square crop or (height, width) tuple
    - Albumentations: requires separate height and width parameters
  2. Padding behavior:
    - TorchVision: always pads with 0 if image is smaller
    - Albumentations: optional padding with pad_if_needed
  3. Albumentations adds:
    - Configurable padding mode and position
    - Separate fill values for image and mask
    - Probability parameter p
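The crop window that both CenterCrop transforms share can be sketched in plain Python. This is only an illustration: `center_crop_box` is a hypothetical helper, not part of either library, it assumes floor division for odd margins (the libraries may differ by one pixel), and it does not cover the padding case.

```python
def center_crop_box(image_h, image_w, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centered crop window."""
    if crop_h > image_h or crop_w > image_w:
        # Both libraries would need padding here; this sketch does not cover it.
        raise ValueError("crop is larger than the image; padding is required")
    top = (image_h - crop_h) // 2
    left = (image_w - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# TorchVision accepts a single int for a square crop;
# Albumentations needs height and width spelled out, so pass the value twice.
size = 224
print(center_crop_box(480, 640, size, size))
```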
RandomHorizontalFlip HorizontalFlip - Identical functionality
- Both have default probability p=0.5
- Only naming difference: TorchVision includes "Random" in name
RandomVerticalFlip VerticalFlip - Identical functionality
- Both have default probability p=0.5
- Only naming difference: TorchVision includes "Random" in name
Pad Pad - Similar core padding functionality
- Both support:
  - Single int for all sides
  - (pad_x, pad_y) for symmetric padding
  - (left, top, right, bottom) for per-side padding
- Key differences:
  1. Padding modes:
    - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'
    - Albumentations: uses OpenCV border modes
  2. Fill value handling:
    - TorchVision: supports dict mapping for different types
    - Albumentations: separate fill and fill_mask parameters
  3. Albumentations adds:
    - Probability parameter p
RandomZoomOut RandomScale + PadIfNeeded - No direct equivalent in Albumentations
- Can be approximated by combining:
  A.Compose([
    A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), # scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0)
    A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, fill=fill)
  ])
- Key differences:
  1. TorchVision implements specific SSD paper approach
  2. Albumentations requires composition of two transforms
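The mapping between TorchVision's side_range and the scale_limit used in the RandomZoomOut approximation is a shift by one. A minimal sketch, assuming scale_limit is interpreted as an offset from 1.0; `side_range_to_scale_limit` is a hypothetical helper name:

```python
def side_range_to_scale_limit(side_range):
    """Convert a TorchVision-style side_range to a RandomScale-style scale_limit."""
    low, high = side_range
    if low < 1.0 or high < low:
        raise ValueError("side_range must satisfy 1.0 <= low <= high")
    # scale_limit is an offset from 1.0, so subtract 1 from each bound.
    return (low - 1.0, high - 1.0)


print(side_range_to_scale_limit((1.0, 4.0)))
```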
RandomRotation Rotate - Similar core rotation functionality but with different parameters
- Key differences:
  1. Angle specification:
    - TorchVision: degrees parameter (-degrees, +degrees) or (min, max)
    - Albumentations: limit parameter (-limit, +limit) or (min, max)
  2. Output size control:
    - TorchVision: expand=True/False
    - Albumentations: crop_border=True/False
  3. Additional Albumentations features:
    - Separate mask interpolation
    - Bbox rotation methods ('largest_box' or 'ellipse')
    - More border modes
    - Probability parameter p
  4. Center specification:
    - TorchVision: supports custom center point
    - Albumentations: always uses image center
RandomAffine Affine - Both support core affine operations (translation, rotation, scale, shear)
- Key differences:
  1. Parameter specification:
    - TorchVision: single parameters for each transform
    - Albumentations: more flexible with dict options for x/y axes
  2. Scale handling:
    - Albumentations adds keep_ratio and balanced_scale
    - Albumentations supports independent x/y scaling
  3. Translation:
    - TorchVision: fraction only
    - Albumentations: both percent and pixels
  4. Additional Albumentations features:
    - fit_output to adjust image plane
    - Separate mask interpolation
    - More border modes
    - Bbox rotation methods
    - Probability parameter p
RandomPerspective Perspective - Both apply random perspective transformations
- Key differences:
  1. Distortion control:
    - TorchVision: single distortion_scale (0 to 1)
    - Albumentations: scale tuple for corner movement range
  2. Output handling:
    - Albumentations adds keep_size and fit_output options
    - Can control whether to maintain original size
  3. Additional Albumentations features:
    - Separate mask interpolation
    - More border modes
    - Better control over output size and fitting
ElasticTransform ElasticTransform - Similar core functionality: both apply elastic deformations to images
- Key differences:
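  1. Parameters share names and roles but are used at very different scales:
    - TorchVision: alpha (displacement magnitude), sigma (smoothness)
    - Albumentations: alpha (scaling factor for the displacement field), sigma (Gaussian smoothing of the displacement field)
  2. Default values differ accordingly, so they cannot be copied across libraries:
    - TorchVision: alpha=50.0, sigma=5.0
    - Albumentations: alpha=1.0, sigma=50.0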
- Note on implementation:
  - Albumentations follows Simard et al. 2003 paper more closely:
    - σ should be ~0.05 * image_size
    - α should be proportional to σ
- Additional Albumentations features:
  - approximate mode
  - same_dxdy option
  - Choice of noise distribution
  - Separate mask interpolation
ColorJitter ColorJitter - Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue
- Key similarities:
  1. Same parameter names and meanings
  2. Same value ranges (e.g., hue should be in [-0.5, 0.5])
  3. Random order of transformations
- Key differences:
  1. Default values:
    - TorchVision: all None by default
    - Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation
  2. Implementation:
    - TorchVision: uses Pillow
    - Albumentations: uses OpenCV (may produce slightly different results)
  3. Additional in Albumentations:
    - Explicit probability parameter p
    - Value saturation instead of uint8 overflow
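Because the ColorJitter defaults differ, one option when porting a pipeline is to pass explicit ranges instead of relying on them. The sketch below expands a single float into a (min, max) range following the convention TorchVision documents for brightness, contrast, and saturation, [max(0, 1 - x), 1 + x]. The helper names `jitter_range` and `hue_range` are hypothetical.

```python
def jitter_range(amount):
    """Expand a single non-negative float into an explicit (min, max) factor range."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    # A factor of 1.0 leaves the image unchanged; factors cannot go below 0.
    return (max(0.0, 1.0 - amount), 1.0 + amount)


def hue_range(amount):
    """Expand a single float into a symmetric hue shift range within [-0.5, 0.5]."""
    if not 0 <= amount <= 0.5:
        raise ValueError("hue amount must be in [0, 0.5]")
    return (-amount, amount)


# Explicit settings that could be passed to either library's ColorJitter.
settings = {
    "brightness": jitter_range(0.2),
    "contrast": jitter_range(0.2),
    "saturation": jitter_range(0.2),
    "hue": hue_range(0.05),
}
print(settings)
```

With an amount of 0.2 the resulting brightness, contrast, and saturation ranges correspond to the Albumentations defaults listed above.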
RandomChannelPermutation ChannelShuffle - Both randomly permute image channels
- Key similarities:
  1. Same core functionality
  2. Work on multi-channel images (typically RGB)
- Key differences:
  1. Naming convention only
  2. Albumentations adds:
    - Probability parameter p
RandomPhotometricDistort RandomOrder + ColorJitter + ChannelShuffle - TorchVision's transform is from SSD paper, combines:
  1. Color jittering (brightness, contrast, saturation, hue)
  2. Random channel permutation
- Can be replicated in Albumentations using:
  A.RandomOrder([
    A.ColorJitter(brightness=(0.875, 1.125),
                contrast=(0.5, 1.5),
                saturation=(0.5, 1.5),
                hue=(-0.05, 0.05),
                p=0.5),
    A.ChannelShuffle(p=0.5)
  ])
Grayscale ToGray - Similar core functionality: convert RGB to grayscale
- Key differences:
  1. Output channels:
    - TorchVision: only 1 or 3 channels
    - Albumentations: supports any number of output channels
  2. Conversion methods:
    - TorchVision: single method (weighted RGB)
    - Albumentations: multiple methods via method parameter:
      • weighted_average (default, same as TorchVision)
      • from_lab, desaturation, average, max, pca
  3. Additional in Albumentations:
    - Probability parameter p
    - More flexible channel handling
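To make the difference between some of the conversion methods concrete, here is a per-pixel sketch in plain Python. It assumes that "weighted RGB" means the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114); `to_gray` is a hypothetical helper and covers only three of the methods listed above.

```python
def to_gray(pixel, method="weighted_average"):
    """Convert one (r, g, b) pixel to a single gray value."""
    r, g, b = pixel
    if method == "weighted_average":
        # Assumed BT.601 luma weights; green contributes the most.
        return 0.299 * r + 0.587 * g + 0.114 * b
    if method == "average":
        return (r + g + b) / 3
    if method == "max":
        return max(r, g, b)
    raise ValueError(f"unsupported method: {method}")


pixel = (200, 100, 50)
for method in ("weighted_average", "average", "max"):
    print(method, to_gray(pixel, method))
```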
RGB ToRGB - Similar core functionality: convert to RGB format
- Key differences:
  1. Input handling:
    - TorchVision: accepts 1 or 3 channel inputs
    - Albumentations: only accepts single-channel inputs
  2. Output channels:
    - TorchVision: always 3 channels
    - Albumentations: configurable via num_output_channels
  3. Behavior:
    - TorchVision: converts to RGB if not already RGB
    - Albumentations: strictly grayscale to RGB conversion
  4. Additional in Albumentations:
    - Probability parameter p
RandomGrayscale ToGray - Similar core functionality: convert to grayscale with probability
- Key differences:
  1. Default probability:
    - TorchVision: p=0.1
    - Albumentations: p=0.5
  2. Output handling:
    - TorchVision: always preserves input channels
    - Albumentations: configurable output channels
  3. Conversion methods:
    - TorchVision: single method
    - Albumentations: multiple methods with different channel support:
      • weighted_average, from_lab: 3-channel only
      • desaturation, average, max, pca: any number of channels
  4. Channel requirements:
    - TorchVision: works with 1 or 3 channels
    - Albumentations: depends on method chosen
GaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random kernel size
- Key similarities:
  1. Both support random kernel sizes
  2. Both support random sigma values
- Key differences:
  1. Parameter specification:
    - TorchVision: kernel_size (exact size), sigma (range)
    - Albumentations: blur_limit (size range), sigma_limit (range)
  2. Kernel size constraints:
    - TorchVision: must specify exact size
    - Albumentations: can specify range (3, 7) or auto-compute
  3. Additional in Albumentations:
    - Probability parameter p
    - Auto-computation of kernel size from sigma
GaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images
- Key similarities:
  1. Both support mean and standard deviation parameters
- Key differences:
  1. Parameter ranges:
    - TorchVision: fixed values for mean and sigma
    - Albumentations: ranges for both (std_range, mean_range)
  2. Value handling:
    - TorchVision: expects float [0,1], has clip option
    - Albumentations: auto-scales based on dtype
  3. Additional in Albumentations:
    - Per-channel noise option
    - Noise scale factor for performance
    - Probability parameter p
RandomInvert InvertImg - Similar core functionality: invert image colors
- Key similarities:
  1. Both invert pixel values
  2. Both have default probability of 0.5
- Key differences:
  1. Value handling:
    - TorchVision: works with [0,1] float tensors
    - Albumentations: auto-handles uint8 (255) and float32 (1.0)
RandomPosterize Posterize - Similar core functionality: reduce color bits
- Key similarities:
  1. Both posterize images with probability p=0.5
- Key differences:
  1. Bits specification:
    - TorchVision: single fixed value [0-8]
    - Albumentations: flexible options with [1-7] (recommended):
      • Single value for all channels
      • Range (min_bits, max_bits)
      • Per-channel values [r,g,b]
      • Per-channel ranges [(r_min,r_max), ...]
  2. Practical range:
    - TorchVision: includes 0 (black) and 8 (unchanged)
    - Albumentations: recommended [1-7] for actual posterization
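The reason 0 and 8 are edge cases for posterization is easiest to see on a single 8-bit value. A minimal sketch of the usual bit-masking approach; `posterize_value` is a hypothetical helper and is not taken from either library:

```python
def posterize_value(value, bits):
    """Keep only the `bits` most significant bits of an 8-bit value."""
    if not 0 <= bits <= 8:
        raise ValueError("bits must be in [0, 8]")
    if not 0 <= value <= 255:
        raise ValueError("value must be an 8-bit integer")
    # Build a mask with `bits` leading ones, e.g. bits=4 -> 0b11110000.
    mask = (0xFF << (8 - bits)) & 0xFF
    return value & mask


for bits in (0, 4, 8):
    print(bits, posterize_value(200, bits))
```

With 0 bits every value collapses to black and with 8 bits nothing changes, which is why the range 1 to 7 is the one that actually posterizes.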
RandomSolarize Solarize - Similar core functionality: invert pixels above threshold
- Key similarities:
  1. Both have default probability p=0.5
  2. Both invert values above threshold
- Key differences:
  1. Threshold specification:
    - TorchVision: single fixed threshold value
    - Albumentations: range via threshold_range
  2. Value handling:
    - TorchVision: works with raw threshold values
    - Albumentations: uses normalized [0,1] range:
      • uint8: multiplied by 255
      • float32: multiplied by 1.0
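Converting a raw TorchVision-style threshold into the normalized form described above is a division by the dtype's maximum value. A sketch under that assumption; the helper names and the `MAX_VALUE` table are hypothetical, and the comparison uses "greater than or equal", which may differ from a given library at the exact boundary:

```python
MAX_VALUE = {"uint8": 255, "float32": 1.0}


def to_normalized_threshold(raw_threshold, dtype):
    """Map a raw threshold (e.g. 128 for uint8) into the [0, 1] range."""
    return raw_threshold / MAX_VALUE[dtype]


def solarize_value(value, normalized_threshold, dtype):
    """Invert a single pixel value if it reaches the threshold."""
    max_value = MAX_VALUE[dtype]
    threshold = normalized_threshold * max_value
    return max_value - value if value >= threshold else value


t = to_normalized_threshold(128, "uint8")
# A fixed threshold can be expressed as a degenerate range (t, t).
threshold_range = (t, t)
print(threshold_range, solarize_value(200, t, "uint8"), solarize_value(100, t, "uint8"))
```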
RandomAdjustSharpness Sharpen - Similar core functionality: adjust image sharpness
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - TorchVision: single sharpness_factor
      • 0: blurred
      • 1: original
      • 2: doubled sharpness
    - Albumentations: more controls:
      • alpha: effect visibility [0,1]
      • lightness: contrast control
  2. Method options:
    - TorchVision: single method
    - Albumentations: two methods:
      • 'kernel': Laplacian operator
      • 'gaussian': blur interpolation
RandomAutocontrast AutoContrast - Same core functionality with identical parameters (p=0.5)
RandomEqualize Equalize - Similar core functionality: histogram equalization
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Additional Albumentations features:
    - Choice of algorithm (cv/pil methods)
    - Per-channel or luminance-based equalization
    - Optional masking support
Normalize Normalize - Similar core functionality: normalize image values
- Key similarities:
  1. Both support mean/std normalization
- Key differences:
  1. Normalization options:
    - TorchVision: only (input - mean) / std
    - Albumentations: multiple methods:
      • standard (same as TorchVision)
      • image (global stats)
      • image_per_channel
      • min_max
      • min_max_per_channel
  2. Additional in Albumentations:
    - max_pixel_value parameter
    - Probability parameter p
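The role of max_pixel_value is easiest to see by comparing the two formulas on one pixel. The sketch below assumes the Albumentations "standard" mode computes (value - mean * max_pixel_value) / (std * max_pixel_value), while TorchVision applies (input - mean) / std to values already scaled to [0, 1]. Both function names are hypothetical.

```python
def normalize_torchvision(value01, mean, std):
    """TorchVision-style normalization of a value already in [0, 1]."""
    return (value01 - mean) / std


def normalize_albumentations(value, mean, std, max_pixel_value=255.0):
    """Assumed 'standard' mode: mean and std are rescaled by max_pixel_value."""
    return (value - mean * max_pixel_value) / (std * max_pixel_value)


pixel = 128  # raw uint8 value
mean, std = 0.485, 0.229
a = normalize_torchvision(pixel / 255.0, mean, std)
b = normalize_albumentations(pixel, mean, std)
print(a, b)  # the two results agree up to floating-point error
```

Under that assumption the same mean and std values can be reused across libraries as long as the TorchVision input has been scaled to [0, 1] first.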
RandomErasing Erasing - Similar core functionality: randomly erase image regions
- Key similarities:
  1. Both have default probability p=0.5
  2. Same default scale=(0.02, 0.33)
  3. Same default ratio=(0.3, 3.3)
- Key differences:
  1. Fill value options:
    - TorchVision: number/tuple or 'random'
    - Albumentations: additional options:
      • random_uniform
      • inpaint_telea
      • inpaint_ns
  2. Additional in Albumentations:
    - Mask fill value option
    - Support for masks, bboxes, keypoints
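To get a feel for what the shared scale and ratio defaults mean in pixels, the erased box can be computed for fixed values. This is a sketch, assuming scale is the fraction of image area to erase and ratio is height divided by width; `erase_box_size` is a hypothetical helper, and a real transform samples both values randomly and retries when the box does not fit.

```python
import math


def erase_box_size(image_h, image_w, scale, ratio):
    """Return (box_h, box_w) for a given area fraction and aspect ratio."""
    if scale <= 0 or ratio <= 0:
        raise ValueError("scale and ratio must be positive")
    area = image_h * image_w * scale
    box_h = round(math.sqrt(area * ratio))
    box_w = round(math.sqrt(area / ratio))
    # The box may exceed the image for extreme combinations; callers should check.
    return box_h, box_w


for scale, ratio in [(0.02, 0.3), (0.02, 3.3), (0.33, 1.0)]:
    print(scale, ratio, erase_box_size(224, 224, scale, ratio))
```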
JPEG ImageCompression - Similar core functionality: apply JPEG compression
- Key similarities:
  1. Both use quality range 1-100
  2. Both support quality ranges
- Key differences:
  1. Compression types:
    - TorchVision: JPEG only
    - Albumentations: JPEG and WebP
  2. Additional in Albumentations:
    - Probability parameter p
    - Default quality range (99, 100)

Kornia to Albumentations

Kornia | Albumentations | Notes
ColorJitter ColorJitter - Similar core functionality: randomly adjust brightness, contrast, saturation, and hue
- Key similarities:
  1. Both support same parameters (brightness, contrast, saturation, hue)
  2. Both allow float or tuple ranges for parameters
- Key differences:
  1. Default values:
    - Albumentations: (0.8, 1.2) for brightness/contrast/saturation
    - Kornia: 0.0 for all parameters
  2. Default probability:
    - Albumentations: p=0.5
    - Kornia: p=1.0
  3. Note: Kornia recommends using ColorJiggle instead as it follows color theory better
RandomAutoContrast AutoContrast - Similar core functionality: enhance image contrast automatically
- Key similarities:
  1. Both stretch intensity range to use full range
  2. Both preserve relative intensities
- Key differences:
  1. Default probability:
    - Albumentations: p=0.5
    - Kornia: p=1.0
  2. Additional in Kornia:
    - clip_output parameter to control value clipping
RandomBoxBlur Blur - Similar core functionality: apply box/average blur to images
- Key similarities:
  1. Both have default probability p=0.5
  2. Both apply box/average blur filter
- Key differences:
  1. Kernel size specification:
    - Albumentations: blur_limit parameter for range (e.g., (3, 7))
    - Kornia: fixed kernel_size tuple (default (3, 3))
  2. Additional in Kornia:
    - border_type parameter ('reflect', 'replicate', 'circular')
    - normalized parameter for L1 norm control
RandomBrightness RandomBrightnessContrast - Different scope:
  - Kornia: brightness only
  - Albumentations: combines brightness and contrast
- Key differences:
  1. Parameter specification:
    - Kornia: brightness tuple (default: (1.0, 1.0))
    - Albumentations: brightness_limit (default: (-0.2, 0.2))
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - brightness_by_max parameter for adjustment method
    - ensure_safe_range to prevent overflow/underflow
    - Combined contrast control
  4. Additional in Kornia:
    - clip_output parameter
RandomChannelDropout ChannelDropout - Similar core functionality: randomly drop image channels
- Key similarities:
  1. Both have default probability p=0.5
  2. Both allow specifying fill value for dropped channels
- Key differences:
  1. Channel drop specification:
    - Kornia: fixed num_drop_channels (default: 1)
    - Albumentations: flexible channel_drop_range tuple (default: (1, 1))
  2. Error handling:
    - Albumentations: explicit checks for single-channel images and invalid ranges
    - Kornia: simpler parameter validation
RandomChannelShuffle ChannelShuffle - Identical core functionality: randomly shuffle image channels
RandomClahe CLAHE - Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization
- Key similarities:
  1. Both have default probability p=0.5
  2. Both allow configuring grid size and clip limit
- Key differences:
  1. Parameter defaults:
    - Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8)
    - Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8)
  2. Additional in Kornia:
    - slow_and_differentiable parameter for implementation choice
RandomContrast RandomBrightnessContrast - Different scope:
  - Kornia: contrast only
  - Albumentations: combines brightness and contrast
- Key differences:
  1. Parameter specification:
    - Kornia: contrast tuple (default: (1.0, 1.0))
    - Albumentations: contrast_limit (default: (-0.2, 0.2))
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - ensure_safe_range to prevent overflow/underflow
    - Combined brightness control
  4. Additional in Kornia:
    - clip_output parameter
RandomEqualize Equalize - Similar core functionality: apply histogram equalization
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Additional in Albumentations:
    - mode parameter to choose between 'cv' and 'pil' methods
    - by_channels parameter for per-channel or luminance-based equalization
    - mask parameter to selectively apply equalization
    - mask_params for dynamic mask generation
RandomGamma RandomGamma - Similar core functionality: apply random gamma correction
- Key differences:
  1. Parameter specification:
    - Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples
    - Albumentations: single gamma_limit (80, 120) as percentage range
  2. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
  3. Additional in Albumentations:
    - eps parameter to prevent numerical errors
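Translating a Kornia gamma range into gamma_limit is a change of units. A minimal sketch, assuming the percentage range is divided by 100 to obtain the gamma exponent; `gamma_to_gamma_limit` is a hypothetical helper, and Kornia's gain parameter has no counterpart in this mapping.

```python
def gamma_to_gamma_limit(gamma_range):
    """Convert a (low, high) gamma exponent range into a percentage range."""
    low, high = gamma_range
    if low <= 0 or high < low:
        raise ValueError("gamma_range must satisfy 0 < low <= high")
    # Round to hide floating-point noise such as 110.00000000000001.
    return (round(low * 100, 6), round(high * 100, 6))


print(gamma_to_gamma_limit((0.8, 1.2)))
```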
RandomGaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random parameters
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support kernel size and sigma parameters
- Key differences:
  1. Parameter specification:
    - Kornia: requires explicit kernel_size and sigma range
    - Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0)
  2. Additional in Kornia:
    - border_type parameter for padding mode
    - separable parameter for 1D convolution optimization
RandomGaussianIllumination Illumination - Similar core functionality: apply illumination effects
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support controlling effect intensity and position
- Key differences:
  1. Scope:
    - Kornia: Gaussian illumination patterns only
    - Albumentations: Multiple modes (linear, corner, gaussian)
  2. Parameter ranges:
    - Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0)
    - Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0)
  3. Additional in Albumentations:
    - mode parameter for different effect types
    - effect_type for brighten/darken control
    - angle_range for linear gradients
  4. Additional in Kornia:
    - sign parameter for effect direction
RandomGaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: fixed mean (default: 0.0) and std (default: 1.0)
    - Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0)
  2. Additional in Albumentations:
    - per_channel parameter for independent channel noise
    - noise_scale_factor for performance optimization
    - Automatic value scaling based on image dtype
RandomGrayscale ToGray - Similar core functionality: convert images to grayscale
- Key differences:
  1. Default probability:
    - Kornia: p=0.1
    - Albumentations: p=0.5
  2. Conversion options:
    - Kornia: customizable rgb_weights for channel mixing
    - Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca)
  3. Output control:
    - Kornia: always 3-channel output
    - Albumentations: configurable num_output_channels
RandomHue ColorJitter (hue parameter) - Similar core functionality: adjust image hue
- Key differences:
  1. Scope:
    - Kornia: hue-only transform
    - Albumentations: part of ColorJitter with brightness, contrast, and saturation
  2. Default values:
    - Kornia: hue=(0.0, 0.0), p=1.0
    - Albumentations: hue=(-0.5, 0.5), p=0.5
RandomInvert InvertImg - Similar core functionality: invert image values
- Key differences:
  1. Maximum value handling:
    - Kornia: configurable via max_val parameter (default: 1.0)
    - Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32)
RandomJPEG ImageCompression - Similar core functionality: apply image compression
- Key differences:
  1. Compression options:
    - Kornia: JPEG only
    - Albumentations: supports both JPEG and WebP
  2. Quality specification:
    - Kornia: jpeg_quality (default: 50.0)
    - Albumentations: quality_range (default: (99, 100))
  3. Default probability:
    - Kornia: p=1.0
    - Albumentations: p=0.5
RandomLinearCornerIllumination Illumination (corner mode) - Similar core functionality: apply corner illumination effects
- Key differences:
  1. Scope:
    - Kornia: corner illumination only
    - Albumentations: part of general Illumination transform with multiple modes
  2. Parameter specification:
    - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)
    - Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both)
  3. Additional in Albumentations:
    - Multiple illumination modes (linear, corner, gaussian)
    - More control over effect parameters
RandomLinearIllumination Illumination (linear mode) - Similar core functionality: apply linear illumination effects
- Key differences:
  1. Scope:
    - Kornia: linear illumination only
    - Albumentations: part of general Illumination transform with multiple modes
  2. Parameter specification:
    - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)
    - Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360)
  3. Additional in Albumentations:
    - Multiple illumination modes (linear, corner, gaussian)
    - Explicit angle control for gradient direction
RandomMedianBlur MedianBlur - Similar core functionality: apply median blur filter
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Kernel size specification:
    - Kornia: fixed kernel_size tuple (default: (3, 3))
    - Albumentations: range via blur_limit (default: (3, 7))
  2. Kernel constraints:
    - Albumentations: enforces odd kernel sizes
RandomMotionBlur MotionBlur - Similar core functionality: apply directional motion blur
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support angle and direction control
- Key differences:
  1. Kernel size specification:
    - Kornia: kernel_size as int or tuple
    - Albumentations: blur_limit (default: (3, 7))
  2. Angle control:
    - Kornia: angle parameter with symmetric range (-angle, angle)
    - Albumentations: angle_range (default: (0, 360))
  3. Additional in Albumentations:
    - allow_shifted parameter for kernel position control
RandomPlanckianJitter PlanckianJitter - Similar core functionality: apply physics-based color temperature variations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support 'blackbody' and 'cied' modes
- Key differences:
  1. Temperature control:
    - Kornia: select_from parameter for discrete jitter selection
    - Albumentations: temperature_limit for continuous range
  2. Additional in Albumentations:
    - sampling_method parameter ('uniform' or 'gaussian')
    - More detailed control over temperature ranges
    - Better documentation of physics-based effects
RandomPlasmaBrightness PlasmaBrightnessContrast - Similar core functionality: apply fractal-based brightness adjustments
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0)
    - Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0)
  2. Additional in Albumentations:
    - Combined brightness and contrast adjustment
    - plasma_size parameter for pattern detail control
    - More detailed mathematical formulation and documentation
RandomPlasmaContrast PlasmaBrightnessContrast - Similar core functionality: apply fractal-based contrast adjustments
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7) only
    - Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256)
  2. Scope:
    - Kornia: contrast-only adjustment
    - Albumentations: combined brightness and contrast adjustment
  3. Additional in Albumentations:
    - More detailed mathematical formulation
    - Pattern size control via plasma_size
RandomPlasmaShadow PlasmaShadow - Similar core functionality: apply fractal-based shadow effects
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Diamond-Square algorithm for pattern generation
- Key differences:
  1. Parameter specification:
    - Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0)
    - Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0)
  2. Additional in Albumentations:
    - Pattern size control via plasma_size
    - More intuitive intensity range (0 to 1)
    - More detailed mathematical formulation and documentation
RandomPosterize Posterize - Similar core functionality: reduce color bits in image
- Key similarities:
  1. Both have default probability p=0.5
  2. Both operate on color bit reduction
- Key differences:
  1. Bit specification:
    - Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple
    - Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats:
      * Single int for all channels
      * Tuple for random range
      * List for per-channel specification
      * List of tuples for per-channel ranges
  2. Additional in Albumentations:
    - More flexible channel-wise control
    - More detailed documentation and mathematical background
RandomRain RandomRain - Similar core functionality: add rain effects to images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Rain parameter specification:
    - Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5)
    - Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1)
  2. Additional in Albumentations:
    - drop_color customization
    - blur_value for atmospheric effect
    - brightness_coefficient for lighting adjustment
    - rain_type presets (drizzle, heavy, torrential)
  3. Approach:
    - Kornia: Direct drop placement
    - Albumentations: More realistic simulation with slant, blur, and brightness effects
RandomRGBShift AdditiveNoise - Similar core functionality: add noise/shifts to image channels
- Key similarities:
  1. Both have default probability p=0.5
  2. Both can affect individual channels
- Key differences:
  1. Approach:
    - Kornia: Simple RGB channel shifts with individual limits
    - Albumentations: More sophisticated noise generation with multiple distributions
  2. Parameter specification:
    - Kornia: r_shift_limit, g_shift_limit, b_shift_limit (all default: 0.5)
    - Albumentations: Flexible noise configuration with:
      * Multiple noise types (uniform, gaussian, laplace, beta)
      * Different spatial modes (constant, per_pixel, shared)
      * Customizable distribution parameters
  3. Additional in Albumentations:
    - Performance optimization options
    - More detailed control over noise distribution
    - Spatial application modes
RandomSaltAndPepperNoise SaltAndPepper - Similar core functionality: apply salt and pepper noise to images
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use same default parameters:
    - amount (0.01, 0.06)
    - salt_vs_pepper (0.4, 0.6)
- Key differences:
  1. Parameter flexibility:
    - Kornia: Supports single float or tuple for parameters
    - Albumentations: Requires tuples for ranges
  2. Documentation:
    - Albumentations provides:
      * Detailed mathematical formulation
      * Clear examples for different noise levels
      * Implementation notes and edge cases
      * References to academic sources
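To make the `amount` and `salt_vs_pepper` parameters concrete, here is a minimal standard-library sketch of salt-and-pepper noise on a grayscale image stored as nested lists. It is not either library's implementation: the function name `salt_and_pepper`, the rounding choices, and the use of distinct pixel positions are assumptions made for illustration.

```python
import random


def salt_and_pepper(image, amount, salt_vs_pepper, max_value=255, seed=None):
    """Return a noisy copy of a 2D grayscale image (list of rows).

    amount: fraction of pixels to corrupt.
    salt_vs_pepper: fraction of corrupted pixels set to max_value (salt);
        the remaining corrupted pixels are set to 0 (pepper).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    noisy = [row[:] for row in image]

    # Pick distinct pixel positions so exactly num_noisy pixels change.
    num_noisy = round(amount * height * width)
    positions = rng.sample(range(height * width), num_noisy)

    num_salt = round(num_noisy * salt_vs_pepper)
    for i, pos in enumerate(positions):
        y, x = divmod(pos, width)
        noisy[y][x] = max_value if i < num_salt else 0
    return noisy


# Sample both parameters from the default ranges listed above.
rng = random.Random(0)
amount = rng.uniform(0.01, 0.06)
salt_vs_pepper = rng.uniform(0.4, 0.6)

gray = [[128] * 100 for _ in range(100)]
noisy = salt_and_pepper(gray, amount, salt_vs_pepper, seed=0)
changed = sum(pixel != 128 for row in noisy for pixel in row)
print(f"amount={amount:.3f}, corrupted pixels={changed}")
```

Passing a tuple means a new value is drawn from the range on every call; a single float (which the notes above say Kornia accepts) behaves like a range whose two ends are equal.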
RandomSaturation ColorJitter - Different scope and functionality
- Key differences:
  1. Scope:
    - Kornia: Saturation-only adjustment
    - Albumentations: Combined brightness, contrast, saturation, and hue adjustment
  2. Default parameters:
    - Kornia: saturation (1.0, 1.0), p=1.0
    - Albumentations: saturation (0.8, 1.2), p=0.5
  3. Implementation:
    - Kornia: Aligns with PIL/TorchVision implementation
    - Albumentations: Uses OpenCV with noted differences in HSV conversion
  4. Additional in Albumentations:
    - Brightness adjustment
    - Contrast adjustment
    - Hue adjustment
    - Random order of transformations
RandomSharpness Sharpen - Similar core functionality: sharpen images
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: Single sharpness parameter (default: 0.5)
    - Albumentations: More detailed control with:
      * alpha (0.2, 0.5) for effect visibility
      * lightness (0.5, 1.0) for contrast
      * method choice ('kernel' or 'gaussian')
      * kernel_size and sigma for gaussian method
  2. Implementation methods:
    - Kornia: Single approach
    - Albumentations: Two methods:
      * Kernel-based using Laplacian operator
      * Gaussian interpolation
  3. Documentation:
    - Albumentations provides detailed mathematical formulation and references
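The kernel-based method can be pictured as a blend between an identity kernel and a Laplacian-style sharpening kernel, with `alpha` controlling the mix and `lightness` shifting the center weight. The sketch below assumes that formulation (common in imgaug-style sharpening); the helper names `sharpen_kernel` and `convolve_pixel` are hypothetical, and the exact kernel used by Albumentations should be checked against its source.

```python
def sharpen_kernel(alpha, lightness):
    """Blend an identity kernel with a Laplacian-style sharpening kernel."""
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    effect = [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]]
    return [
        [(1 - alpha) * i + alpha * e for i, e in zip(id_row, eff_row)]
        for id_row, eff_row in zip(identity, effect)
    ]


def convolve_pixel(patch, kernel):
    """Apply a 3x3 kernel to a single 3x3 patch and return the center value."""
    return sum(
        p * k
        for p_row, k_row in zip(patch, kernel)
        for p, k in zip(p_row, k_row)
    )


kernel = sharpen_kernel(alpha=0.5, lightness=1.0)
flat_patch = [[100] * 3 for _ in range(3)]   # uniform region
edge_patch = [[100, 100, 200]] * 3           # vertical edge to the right

# With lightness=1.0 the kernel weights sum to 1, so flat regions keep their value.
print(convolve_pixel(flat_patch, kernel))
# Near an edge the response moves away from the neighbor, exaggerating contrast.
print(convolve_pixel(edge_patch, kernel))
```

With `alpha=0` the kernel reduces to the identity, which is why `alpha` reads as "effect visibility" in the parameter list above.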
RandomSnow RandomSnow - Similar core functionality: add snow effects to images
- Key differences:
  1. Parameter specification:
    - Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0
    - Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5
  2. Implementation methods:
    - Kornia: Single approach
    - Albumentations: Two methods:
      * "bleach": Simple pixel value thresholding
      * "texture": Advanced snow texture simulation
  3. Additional in Albumentations:
    - Detailed snow simulation with:
      * HSV color space manipulation
      * Gaussian noise for texture
      * Depth effect simulation
      * Sparkle effects
  4. Documentation:
    - Albumentations provides detailed mathematical formulation and implementation notes
RandomSolarize Solarize - Similar core functionality: invert pixel values above threshold
- Key similarities:
  1. Both have default probability p=0.5
- Key differences:
  1. Parameter specification:
    - Kornia: Two parameters:
      * thresholds (default: 0.1) for threshold range
      * additions (default: 0.1) for value adjustment
    - Albumentations: Single parameter:
      * threshold_range (default: (0.5, 0.5))
  2. Threshold handling:
    - Kornia: Generates from (0.5 - x, 0.5 + x) for float input
    - Albumentations: Direct range specification, scaled by image type max value
  3. Documentation:
    - Albumentations provides:
      * Detailed examples for both uint8 and float32 images
      * Clear mathematical formulation
      * Image type-specific behavior explanation
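The two threshold conventions described above can be sketched in plain Python. The function names are hypothetical, and whether the comparison is "at or above" or "strictly above" the threshold is an assumption here; the example values avoid the boundary so the difference does not matter.

```python
def kornia_style_threshold_range(thresholds):
    """Expand a single float x into the sampling range (0.5 - x, 0.5 + x)."""
    return (0.5 - thresholds, 0.5 + thresholds)


def absolute_threshold(relative_threshold, dtype_name):
    """Scale a [0, 1] threshold by the maximum value of the image type."""
    max_value = {"uint8": 255, "float32": 1.0}[dtype_name]
    return relative_threshold * max_value


def solarize(pixels, threshold, max_value):
    """Invert every pixel value that is at or above the threshold."""
    return [max_value - p if p >= threshold else p for p in pixels]


print(kornia_style_threshold_range(0.25))   # range a threshold is drawn from
print(absolute_threshold(0.5, "uint8"))     # 0.5 expressed on the uint8 scale
print(solarize([0, 100, 200, 255], absolute_threshold(0.5, "uint8"), 255))
```

The same relative threshold of 0.5 therefore maps to 127.5 for uint8 images and stays 0.5 for float32 images in the [0, 1] range.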
CenterCrop CenterCrop - Similar core functionality: crop center of image
- Key similarities:
  1. Both have default probability p=1.0
- Key differences:
  1. Size specification:
    - Kornia: Single size parameter (int or tuple)
    - Albumentations: Separate height and width parameters
  2. Additional features:
    - Kornia:
      * align_corners for interpolation
      * resample mode selection
      * cropping_mode ('slice' or 'resample')
    - Albumentations:
      * pad_if_needed for handling small images
      * border_mode for padding method
      * fill and fill_mask for padding values
      * pad_position options
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Supports images, masks, bboxes, and keypoints
PadTo PadIfNeeded - Can achieve the same core functionality
- Key similarities:
  1. Both have default probability p=1.0
  2. Can pad to exact size:
    - Kornia: size=(height, width)
    - Albumentations: min_height=height, min_width=width
- Key differences:
  1. Parameter naming:
    - Kornia: Single size tuple
    - Albumentations: Separate dimension parameters
  2. Additional features:
    - Kornia:
      * Simple pad_mode selection
      * Single pad_value
    - Albumentations:
      * Flexible position options
      * Separate fill and fill_mask
      * Optional divisibility padding
      * Multiple target support
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
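The arithmetic behind padding to a target size is the same in both libraries; what differs is where the padding goes. The sketch below computes per-side padding for two placements. The function `pad_amounts` is hypothetical and the position names `"center"` and `"top_left"` are used as assumptions about the Albumentations position options; check the `PadIfNeeded` documentation for the exact set of accepted values.

```python
def pad_amounts(height, width, target_height, target_width, position="center"):
    """Return (top, bottom, left, right) padding needed to reach the target size.

    position="center" splits the padding between both sides;
    position="top_left" keeps the image in the top-left corner and pads
    only the bottom and right edges.
    """
    # Images that already meet the target in a dimension get no padding there.
    pad_h = max(target_height - height, 0)
    pad_w = max(target_width - width, 0)
    if position == "center":
        top, left = pad_h // 2, pad_w // 2
    elif position == "top_left":
        top, left = 0, 0
    else:
        raise ValueError(f"unsupported position: {position}")
    return top, pad_h - top, left, pad_w - left


print(pad_amounts(100, 150, 128, 160))              # centered padding
print(pad_amounts(100, 150, 128, 160, "top_left"))  # pad bottom and right only
```

Because the amounts are clamped at zero, this only ever grows an image, which matches the "minimum size" wording of `min_height` and `min_width`.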
RandomAffine Affine - Similar core functionality: apply affine transformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support rotation, translation, scaling, and shear
- Key differences:
  1. Parameter specification:
    - Kornia:
      * degrees for rotation
      * translate as fraction
      * scale as tuple
      * shear in degrees
    - Albumentations:
      * More flexible parameter formats
      * Supports both percent and pixel translation
      * Dictionary format for independent axis control
  2. Additional features in Albumentations:
    * fit_output for automatic size adjustment
    * keep_ratio for aspect ratio preservation
    * rotate_method options
    * balanced_scale for even scale distribution
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomCrop RandomCrop - Similar core functionality: randomly crop image patches
- Key similarities:
  1. Both have default probability p=1.0
  2. Both support padding if needed
- Key differences:
  1. Size specification:
    - Kornia: Single size tuple (height, width)
    - Albumentations: Separate height and width parameters
  2. Padding options:
    - Kornia:
      * Flexible padding sizes (int, tuple[2], tuple[4])
      * Multiple padding modes (constant, reflect, replicate)
      * Single fill value
    - Albumentations:
      * Simpler padding interface
      * Separate fill values for image and mask
      * Flexible pad positioning
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomElasticTransform ElasticTransform - Similar core functionality: apply elastic deformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use Gaussian smoothing for displacement fields
  3. Both support independent control of x/y deformations:
    - Kornia: via separate values in sigma/alpha tuples
    - Albumentations: via same_dxdy parameter
- Key differences:
  1. Parameter specification:
    - Kornia:
      * kernel_size tuple (63, 63)
      * sigma tuple (32.0, 32.0)
      * alpha tuple (1.0, 1.0)
    - Albumentations:
      * Single sigma (default: 50.0)
      * Single alpha (default: 1.0)
  2. Additional features:
    - Kornia:
      * Control over padding mode
    - Albumentations:
      * approximate mode for faster processing
      * Choice of noise distribution
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomErasing Erasing - Similar core functionality: randomly erase rectangular regions
- Key similarities:
  1. Both have default probability p=0.5
  2. Same default parameters:
    * scale (0.02, 0.33)
    * ratio (0.3, 3.3)
- Key differences:
  1. Fill value options:
    - Kornia: Simple numeric value (default: 0.0)
    - Albumentations: Rich fill options:
      * Numeric values
      * "random" per pixel
      * "random_uniform" per region
      * "inpaint_telea" method
      * "inpaint_ns" method
  2. Additional features in Albumentations:
    * Separate mask_fill value
    * Support for masks, bboxes, keypoints
    * Inpainting options for more natural-looking results
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomFisheye OpticalDistortion - Similar core functionality: apply optical/fisheye distortion
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support fisheye distortion
- Key differences:
  1. Parameter specification:
    - Kornia:
      * Separate center_x, center_y for distortion center
      * gamma for distortion strength
    - Albumentations:
      * Single distort_limit parameter
      * mode selection ('camera' or 'fisheye')
  2. Distortion models:
    - Kornia: Fisheye only
    - Albumentations:
      * Camera matrix model
      * Fisheye model
  3. Additional features in Albumentations:
    * Separate interpolation methods for image and mask
    * Support for masks, bboxes, keypoints
  4. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomHorizontalFlip HorizontalFlip - Similar core functionality: flip image horizontally
- Key similarities:
  1. Both have default probability p=0.5
  2. Simple operation with same visual result
- Key differences:
  1. Batch handling:
    - Kornia:
      * Additional p_batch parameter
      * same_on_batch option
    - Albumentations: No batch-specific parameters
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
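"Target handling" means that geometric transforms must move annotations together with the pixels. A minimal sketch of what that involves for a horizontal flip, using pixel-coordinate boxes in `(x_min, y_min, x_max, y_max)` order; the helper names are hypothetical and this is not library code:

```python
def hflip_image(image):
    """Flip a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def hflip_bbox(bbox, image_width):
    """Flip a pixel-coordinate box (x_min, y_min, x_max, y_max) horizontally."""
    x_min, y_min, x_max, y_max = bbox
    # The old right edge becomes the new left edge, measured from the other side.
    return (image_width - x_max, y_min, image_width - x_min, y_max)


image = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(hflip_image(image))
print(hflip_bbox((10, 20, 40, 60), image_width=100))
```

Flipping twice returns the original box, which is a quick sanity check when writing similar helpers for other target types.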
RandomPerspective Perspective - Similar core functionality: apply perspective transformation
- Key similarities:
  1. Both have default probability p=0.5
  2. Both transform image by moving corners
  3. Both support different interpolation methods:
    - Kornia: via resample (BILINEAR, NEAREST)
    - Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.)
- Key differences:
  1. Distortion control:
    - Kornia:
      * distortion_scale (0 to 1, default: 0.5)
      * sampling_method ('basic' or 'area_preserving')
    - Albumentations:
      * scale tuple for corner movement range
      * fit_output option for image capture
  2. Output handling:
    - Kornia:
      * align_corners parameter
      * keepdim for batch form
    - Albumentations:
      * keep_size for output dimensions
      * Border mode and fill options
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomResizedCrop RandomResizedCrop - Similar core functionality: crop random patches and resize
- Key similarities:
  1. Both have default probability p=1.0
  2. Same default parameters:
    * scale (0.08, 1.0)
    * ratio (~0.75, ~1.33)
  3. Both support different interpolation methods:
    - Kornia: via resample
    - Albumentations: via interpolation
- Key differences:
  1. Implementation options:
    - Kornia:
      * cropping_mode ('slice' or 'resample')
      * align_corners parameter
      * keepdim for batch form
    - Albumentations:
      * Separate mask interpolation
      * Fallback to center crop after 10 attempts
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
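The `scale` and `ratio` defaults and the "fallback after 10 attempts" behavior fit together as a rejection-sampling loop. The sketch below follows the widely used TorchVision-style scheme; that Kornia and Albumentations match it step by step is an assumption, and `sample_crop` is a hypothetical helper, not a library function.

```python
import math
import random


def sample_crop(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                max_attempts=10, rng=None):
    """Return (top, left, crop_height, crop_width) for a random resized crop."""
    rng = rng or random.Random()
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))

    for _ in range(max_attempts):
        target_area = area * rng.uniform(*scale)
        # Sample the aspect ratio log-uniformly so r and 1/r are equally likely.
        aspect = math.exp(rng.uniform(*log_ratio))
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w

    # Fallback: largest centered crop whose aspect ratio stays within bounds.
    in_ratio = width / height
    if in_ratio < ratio[0]:
        crop_w, crop_h = width, round(width / ratio[0])
    elif in_ratio > ratio[1]:
        crop_h, crop_w = height, round(height * ratio[1])
    else:
        crop_h, crop_w = height, width
    return (height - crop_h) // 2, (width - crop_w) // 2, crop_h, crop_w


top, left, crop_h, crop_w = sample_crop(480, 640, rng=random.Random(0))
print(top, left, crop_h, crop_w)
```

The sampled region is then resized to the requested output size, which is where the interpolation settings listed above come into play.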
RandomRotation90 RandomRotate90 - Similar core functionality: rotate image by 90 degrees
- Key similarities:
  1. Both have default probability p=0.5
  2. Both rotate in 90-degree increments
- Key differences:
  1. Rotation control:
    - Kornia:
      * times parameter to specify range of rotations
      * resample and align_corners for interpolation
    - Albumentations:
      * Simpler implementation (0-3 rotations)
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
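Rotation in 90-degree increments needs no interpolation: it is a transpose plus a reversal, repeated a random number of times. A small sketch on nested lists, assuming counterclockwise rotation (the direction convention of each library is not stated above) and a hypothetical helper name `rot90`:

```python
import random


def rot90(image, k=1):
    """Rotate a 2D image (list of rows) counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counterclockwise turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


image = [[1, 2],
         [3, 4]]
k = random.Random(0).randint(0, 3)  # number of quarter turns, 0 to 3
print(k, rot90(image, k))
```

Because only four outcomes exist, the same `k` can be applied to masks, boxes, and keypoints without resampling artifacts.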
RandomRotation Rotate - Similar core functionality: rotate image by random angle
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support different interpolation methods
- Key differences:
  1. Angle specification:
    - Kornia: degrees parameter (if single value, range is (-degrees, +degrees))
    - Albumentations: limit parameter (default: (-90, 90))
  2. Additional features:
    - Kornia:
      * align_corners for interpolation
    - Albumentations:
      * Border mode options
      * Fill values for padding
      * rotate_method for bboxes
      * crop_border option
      * Separate mask interpolation
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints
RandomShear Affine (shear parameter) - Similar core functionality: apply shear transformation
- Key similarities:
  1. Both have default probability p=0.5
  2. Both support different interpolation methods
  3. Both support independent x/y shear control
- Key differences:
  1. Parameter specification:
    - Kornia:
      * Dedicated shear transform
      * shear parameter supports float, tuple(2), or tuple(4)
      * Simple padding modes (zeros, border, reflection)
    - Albumentations:
      * Part of general Affine transform
      * shear supports number, tuple, or dict format
      * More border modes and fill options
  2. Additional features in Albumentations:
    * Separate mask interpolation
    * fit_output option
    * Combined with other affine transforms
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomThinPlateSpline ThinPlateSpline - Similar core functionality: apply smooth, non-rigid deformations
- Key similarities:
  1. Both have default probability p=0.5
  2. Both use thin plate spline algorithm
  3. Both support interpolation options
- Key differences:
  1. Deformation control:
    - Kornia:
      * Single scale parameter (default: 0.2)
      * Fixed control point grid
    - Albumentations:
      * scale_range tuple for range of deformation
      * Configurable num_control_points
  2. Implementation details:
    - Kornia:
      * align_corners parameter
      * Binary mode choice (bilinear/nearest)
    - Albumentations:
      * OpenCV interpolation flags
      * More granular control over grid
  3. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, keypoints, bboxes
RandomVerticalFlip VerticalFlip - Similar core functionality: flip image vertically
- Key similarities:
  1. Both have default probability p=0.5
  2. Simple operation with same visual result
- Key differences:
  1. Implementation:
    - Kornia:
      * Additional p_batch parameter
    - Albumentations:
      * Simpler implementation
  2. Target handling:
    - Kornia: Image tensors only
    - Albumentations: Images, masks, bboxes, keypoints

Key Differences

Compared to TorchVision

  • Albumentations operates on numpy arrays instead of PyTorch tensors
  • Albumentations typically provides more parameters for fine-tuning transformations
  • Most Albumentations transforms support both image and mask augmentation
  • Better support for bounding box and keypoint augmentation

Compared to Kornia

  • Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays
  • Albumentations provides more comprehensive support for object detection and segmentation tasks
  • Albumentations typically offers better performance for CPU-based augmentations

Performance Comparison

According to benchmarking results, Albumentations generally offers superior CPU performance compared to TorchVision and Kornia for most transforms. Here are some key highlights:

Common Transforms Performance (images/second, higher is better)

| Transform | Albumentations | TorchVision | Kornia | Notes |
|---|---|---|---|---|
| HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia |
| VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia |
| RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia |
| Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both |
| ColorJitter | 628 | 46 | 55 | Albumentations is ~13.5x faster than TorchVision, ~11x faster than Kornia |
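The multipliers in the Notes column follow directly from the throughput figures. A short sketch that recomputes them, using only the numbers from the table above (the dictionary layout and the `speedups` helper are illustrative):

```python
# Throughput in images/second, copied from the table above.
throughput = {
    "HorizontalFlip": {"albumentations": 8618, "torchvision": 914, "kornia": 390},
    "VerticalFlip": {"albumentations": 22847, "torchvision": 3198, "kornia": 1212},
    "RandomResizedCrop": {"albumentations": 2828, "torchvision": 511, "kornia": 287},
    "Normalize": {"albumentations": 1196, "torchvision": 519, "kornia": 626},
    "ColorJitter": {"albumentations": 628, "torchvision": 46, "kornia": 55},
}


def speedups(row, baseline="albumentations"):
    """Return how many times faster the baseline is than each other library."""
    return {
        lib: row[baseline] / value
        for lib, value in row.items()
        if lib != baseline
    }


for name, row in throughput.items():
    ratios = speedups(row)
    summary = ", ".join(f"{r:.1f}x vs {lib}" for lib, r in ratios.items())
    print(f"{name}: {summary}")
```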

Key Performance Insights:

  • Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
  • Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
  • Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU

When to Choose Each Library:

  • Albumentations: Best choice for CPU-based preprocessing pipelines and when maximum performance is needed
  • Kornia: Consider when doing augmentation on GPU with existing PyTorch tensors
  • TorchVision: Good choice when your pipeline is already tightly integrated with the PyTorch ecosystem and augmentation throughput isn't critical

Note: Benchmarks were performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.

Code Examples

TorchVision to Albumentations

Python
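import albumentations as A
import torchvision.transforms as T

# TorchVision
transforms = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(10),  # always applied; angle sampled from (-10, 10)
    # T.Normalize operates on float tensors; these mean/std values assume pixels in [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])

# Albumentations equivalent
transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10, p=1.0),  # p=1.0 matches T.RandomRotation; A.Rotate defaults to p=0.5
    # A.Normalize divides by max_pixel_value (255.0 by default) before applying mean/std
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])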

Kornia to Albumentations

Python
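import albumentations as A
import kornia.augmentation as K

# Kornia (operates on batched float tensors)
transforms = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=10),  # default p=0.5
    K.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
)

# Albumentations equivalent (operates on a single numpy image)
transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10),  # default p=0.5, same as K.RandomRotation
    # A.Normalize divides by max_pixel_value (255.0 by default) before applying mean/std
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])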

Additional Resources