Transform Library Comparison Guide
This guide helps you find equivalent transforms between Albumentations and other popular libraries (torchvision and Kornia).
Key Differences
Compared to TorchVision
- Albumentations operates on NumPy arrays (TorchVision works with PIL images and PyTorch tensors)
- More parameters for fine-tuning transformations
- Built-in support for mask augmentation
- Better handling of bounding boxes and keypoints
Compared to Kornia
- CPU-based NumPy operations (Kornia operates on PyTorch tensors and can run on the GPU)
- More comprehensive support for detection/segmentation
- Generally better CPU performance
- Simpler API for common tasks
Common Transform Mappings
Basic Geometric Transforms
TorchVision Transform | Albumentations Equivalent | Notes |
---|---|---|
Resize | Resize / LongestMaxSize | - TorchVision's Resize combines two Albumentations behaviors:  1. When given (h,w): equivalent to Albumentations Resize   2. When given single int + max_size: similar to LongestMaxSize - Albumentations allows separate interpolation method for masks - TorchVision has antialias parameter, Albumentations doesn't |
ScaleJitter | OneOf + multiple Resize | - Can be approximated in Albumentations using a OneOf container with multiple Resize transforms - Example: transforms = A.OneOf([A.Resize(height=int(target_h * scale), width=int(target_w * scale)) for scale in np.linspace(0.1, 2.0, num=20)]) - Not exactly the same as continuous random scaling, but provides similar functionality |
RandomShortestSize | OneOf + SmallestMaxSize | - Can be approximated in Albumentations using: transforms = A.OneOf([A.SmallestMaxSize(max_size=size, max_height=max_size, max_width=max_size) for size in [480, 512, 544, 576, 608]]) - Randomly selects size for shortest side while maintaining aspect ratio - Optional max_size parameter limits longest side - TorchVision has antialias parameter, Albumentations doesn't |
RandomResize | OneOf + Resize | - TorchVision: randomly selects a single size S between min_size and max_size, sets both width and height to S - No direct equivalent in Albumentations (RandomScale preserves aspect ratio) - Can be approximated using: transforms = A.OneOf([A.Resize(size, size) for size in range(min_size, max_size + 1, step)]) |
RandomCrop | RandomCrop | - Both perform random cropping with similar core functionality - Key differences:   1. TorchVision accepts single int for square crop, Albumentations requires both height and width   2. Padding options differ:     - TorchVision: supports padding parameter for pre-padding     - Albumentations: offers pad_position parameter ('center', 'top_left', etc.)  3. Fill value handling:     - TorchVision: supports dict mapping for different types     - Albumentations: separate fill and fill_mask parameters  4. Padding modes:     - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'     - Albumentations: uses OpenCV border modes |
RandomResizedCrop | RandomResizedCrop | - Nearly identical functionality and parameters - Key differences:   1. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple  2. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333))   3. Albumentations adds:     - Separate mask_interpolation parameter    - Probability parameter p |
RandomIoUCrop | RandomSizedBBoxSafeCrop | - Both ensure safe cropping with respect to bounding boxes - Key differences:   1. TorchVision:     - Implements exact SSD paper approach     - Uses IoU-based sampling strategy     - Requires explicit sanitization of boxes after crop   2. Albumentations:     - Simpler approach ensuring bbox safety     - Directly specifies target size     - Automatically handles bbox cleanup - For exact SSD-style cropping, might need custom implementation in Albumentations |
CenterCrop | CenterCrop | - Both crop the center part of the input - Key differences:   1. Size specification:     - TorchVision: accepts single int for square crop or (height, width) tuple     - Albumentations: requires separate height and width parameters   2. Padding behavior:     - TorchVision: always pads with 0 if image is smaller     - Albumentations: optional padding with pad_if_needed   3. Albumentations adds:     - Configurable padding mode and position     - Separate fill values for image and mask     - Probability parameter p |
RandomHorizontalFlip | HorizontalFlip | - Identical functionality - Both have default probability p=0.5 - Only naming difference: TorchVision includes "Random" in name |
RandomVerticalFlip | VerticalFlip | - Identical functionality - Both have default probability p=0.5 - Only naming difference: TorchVision includes "Random" in name |
Pad | Pad | - Similar core padding functionality - Both support:   - Single int for all sides   - (pad_x, pad_y) for symmetric padding   - (left, top, right, bottom) for per-side padding - Key differences:   1. Padding modes:     - TorchVision: 'constant', 'edge', 'reflect', 'symmetric'     - Albumentations: uses OpenCV border modes   2. Fill value handling:     - TorchVision: supports dict mapping for different types     - Albumentations: separate fill and fill_mask parameters  3. Albumentations adds:     - Probability parameter p |
RandomZoomOut | RandomScale + PadIfNeeded | - No direct equivalent in Albumentations - Can be approximated by combining: A.Compose([A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, value=fill)]) - Here scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0) - Key differences: 1. TorchVision implements specific SSD paper approach 2. Albumentations requires composition of two transforms |
RandomRotation | Rotate | - Similar core rotation functionality but with different parameters - Key differences:   1. Angle specification:     - TorchVision: degrees parameter (-degrees, +degrees) or (min, max)    - Albumentations: limit parameter (-limit, +limit) or (min, max)  2. Output size control:     - TorchVision: expand=True/False     - Albumentations: crop_border=True/False   3. Additional Albumentations features:     - Separate mask interpolation     - Bbox rotation methods ('largest_box' or 'ellipse')     - More border modes     - Probability parameter p   4. Center specification:     - TorchVision: supports custom center point     - Albumentations: always uses image center |
RandomAffine | Affine | - Both support core affine operations (translation, rotation, scale, shear) - Key differences:   1. Parameter specification:     - TorchVision: single parameters for each transform     - Albumentations: more flexible with dict options for x/y axes   2. Scale handling:     - Albumentations adds keep_ratio and balanced_scale     - Albumentations supports independent x/y scaling   3. Translation:     - TorchVision: fraction only     - Albumentations: both percent and pixels   4. Additional Albumentations features:     - fit_output to adjust image plane    - Separate mask interpolation     - More border modes     - Bbox rotation methods     - Probability parameter p |
RandomPerspective | Perspective | - Both apply random perspective transformations - Key differences:   1. Distortion control:     - TorchVision: single distortion_scale (0 to 1)    - Albumentations: scale tuple for corner movement range  2. Output handling:     - Albumentations adds keep_size and fit_output options    - Can control whether to maintain original size   3. Additional Albumentations features:     - Separate mask interpolation     - More border modes     - Better control over output size and fitting |
ElasticTransform | ElasticTransform | - Similar core functionality: both apply elastic deformations to images - Key differences:   1. Parameters have opposite meanings:     - TorchVision: alpha (displacement), sigma (smoothness)    - Albumentations: alpha (smoothness), sigma (displacement)  2. Default values reflect this difference:     - TorchVision: alpha=50.0, sigma=5.0     - Albumentations: alpha=1.0, sigma=50.0 - Note on implementation:   - Albumentations follows Simard et al. 2003 paper more closely:     - σ should be ~0.05 * image_size     - α should be proportional to σ - Additional Albumentations features:   - approximate mode  - same_dxdy option  - Choice of noise distribution   - Separate mask interpolation |
ColorJitter | ColorJitter | - Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue - Key similarities:   1. Same parameter names and meanings   2. Same value ranges (e.g., hue should be in [-0.5, 0.5])   3. Random order of transformations - Key differences:   1. Default values:     - TorchVision: all None by default     - Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation   2. Implementation:     - TorchVision: uses Pillow     - Albumentations: uses OpenCV (may produce slightly different results)   3. Additional in Albumentations:     - Explicit probability parameter p     - Value saturation instead of uint8 overflow |
RandomChannelPermutation | ChannelShuffle | - Both randomly permute image channels - Key similarities:   1. Same core functionality   2. Work on multi-channel images (typically RGB) - Key differences:   1. Naming convention only   2. Albumentations adds:     - Probability parameter p |
RandomPhotometricDistort | RandomOrder + ColorJitter + ChannelShuffle | - TorchVision's transform is from the SSD paper and combines: 1. Color jittering (brightness, contrast, saturation, hue) 2. Random channel permutation - Can be replicated in Albumentations using: A.RandomOrder([A.ColorJitter(brightness=(0.875, 1.125), contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=(-0.05, 0.05), p=0.5), A.ChannelShuffle(p=0.5)]) |
Grayscale | ToGray | - Similar core functionality: convert RGB to grayscale - Key differences:   1. Output channels:     - TorchVision: only 1 or 3 channels     - Albumentations: supports any number of output channels   2. Conversion methods:     - TorchVision: single method (weighted RGB)     - Albumentations: multiple methods via method parameter:      • weighted_average (default, same as TorchVision)       • from_lab, desaturation, average, max, pca   3. Additional in Albumentations:     - Probability parameter p     - More flexible channel handling |
RGB | ToRGB | - Similar core functionality: convert to RGB format - Key differences:   1. Input handling:     - TorchVision: accepts 1 or 3 channel inputs     - Albumentations: only accepts single-channel inputs   2. Output channels:     - TorchVision: always 3 channels     - Albumentations: configurable via num_output_channels   3. Behavior:     - TorchVision: converts to RGB if not already RGB     - Albumentations: strictly grayscale to RGB conversion   4. Additional in Albumentations:     - Probability parameter p |
RandomGrayscale | ToGray | - Similar core functionality: convert to grayscale with probability - Key differences:   1. Default probability:     - TorchVision: p=0.1     - Albumentations: p=0.5   2. Output handling:     - TorchVision: always preserves input channels     - Albumentations: configurable output channels   3. Conversion methods:     - TorchVision: single method     - Albumentations: multiple methods with different channel support:       • weighted_average, from_lab: 3-channel only       • desaturation, average, max, pca: any number of channels   4. Channel requirements:     - TorchVision: works with 1 or 3 channels     - Albumentations: depends on method chosen |
GaussianBlur | GaussianBlur | - Similar core functionality: apply Gaussian blur with random kernel size - Key similarities:   1. Both support random kernel sizes   2. Both support random sigma values - Key differences:   1. Parameter specification:     - TorchVision: kernel_size (exact size), sigma (range)    - Albumentations: blur_limit (size range), sigma_limit (range)  2. Kernel size constraints:     - TorchVision: must specify exact size     - Albumentations: can specify range (3, 7) or auto-compute   3. Additional in Albumentations:     - Probability parameter p     - Auto-computation of kernel size from sigma |
GaussianNoise | GaussNoise | - Similar core functionality: add Gaussian noise to images - Key similarities: 1. Both support mean and standard deviation parameters - Key differences: 1. Parameter ranges: - TorchVision: fixed values for mean and sigma - Albumentations: ranges for both (std_range, mean_range) 2. Value handling: - TorchVision: expects float [0,1], has clip option - Albumentations: auto-scales based on dtype 3. Additional in Albumentations: - Per-channel noise option - Noise scale factor for performance - Probability parameter p |
RandomInvert | InvertImg | - Similar core functionality: invert image colors - Key similarities:   1. Both invert pixel values   2. Both have default probability of 0.5 - Key differences:   1. Value handling:     - TorchVision: works with [0,1] float tensors     - Albumentations: auto-handles uint8 (255) and float32 (1.0) |
RandomPosterize | Posterize | - Similar core functionality: reduce color bits - Key similarities:   1. Both posterize images with probability p=0.5 - Key differences:   1. Bits specification:     - TorchVision: single fixed value [0-8]     - Albumentations: flexible options with [1-7] (recommended):       • Single value for all channels       • Range (min_bits, max_bits)       • Per-channel values [r,g,b]       • Per-channel ranges [(r_min,r_max), ...]   2. Practical range:     - TorchVision: includes 0 (black) and 8 (unchanged)     - Albumentations: recommended [1-7] for actual posterization |
RandomSolarize | Solarize | - Similar core functionality: invert pixels above threshold - Key similarities:   1. Both have default probability p=0.5   2. Both invert values above threshold - Key differences:   1. Threshold specification:     - TorchVision: single fixed threshold value     - Albumentations: range via threshold_range   2. Value handling:     - TorchVision: works with raw threshold values     - Albumentations: uses normalized [0,1] range:       • uint8: multiplied by 255       • float32: multiplied by 1.0 |
RandomAdjustSharpness | Sharpen | - Similar core functionality: adjust image sharpness - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Parameter specification: - TorchVision: single sharpness_factor • 0: blurred • 1: original • 2: doubled sharpness - Albumentations: more controls: • alpha: effect visibility [0,1] • lightness: contrast control 2. Method options: - TorchVision: single method - Albumentations: two methods: • 'kernel': Laplacian operator • 'gaussian': blur interpolation |
RandomAutocontrast | AutoContrast | Same core functionality with identical parameters (p=0.5) |
RandomEqualize | Equalize | - Similar core functionality: histogram equalization - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Additional Albumentations features:     - Choice of algorithm (cv/pil methods)     - Per-channel or luminance-based equalization     - Optional masking support |
Normalize | Normalize | - Similar core functionality: normalize image values - Key similarities:   1. Both support mean/std normalization - Key differences:   1. Normalization options:     - TorchVision: only (input - mean) / std     - Albumentations: multiple methods:       • standard (same as TorchVision)       • image (global stats)       • image_per_channel       • min_max       • min_max_per_channel   2. Additional in Albumentations:     - max_pixel_value parameter    - Probability parameter p |
RandomErasing | Erasing | - Similar core functionality: randomly erase image regions - Key similarities:   1. Both have default probability p=0.5   2. Same default scale=(0.02, 0.33)   3. Same default ratio=(0.3, 3.3) - Key differences:   1. Fill value options:     - TorchVision: number/tuple or 'random'     - Albumentations: additional options:       • random_uniform       • inpaint_telea       • inpaint_ns   2. Additional in Albumentations:     - Mask fill value option     - Support for masks, bboxes, keypoints |
JPEG | ImageCompression | - Similar core functionality: apply JPEG compression - Key similarities:   1. Both use quality range 1-100   2. Both support quality ranges - Key differences:   1. Compression types:     - TorchVision: JPEG only     - Albumentations: JPEG and WebP   2. Additional in Albumentations:     - Probability parameter p     - Default quality range (99, 100) |
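
Several rows above rely on small parameter conversions rather than a one-to-one rename. The sketch below is a plain-Python illustration (no Albumentations import needed) of two of them: discretizing a continuous scale range into the fixed sizes that the OneOf + Resize approximation of ScaleJitter needs, and turning a raw solarize threshold into the normalized [0, 1] value described for Albumentations. The helper names `jitter_sizes` and `normalized_threshold` are hypothetical, and `round` is used in place of the table's `int` to avoid floating-point truncation at the ends of the range.

```python
def jitter_sizes(target_h, target_w, scale_range=(0.1, 2.0), num=20):
    """Return `num` evenly spaced (height, width) pairs across scale_range.

    Hypothetical helper: mirrors the np.linspace-based list from the
    ScaleJitter row without requiring NumPy.
    """
    if num < 2:
        raise ValueError("num must be at least 2")
    lo, hi = scale_range
    sizes = []
    for i in range(num):
        scale = lo + (hi - lo) * i / (num - 1)
        # Clamp to 1 pixel so very small scales never yield a zero-sized image.
        height = max(1, round(target_h * scale))
        width = max(1, round(target_w * scale))
        sizes.append((height, width))
    return sizes


def normalized_threshold(raw_threshold, max_value=255):
    """Map a raw pixel threshold (e.g. 128 for uint8) into the [0, 1] range."""
    if not 0 <= raw_threshold <= max_value:
        raise ValueError("threshold outside the valid pixel range")
    return raw_threshold / max_value


sizes = jitter_sizes(512, 512)
threshold = normalized_threshold(128)
print(sizes[0], sizes[-1], round(threshold, 3))
```

Each (height, width) pair could then feed one A.Resize inside A.OneOf, and the normalized value could be used for both ends of threshold_range when a fixed threshold is wanted.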
Kornia to Albumentations
Kornia | Albumentations | Notes |
---|---|---|
ColorJitter | ColorJitter | - Similar core functionality: randomly adjust brightness, contrast, saturation, and hue - Key similarities:   1. Both support same parameters (brightness, contrast, saturation, hue)   2. Both allow float or tuple ranges for parameters - Key differences:   1. Default values:     - Albumentations: (0.8, 1.2) for brightness/contrast/saturation     - Kornia: 0.0 for all parameters   2. Default probability:     - Albumentations: p=0.5     - Kornia: p=1.0   3. Note: Kornia recommends using ColorJiggle instead as it follows color theory better |
RandomAutoContrast | AutoContrast | - Similar core functionality: enhance image contrast automatically - Key similarities:   1. Both stretch intensity range to use full range   2. Both preserve relative intensities - Key differences:   1. Default probability:     - Albumentations: p=0.5     - Kornia: p=1.0   2. Additional in Kornia:     - clip_output parameter to control value clipping |
RandomBoxBlur | Blur | - Similar core functionality: apply box/average blur to images - Key similarities: 1. Both have default probability p=0.5 2. Both apply box/average blur filter - Key differences: 1. Kernel size specification: - Albumentations: blur_limit parameter for range (e.g., (3, 7)) - Kornia: fixed kernel_size tuple (default (3, 3)) 2. Additional in Kornia: - border_type parameter ('reflect', 'replicate', 'circular') - normalized parameter for L1 norm control |
RandomBrightness | RandomBrightnessContrast | - Different scope:   - Kornia: brightness only   - Albumentations: combines brightness and contrast - Key differences:   1. Parameter specification:     - Kornia: brightness tuple (default: (1.0, 1.0))    - Albumentations: brightness_limit (default: (-0.2, 0.2))  2. Default probability:     - Kornia: p=1.0     - Albumentations: p=0.5   3. Additional in Albumentations:     - brightness_by_max parameter for adjustment method    - ensure_safe_range to prevent overflow/underflow    - Combined contrast control   4. Additional in Kornia:     - clip_output parameter |
RandomChannelDropout | ChannelDropout | - Similar core functionality: randomly drop image channels - Key similarities:   1. Both have default probability p=0.5   2. Both allow specifying fill value for dropped channels - Key differences:   1. Channel drop specification:     - Kornia: fixed num_drop_channels (default: 1)    - Albumentations: flexible channel_drop_range tuple (default: (1, 1))  2. Error handling:     - Albumentations: explicit checks for single-channel images and invalid ranges     - Kornia: simpler parameter validation |
RandomChannelShuffle | ChannelShuffle | - Identical core functionality: randomly shuffle image channels |
RandomClahe | CLAHE | - Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization - Key similarities: 1. Both have default probability p=0.5 2. Both allow configuring grid size and clip limit - Key differences: 1. Parameter defaults: - Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8) - Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8) 2. Additional in Kornia: - slow_and_differentiable parameter for implementation choice |
RandomContrast | RandomBrightnessContrast | - Different scope:   - Kornia: contrast only   - Albumentations: combines brightness and contrast - Key differences:   1. Parameter specification:     - Kornia: contrast tuple (default: (1.0, 1.0))    - Albumentations: contrast_limit (default: (-0.2, 0.2))  2. Default probability:     - Kornia: p=1.0     - Albumentations: p=0.5   3. Additional in Albumentations:     - ensure_safe_range to prevent overflow/underflow    - Combined brightness control   4. Additional in Kornia:     - clip_output parameter |
RandomEqualize | Equalize | - Similar core functionality: apply histogram equalization - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Additional in Albumentations:     - mode parameter to choose between 'cv' and 'pil' methods    - by_channels parameter for per-channel or luminance-based equalization    - mask parameter to selectively apply equalization    - mask_params for dynamic mask generation |
RandomGamma | RandomGamma | - Similar core functionality: apply random gamma correction - Key differences:   1. Parameter specification:     - Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples    - Albumentations: single gamma_limit (80, 120) as percentage range  2. Default probability:     - Kornia: p=1.0     - Albumentations: p=0.5   3. Additional in Albumentations:     - eps parameter to prevent numerical errors |
RandomGaussianBlur | GaussianBlur | - Similar core functionality: apply Gaussian blur with random parameters - Key similarities:   1. Both have default probability p=0.5   2. Both support kernel size and sigma parameters - Key differences:   1. Parameter specification:     - Kornia: requires explicit kernel_size and sigma range    - Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0)  2. Additional in Kornia:     - border_type parameter for padding mode    - separable parameter for 1D convolution optimization |
RandomGaussianIllumination | Illumination | - Similar core functionality: apply illumination effects - Key similarities:   1. Both have default probability p=0.5   2. Both support controlling effect intensity and position - Key differences:   1. Scope:     - Kornia: Gaussian illumination patterns only     - Albumentations: Multiple modes (linear, corner, gaussian)   2. Parameter ranges:     - Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0)     - Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0)   3. Additional in Albumentations:     - mode parameter for different effect types    - effect_type for brighten/darken control    - angle_range for linear gradients  4. Additional in Kornia:     - sign parameter for effect direction |
RandomGaussianNoise | GaussNoise | - Similar core functionality: add Gaussian noise to images - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Parameter specification:     - Kornia: fixed mean (default: 0.0) and std (default: 1.0)    - Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0)  2. Additional in Albumentations:     - per_channel parameter for independent channel noise    - noise_scale_factor for performance optimization    - Automatic value scaling based on image dtype |
RandomGrayscale | ToGray | - Similar core functionality: convert images to grayscale - Key differences:   1. Default probability:     - Kornia: p=0.1     - Albumentations: p=0.5   2. Conversion options:     - Kornia: customizable rgb_weights for channel mixing    - Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca)  3. Output control:     - Kornia: always 3-channel output     - Albumentations: configurable num_output_channels |
RandomHue | ColorJitter (hue parameter) | - Similar core functionality: adjust image hue - Key differences:   1. Scope:     - Kornia: hue-only transform     - Albumentations: part of ColorJitter with brightness, contrast, and saturation   2. Default values:     - Kornia: hue=(0.0, 0.0), p=1.0     - Albumentations: hue=(-0.5, 0.5), p=0.5 |
RandomInvert | InvertImg | - Similar core functionality: invert image values - Key differences: 1. Maximum value handling: - Kornia: configurable via max_val parameter (default: 1.0) - Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32) |
RandomJPEG | ImageCompression | - Similar core functionality: apply image compression - Key differences:   1. Compression options:     - Kornia: JPEG only     - Albumentations: supports both JPEG and WebP   2. Quality specification:     - Kornia: jpeg_quality (default: 50.0)    - Albumentations: quality_range (default: (99, 100))  3. Default probability:     - Kornia: p=1.0     - Albumentations: p=0.5 |
RandomLinearCornerIllumination | Illumination (corner mode) | - Similar core functionality: apply corner illumination effects - Key differences:   1. Scope:     - Kornia: corner illumination only     - Albumentations: part of general Illumination transform with multiple modes   2. Parameter specification:     - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)    - Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both)  3. Additional in Albumentations:     - Multiple illumination modes (linear, corner, gaussian)     - More control over effect parameters |
RandomLinearIllumination | Illumination (linear mode) | - Similar core functionality: apply linear illumination effects - Key differences:   1. Scope:     - Kornia: linear illumination only     - Albumentations: part of general Illumination transform with multiple modes   2. Parameter specification:     - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)    - Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360)  3. Additional in Albumentations:     - Multiple illumination modes (linear, corner, gaussian)     - Explicit angle control for gradient direction |
RandomMedianBlur | MedianBlur | - Similar core functionality: apply median blur filter - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Kernel size specification: - Kornia: fixed kernel_size tuple (default: (3, 3)) - Albumentations: range via blur_limit (default: (3, 7)) 2. Kernel constraints: - Albumentations: enforces odd kernel sizes |
RandomMotionBlur | MotionBlur | - Similar core functionality: apply directional motion blur - Key similarities:   1. Both have default probability p=0.5   2. Both support angle and direction control - Key differences:   1. Kernel size specification:     - Kornia: kernel_size as int or tuple    - Albumentations: blur_limit (default: (3, 7))  2. Angle control:     - Kornia: angle parameter with symmetric range (-angle, angle)    - Albumentations: angle_range (default: (0, 360))  3. Additional in Albumentations:     - allow_shifted parameter for kernel position control |
RandomPlanckianJitter | PlanckianJitter | - Similar core functionality: apply physics-based color temperature variations - Key similarities:   1. Both have default probability p=0.5   2. Both support 'blackbody' and 'cied' modes - Key differences:   1. Temperature control:     - Kornia: select_from parameter for discrete jitter selection    - Albumentations: temperature_limit for continuous range  2. Additional in Albumentations:     - sampling_method parameter ('uniform' or 'gaussian')    - More detailed control over temperature ranges     - Better documentation of physics-based effects |
RandomPlasmaBrightness | PlasmaBrightnessContrast | - Similar core functionality: apply fractal-based brightness adjustments - Key similarities:   1. Both have default probability p=0.5   2. Both use Diamond-Square algorithm for pattern generation - Key differences:   1. Parameter specification:     - Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0)    - Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0)  2. Additional in Albumentations:     - Combined brightness and contrast adjustment     - plasma_size parameter for pattern detail control    - More detailed mathematical formulation and documentation |
RandomPlasmaContrast | PlasmaBrightnessContrast | - Similar core functionality: apply fractal-based contrast adjustments - Key similarities:   1. Both have default probability p=0.5   2. Both use Diamond-Square algorithm for pattern generation - Key differences:   1. Parameter specification:     - Kornia: roughness (0.1, 0.7) only    - Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256)  2. Scope:     - Kornia: contrast-only adjustment     - Albumentations: combined brightness and contrast adjustment   3. Additional in Albumentations:     - More detailed mathematical formulation     - Pattern size control via plasma_size |
RandomPlasmaShadow | PlasmaShadow | - Similar core functionality: apply fractal-based shadow effects - Key similarities:   1. Both have default probability p=0.5   2. Both use Diamond-Square algorithm for pattern generation - Key differences:   1. Parameter specification:     - Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0)    - Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0)  2. Additional in Albumentations:     - Pattern size control via plasma_size     - More intuitive intensity range (0 to 1)     - More detailed mathematical formulation and documentation |
RandomPosterize | Posterize | - Similar core functionality: reduce color bits in image - Key similarities:   1. Both have default probability p=0.5   2. Both operate on color bit reduction - Key differences:   1. Bit specification:     - Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple    - Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats:      * Single int for all channels       * Tuple for random range       * List for per-channel specification       * List of tuples for per-channel ranges   2. Additional in Albumentations:     - More flexible channel-wise control     - More detailed documentation and mathematical background |
RandomRain | RandomRain | - Similar core functionality: add rain effects to images - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Rain parameter specification:     - Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5)    - Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1)  2. Additional in Albumentations:     - drop_color customization    - blur_value for atmospheric effect    - brightness_coefficient for lighting adjustment    - rain_type presets (drizzle, heavy, torrential)  3. Approach:     - Kornia: Direct drop placement     - Albumentations: More realistic simulation with slant, blur, and brightness effects |
RandomRGBShift | AdditiveNoise | - Similar core functionality: add noise/shifts to image channels - Key similarities:   1. Both have default probability p=0.5   2. Both can affect individual channels - Key differences:   1. Approach:     - Kornia: Simple RGB channel shifts with individual limits     - Albumentations: More sophisticated noise generation with multiple distributions   2. Parameter specification:     - Kornia: r_shift_limit , g_shift_limit , b_shift_limit (all default: 0.5)    - Albumentations: Flexible noise configuration with:       * Multiple noise types (uniform, gaussian, laplace, beta)       * Different spatial modes (constant, per_pixel, shared)       * Customizable distribution parameters   3. Additional in Albumentations:     - Performance optimization options     - More detailed control over noise distribution     - Spatial application modes |
RandomSaltAndPepperNoise | SaltAndPepper | - Similar core functionality: apply salt and pepper noise to images - Key similarities:   1. Both have default probability p=0.5   2. Both use the same default parameters:     - amount (0.01, 0.06)     - salt_vs_pepper (0.4, 0.6) - Key differences:   1. Parameter flexibility:     - Kornia: Supports single float or tuple for parameters     - Albumentations: Requires tuples for ranges   2. Documentation:     - Albumentations provides:       * Detailed mathematical formulation       * Clear examples for different noise levels       * Implementation notes and edge cases       * References to academic sources |
RandomSaturation | ColorJitter | - Different scope: Kornia's transform adjusts saturation only, while ColorJitter adjusts several color properties at once - Key differences:   1. Scope:     - Kornia: Saturation-only adjustment     - Albumentations: Combined brightness, contrast, saturation, and hue adjustment   2. Default parameters:     - Kornia: saturation (1.0, 1.0), p=1.0     - Albumentations: saturation (0.8, 1.2), p=0.5   3. Implementation:     - Kornia: Aligns with PIL/TorchVision implementation     - Albumentations: Uses OpenCV with noted differences in HSV conversion   4. Additional in Albumentations:     - Brightness adjustment     - Contrast adjustment     - Hue adjustment     - Random order of transformations |
RandomSharpness | Sharpen | - Similar core functionality: sharpen images - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Parameter specification:     - Kornia: Single sharpness parameter (default: 0.5)    - Albumentations: More detailed control with:       * alpha (0.2, 0.5) for effect visibility      * lightness (0.5, 1.0) for contrast      * method choice ('kernel' or 'gaussian')      * kernel_size and sigma for gaussian method  2. Implementation methods:     - Kornia: Single approach     - Albumentations: Two methods:       * Kernel-based using Laplacian operator       * Gaussian interpolation   3. Documentation:     - Albumentations provides detailed mathematical formulation and references |
RandomSnow | RandomSnow | - Similar core functionality: add snow effects to images - Key differences:   1. Parameter specification:     - Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0    - Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5  2. Implementation methods:     - Kornia: Single approach     - Albumentations: Two methods:       * "bleach": Simple pixel value thresholding       * "texture": Advanced snow texture simulation   3. Additional in Albumentations:     - Detailed snow simulation with:       * HSV color space manipulation       * Gaussian noise for texture       * Depth effect simulation       * Sparkle effects   4. Documentation:     - Albumentations provides detailed mathematical formulation and implementation notes |
RandomSolarize | Solarize | - Similar core functionality: invert pixel values above threshold - Key similarities:   1. Both have default probability p=0.5 - Key differences:   1. Parameter specification:     - Kornia: Two parameters:       * thresholds (default: 0.1) for threshold range      * additions (default: 0.1) for value adjustment    - Albumentations: Single parameter:       * threshold_range (default: (0.5, 0.5))  2. Threshold handling:     - Kornia: Generates from (0.5 - x, 0.5 + x) for float input     - Albumentations: Direct range specification, scaled by image type max value   3. Documentation:     - Albumentations provides:       * Detailed examples for both uint8 and float32 images       * Clear mathematical formulation       * Image type-specific behavior explanation |
CenterCrop | CenterCrop | - Similar core functionality: crop center of image - Key similarities:   1. Both have default probability p=1.0 - Key differences:   1. Size specification:     - Kornia: Single size parameter (int or tuple)    - Albumentations: Separate height and width parameters  2. Additional features:     - Kornia:       * align_corners for interpolation      * resample mode selection      * cropping_mode ('slice' or 'resample')    - Albumentations:       * pad_if_needed for handling small images      * border_mode for padding method      * fill and fill_mask for padding values      * pad_position options  3. Target handling:     - Kornia: Image tensors only     - Albumentations: Supports images, masks, bboxes, and keypoints |
PadTo | PadIfNeeded | - Can achieve same core functionality - Key similarities:   1. Both have default probability p=1.0   2. Can pad to exact size:     - Kornia: size=(height, width)     - Albumentations: min_height=height, min_width=width - Key differences:   1. Parameter naming:     - Kornia: Single size tuple    - Albumentations: Separate dimension parameters   2. Additional features:     - Kornia:       * Simple pad_mode selection      * Single pad_value     - Albumentations:       * Flexible position options      * Separate fill and fill_mask       * Optional divisibility padding       * Multiple target support   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomAffine | Affine | - Similar core functionality: apply affine transformations - Key similarities:   1. Both have default probability p=0.5   2. Both support rotation, translation, scaling, and shear - Key differences:   1. Parameter specification:     - Kornia:       * degrees for rotation      * translate as fraction      * scale as tuple      * shear in degrees    - Albumentations:       * More flexible parameter formats       * Supports both percent and pixel translation       * Dictionary format for independent axis control   2. Additional features in Albumentations:     * fit_output for automatic size adjustment    * keep_ratio for aspect ratio preservation    * rotate_method options    * balanced_scale for even scale distribution  3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, keypoints, bboxes |
RandomCrop | RandomCrop | - Similar core functionality: randomly crop image patches - Key similarities:   1. Both have default probability p=1.0   2. Both support padding if needed - Key differences:   1. Size specification:     - Kornia: Single size tuple (height, width)    - Albumentations: Separate height and width parameters  2. Padding options:     - Kornia:       * Flexible padding sizes (int, tuple[2], tuple[4])       * Multiple padding modes (constant, reflect, replicate)       * Single fill value     - Albumentations:       * Simpler padding interface       * Separate fill values for image and mask       * Flexible pad positioning   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomElasticTransform | ElasticTransform | - Similar core functionality: apply elastic deformations - Key similarities:   1. Both have default probability p=0.5   2. Both use Gaussian smoothing for displacement fields   3. Both support independent control of x/y deformations:     - Kornia: via separate values in sigma/alpha tuples     - Albumentations: via same_dxdy parameter - Key differences:   1. Parameter specification:     - Kornia:       * kernel_size tuple (63, 63)       * sigma tuple (32.0, 32.0)       * alpha tuple (1.0, 1.0)     - Albumentations:       * Single sigma (default: 50.0)       * Single alpha (default: 1.0)   2. Additional features:     - Kornia:       * Control over padding mode     - Albumentations:       * approximate mode for faster processing       * Choice of noise distribution       * Separate mask interpolation   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomErasing | Erasing | - Similar core functionality: randomly erase rectangular regions - Key similarities:   1. Both have default probability p=0.5   2. Same default parameters:     * scale (0.02, 0.33)     * ratio (0.3, 3.3) - Key differences:   1. Fill value options:     - Kornia: Simple numeric value (default: 0.0)     - Albumentations: Rich fill options:       * Numeric values       * "random" per pixel       * "random_uniform" per region       * "inpaint_telea" method       * "inpaint_ns" method   2. Additional features in Albumentations:     * Separate mask_fill value     * Support for masks, bboxes, keypoints     * Inpainting options for more natural-looking results   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomFisheye | OpticalDistortion | - Similar core functionality: apply optical/fisheye distortion - Key similarities:   1. Both have default probability p=0.5   2. Both support fisheye distortion - Key differences:   1. Parameter specification:     - Kornia:       * Separate center_x, center_y for distortion center       * gamma for distortion strength     - Albumentations:       * Single distort_limit parameter       * mode selection ('camera' or 'fisheye')   2. Distortion models:     - Kornia: Fisheye only     - Albumentations:       * Camera matrix model       * Fisheye model   3. Additional features in Albumentations:     * Separate interpolation methods for image and mask     * Support for masks, bboxes, keypoints   4. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomHorizontalFlip | HorizontalFlip | - Similar core functionality: flip image horizontally - Key similarities:   1. Both have default probability p=0.5   2. Simple operation with same visual result - Key differences:   1. Batch handling:     - Kornia:       * Additional p_batch parameter      * same_on_batch option    - Albumentations: No batch-specific parameters   2. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomPerspective | Perspective | - Similar core functionality: apply perspective transformation - Key similarities:   1. Both have default probability p=0.5   2. Both transform image by moving corners   3. Both support different interpolation methods:     - Kornia: via resample (BILINEAR, NEAREST)     - Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.) - Key differences:   1. Distortion control:     - Kornia:       * distortion_scale (0 to 1, default: 0.5)       * sampling_method ('basic' or 'area_preserving')     - Albumentations:       * scale tuple for corner movement range       * fit_output option for image capture   2. Output handling:     - Kornia:       * align_corners parameter       * keepdim for batch form     - Albumentations:       * keep_size for output dimensions       * Border mode and fill options       * Separate mask interpolation   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, keypoints, bboxes |
RandomResizedCrop | RandomResizedCrop | - Similar core functionality: crop random patches and resize - Key similarities:   1. Both have default probability p=1.0   2. Same default parameters:     * scale (0.08, 1.0)    * ratio (~0.75, ~1.33)  3. Both support different interpolation methods:     - Kornia: via resample     - Albumentations: via interpolation - Key differences:   1. Implementation options:     - Kornia:       * cropping_mode ('slice' or 'resample')      * align_corners parameter      * keepdim for batch form    - Albumentations:       * Separate mask interpolation       * Fallback to center crop after 10 attempts   2. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomRotation90 | RandomRotate90 | - Similar core functionality: rotate image by 90 degrees - Key similarities:   1. Both have default probability p=0.5   2. Both rotate in 90-degree increments - Key differences:   1. Rotation control:     - Kornia:       * times parameter to specify range of rotations      * resample and align_corners for interpolation    - Albumentations:       * Simpler implementation (0-3 rotations)   2. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomRotation | Rotate | - Similar core functionality: rotate image by random angle - Key similarities:   1. Both have default probability p=0.5   2. Both support different interpolation methods - Key differences:   1. Angle specification:     - Kornia: degrees parameter (if single value, range is (-degrees, +degrees))    - Albumentations: limit parameter (default: (-90, 90))  2. Additional features:     - Kornia:       * align_corners for interpolation    - Albumentations:       * Border mode options       * Fill values for padding       * rotate_method for bboxes      * crop_border option      * Separate mask interpolation   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
RandomShear | Affine (shear parameter) | - Similar core functionality: apply shear transformation - Key similarities:   1. Both have default probability p=0.5   2. Both support different interpolation methods   3. Both support independent x/y shear control - Key differences:   1. Parameter specification:     - Kornia:       * Dedicated shear transform       * shear parameter supports float, tuple(2), or tuple(4)      * Simple padding modes (zeros, border, reflection)     - Albumentations:       * Part of general Affine transform       * shear supports number, tuple, or dict format      * More border modes and fill options   2. Additional features in Albumentations:     * Separate mask interpolation     * fit_output option    * Combined with other affine transforms   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, keypoints, bboxes |
RandomThinPlateSpline | ThinPlateSpline | - Similar core functionality: apply smooth, non-rigid deformations - Key similarities:   1. Both have default probability p=0.5   2. Both use thin plate spline algorithm   3. Both support interpolation options - Key differences:   1. Deformation control:     - Kornia:       * Single scale parameter (default: 0.2)      * Fixed control point grid     - Albumentations:       * scale_range tuple for range of deformation      * Configurable num_control_points   2. Implementation details:     - Kornia:       * align_corners parameter      * Binary mode choice (bilinear/nearest)     - Albumentations:       * OpenCV interpolation flags       * More granular control over grid   3. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, keypoints, bboxes |
RandomVerticalFlip | VerticalFlip | - Similar core functionality: flip image vertically - Key similarities:   1. Both have default probability p=0.5   2. Simple operation with same visual result - Key differences:   1. Implementation:     - Kornia:       * Additional p_batch parameter    - Albumentations:       * Simpler implementation   2. Target handling:     - Kornia: Image tensors only     - Albumentations: Images, masks, bboxes, keypoints |
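
Many of the rows above differ mainly in how the same setting is spelled. As a rough illustration, the sketch below translates a few Kornia-style arguments into Albumentations-style keyword dictionaries, using only parameter names that appear in the table (`height`/`width`, `min_height`/`min_width`, `limit`, `threshold_range`). The helper names are hypothetical and belong to neither library, and the keywords should be checked against the Albumentations version you have installed.

```python
from typing import Dict, Tuple, Union

Size = Union[int, Tuple[int, int]]
Number = Union[int, float]


def crop_kwargs(size: Size) -> Dict[str, int]:
    """Kornia CenterCrop `size` -> Albumentations `height`/`width`.

    Assumption: a single int is treated as a square crop.
    """
    height, width = (size, size) if isinstance(size, int) else size
    return {"height": height, "width": width}


def pad_kwargs(size: Tuple[int, int]) -> Dict[str, int]:
    """Kornia PadTo `size=(height, width)` -> PadIfNeeded `min_height`/`min_width`."""
    height, width = size
    return {"min_height": height, "min_width": width}


def rotate_kwargs(degrees: Union[Number, Tuple[Number, Number]]) -> Dict[str, Tuple[float, float]]:
    """Kornia RandomRotation `degrees` -> Rotate `limit`.

    A single value d is expanded to the range (-d, +d), as noted in the table.
    """
    if isinstance(degrees, (int, float)):
        return {"limit": (-float(degrees), float(degrees))}
    low, high = degrees
    return {"limit": (float(low), float(high))}


def solarize_kwargs(thresholds: float) -> Dict[str, Tuple[float, float]]:
    """Kornia RandomSolarize float `thresholds` x -> `threshold_range` (0.5 - x, 0.5 + x)."""
    return {"threshold_range": (0.5 - thresholds, 0.5 + thresholds)}
```

For example, `rotate_kwargs(10)` returns `{"limit": (-10.0, 10.0)}`, which could then be unpacked as `A.Rotate(**rotate_kwargs(10))`.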
Key Differences
Compared to TorchVision
- Albumentations operates on numpy arrays instead of PyTorch tensors
- Albumentations typically provides more parameters for fine-tuning transformations
- Most Albumentations transforms support both image and mask augmentation
- Better support for bounding box and keypoint augmentation
Compared to Kornia
- Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays
- Albumentations provides more comprehensive support for object detection and segmentation tasks
- Albumentations typically offers better performance for CPU-based augmentations
Performance Comparison
According to benchmarking results, Albumentations generally offers superior CPU performance compared to TorchVision and Kornia for most transforms. Here are some key highlights:
Common Transforms Performance (images/second, higher is better)
Transform | Albumentations | TorchVision | Kornia | Notes |
---|---|---|---|---|
HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia |
VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia |
RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia |
Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both |
ColorJitter | 628 | 46 | 55 | Albumentations is ~14x faster than TorchVision, ~11x faster than Kornia |
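
The ratios in the Notes column follow directly from the throughput figures. A minimal sketch of that arithmetic, with the numbers copied from the table and a hypothetical `speedups` helper:

```python
# Throughput figures (images/second) copied from the table above.
THROUGHPUT = {
    "HorizontalFlip": {"albumentations": 8618, "torchvision": 914, "kornia": 390},
    "VerticalFlip": {"albumentations": 22847, "torchvision": 3198, "kornia": 1212},
    "RandomResizedCrop": {"albumentations": 2828, "torchvision": 511, "kornia": 287},
    "Normalize": {"albumentations": 1196, "torchvision": 519, "kornia": 626},
    "ColorJitter": {"albumentations": 628, "torchvision": 46, "kornia": 55},
}


def speedups(row):
    """Return how many times faster Albumentations is than each other library."""
    base = row["albumentations"]
    return {
        library: round(base / value, 1)
        for library, value in row.items()
        if library != "albumentations"
    }


for name, row in THROUGHPUT.items():
    print(name, speedups(row))
```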
Key Performance Insights:
- Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
- Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
- Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU
When to Choose Each Library:
- Albumentations: Best choice for CPU-based preprocessing pipelines, especially when augmentation throughput is the bottleneck
- Kornia: Consider it when augmenting on the GPU with data that is already in PyTorch tensors
- TorchVision: Good choice when your pipeline is deeply integrated into the PyTorch ecosystem and GPU performance isn't critical
Note: Benchmarks performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.
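Because results depend on hardware and setup, it can be worth measuring throughput on your own machine. The sketch below shows one simple way to estimate images/second for any callable transform. It is not the methodology behind the figures above. The `images_per_second` helper and the pure-Python `flip_rows` stand-in are hypothetical, chosen so the example runs without any augmentation library installed.

```python
import time
from typing import Callable, Sequence


def images_per_second(transform: Callable, images: Sequence,
                      warmup: int = 10, repeats: int = 3) -> float:
    """Return the best-of-`repeats` throughput of `transform` over `images`."""
    for image in images[:warmup]:
        transform(image)  # warm-up calls are not timed
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        for image in images:
            transform(image)
        # Guard against a zero reading on very coarse timers.
        elapsed = max(time.perf_counter() - start, 1e-9)
        best = max(best, len(images) / elapsed)
    return best


def flip_rows(image):
    """Stand-in transform: horizontal flip of an image stored as nested lists."""
    return [row[::-1] for row in image]


# 200 tiny 32x32 single-channel "images" as stand-in data.
images = [[[x + y for x in range(32)] for y in range(32)] for _ in range(200)]
print(f"{images_per_second(flip_rows, images):.0f} images/second")
```

To time a real pipeline, pass a callable that wraps it, for instance `lambda img: pipeline(image=img)["image"]` for an Albumentations `Compose` object named `pipeline`, and use images of the size and dtype you actually train on.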
Code Examples
TorchVision to Albumentations
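```python
import albumentations as A
import torchvision.transforms as T

# TorchVision: RandomRotation has no p and is always applied;
# Normalize expects a float tensor as input.
tv_transforms = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(10),
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

# Albumentations equivalent: Rotate defaults to p=0.5, so p=1.0 is set
# explicitly to match RandomRotation.
alb_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10, p=1.0),
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])
```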
Kornia to Albumentations
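```python
import albumentations as A
import kornia.augmentation as K

# Kornia: operates on batched float tensors; RandomRotation defaults to p=0.5.
kornia_transforms = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=10),
    K.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
)

# Albumentations equivalent: operates on numpy arrays; Rotate also defaults to p=0.5.
alb_transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10),
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])
```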