Transform Library Comparison Guide
This guide helps you find equivalent transforms between Albumentations and two other popular libraries, TorchVision and Kornia.
Key Differences
Compared to TorchVision
- Albumentations operates on numpy arrays (TorchVision uses PyTorch tensors)
- More parameters for fine-tuning transformations
- Built-in support for mask augmentation
- Better handling of bounding boxes and keypoints
Compared to Kornia
- CPU-based numpy operations (Kornia uses GPU tensors)
- More comprehensive support for detection/segmentation
- Generally better CPU performance
- Simpler API for common tasks
Common Transform Mappings
Basic Geometric Transforms
TorchVision Transform | Albumentations Equivalent | Notes |
---|---|---|
Resize | Resize / LongestMaxSize | - TorchVision's Resize combines two Albumentations behaviors: 1. When given (h,w): equivalent to Albumentations Resize 2. When given a single int + max_size: similar to LongestMaxSize - Albumentations allows a separate interpolation method for masks - TorchVision has an antialias parameter, Albumentations doesn't |
ScaleJitter | OneOf + multiple Resize | - Can be approximated in Albumentations using OneOf container with multiple Resize transforms - Example: transforms = A.OneOf([ A.Resize(height=int(target_h * scale), width=int(target_w * scale)) for scale in np.linspace(0.1, 2.0, num=20) ]) - Not exactly the same as continuous random scaling, but provides similar functionality |
RandomShortestSize | OneOf + SmallestMaxSize | - Can be approximated in Albumentations using: transforms = A.OneOf([ A.SmallestMaxSize(max_size=size, max_height=max_size, max_width=max_size) for size in [480, 512, 544, 576, 608] ]) - Randomly selects size for shortest side while maintaining aspect ratio - Optional max_size parameter limits longest side - TorchVision has an antialias parameter, Albumentations doesn't |
RandomResize | OneOf + Resize | - TorchVision: randomly selects single size S between min_size and max_size , sets both width and height to S - No direct equivalent in Albumentations (RandomScale preserves aspect ratio) - Can be approximated using: transforms = A.OneOf([ A.Resize(size, size) for size in range(min_size, max_size + 1, step) ]) |
RandomCrop | RandomCrop | - Both perform random cropping with similar core functionality - Key differences: 1. TorchVision accepts single int for square crop, Albumentations requires both height and width 2. Padding options differ: - TorchVision: supports padding parameter for pre-padding - Albumentations: offers pad_position parameter ('center', 'top_left', etc.) 3. Fill value handling: - TorchVision: supports dict mapping for different types - Albumentations: separate fill and fill_mask parameters 4. Padding modes: - TorchVision: 'constant', 'edge', 'reflect', 'symmetric' - Albumentations: uses OpenCV border modes |
RandomResizedCrop | RandomResizedCrop | - Nearly identical functionality and parameters - Key differences: 1. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple 2. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333)) 3. Albumentations adds: - Separate mask_interpolation parameter - Probability parameter p |
RandomIoUCrop | RandomSizedBBoxSafeCrop | - Both ensure safe cropping with respect to bounding boxes - Key differences: 1. TorchVision: - Implements exact SSD paper approach - Uses IoU-based sampling strategy - Requires explicit sanitization of boxes after crop 2. Albumentations: - Simpler approach ensuring bbox safety - Directly specifies target size - Automatically handles bbox cleanup - For exact SSD-style cropping, might need custom implementation in Albumentations |
CenterCrop | CenterCrop | - Both crop the center part of the input - Key differences: 1. Size specification: - TorchVision: accepts single int for square crop or (height, width) tuple - Albumentations: requires separate height and width parameters 2. Padding behavior: - TorchVision: always pads with 0 if image is smaller - Albumentations: optional padding with pad_if_needed 3. Albumentations adds: - Configurable padding mode and position - Separate fill values for image and mask - Probability parameter p |
RandomHorizontalFlip | HorizontalFlip | - Identical functionality - Both have default probability p=0.5 - Only naming difference: TorchVision includes "Random" in name |
RandomVerticalFlip | VerticalFlip | - Identical functionality - Both have default probability p=0.5 - Only naming difference: TorchVision includes "Random" in name |
Pad | Pad | - Similar core padding functionality - Both support: - Single int for all sides - (pad_x, pad_y) for symmetric padding - (left, top, right, bottom) for per-side padding - Key differences: 1. Padding modes: - TorchVision: 'constant', 'edge', 'reflect', 'symmetric' - Albumentations: uses OpenCV border modes 2. Fill value handling: - TorchVision: supports dict mapping for different types - Albumentations: separate fill and fill_mask parameters 3. Albumentations adds: - Probability parameter p |
RandomZoomOut | RandomScale + PadIfNeeded | - No direct equivalent in Albumentations - Can be approximated by combining: A.Compose([ A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), # scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0) A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, fill=fill) ]) - Key differences: 1. TorchVision implements specific SSD paper approach 2. Albumentations requires composition of two transforms |
RandomRotation | Rotate | - Similar core rotation functionality but with different parameters - Key differences: 1. Angle specification: - TorchVision: degrees parameter (-degrees, +degrees) or (min, max) - Albumentations: limit parameter (-limit, +limit) or (min, max) 2. Output size control: - TorchVision: expand=True/False - Albumentations: crop_border=True/False 3. Additional Albumentations features: - Separate mask interpolation - Bbox rotation methods ('largest_box' or 'ellipse') - More border modes - Probability parameter p 4. Center specification: - TorchVision: supports custom center point - Albumentations: always uses image center |
RandomAffine | Affine | - Both support core affine operations (translation, rotation, scale, shear) - Key differences: 1. Parameter specification: - TorchVision: single parameters for each transform - Albumentations: more flexible with dict options for x/y axes 2. Scale handling: - Albumentations adds keep_ratio and balanced_scale - Albumentations supports independent x/y scaling 3. Translation: - TorchVision: fraction only - Albumentations: both percent and pixels 4. Additional Albumentations features: - fit_output to adjust image plane - Separate mask interpolation - More border modes - Bbox rotation methods - Probability parameter p |
RandomPerspective | Perspective | - Both apply random perspective transformations - Key differences: 1. Distortion control: - TorchVision: single distortion_scale (0 to 1) - Albumentations: scale tuple for corner movement range 2. Output handling: - Albumentations adds keep_size and fit_output options - Can control whether to maintain original size 3. Additional Albumentations features: - Separate mask interpolation - More border modes - Better control over output size and fitting |
ElasticTransform | ElasticTransform | - Similar core functionality: both apply elastic deformations to images - Key differences: 1. Parameters have opposite meanings: - TorchVision: alpha (displacement), sigma (smoothness) - Albumentations: alpha (smoothness), sigma (displacement) 2. Default values reflect this difference: - TorchVision: alpha=50.0, sigma=5.0 - Albumentations: alpha=1.0, sigma=50.0 - Note on implementation: - Albumentations follows Simard et al. 2003 paper more closely: - σ should be ~0.05 * image_size - α should be proportional to σ - Additional Albumentations features: - approximate mode - same_dxdy option - Choice of noise distribution - Separate mask interpolation |
ColorJitter | ColorJitter | - Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue - Key similarities: 1. Same parameter names and meanings 2. Same value ranges (e.g., hue should be in [-0.5, 0.5]) 3. Random order of transformations - Key differences: 1. Default values: - TorchVision: all None by default - Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation 2. Implementation: - TorchVision: uses Pillow - Albumentations: uses OpenCV (may produce slightly different results) 3. Additional in Albumentations: - Explicit probability parameter p - Value saturation instead of uint8 overflow |
RandomChannelPermutation | ChannelShuffle | - Both randomly permute image channels - Key similarities: 1. Same core functionality 2. Work on multi-channel images (typically RGB) - Key differences: 1. Naming convention only 2. Albumentations adds: - Probability parameter p |
RandomPhotometricDistort | RandomOrder + ColorJitter + ChannelShuffle | - TorchVision's transform is from SSD paper, combines: 1. Color jittering (brightness, contrast, saturation, hue) 2. Random channel permutation - Can be replicated in Albumentations using: A.RandomOrder([ A.ColorJitter(brightness=(0.875, 1.125), contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=(-0.05, 0.05), p=0.5), A.ChannelShuffle(p=0.5) ]) |
Grayscale | ToGray | - Similar core functionality: convert RGB to grayscale - Key differences: 1. Output channels: - TorchVision: only 1 or 3 channels - Albumentations: supports any number of output channels 2. Conversion methods: - TorchVision: single method (weighted RGB) - Albumentations: multiple methods via method parameter: • weighted_average (default, same as TorchVision) • from_lab, desaturation, average, max, pca 3. Additional in Albumentations: - Probability parameter p - More flexible channel handling |
RGB | ToRGB | - Similar core functionality: convert to RGB format - Key differences: 1. Input handling: - TorchVision: accepts 1 or 3 channel inputs - Albumentations: only accepts single-channel inputs 2. Output channels: - TorchVision: always 3 channels - Albumentations: configurable via num_output_channels 3. Behavior: - TorchVision: converts to RGB if not already RGB - Albumentations: strictly grayscale to RGB conversion 4. Additional in Albumentations: - Probability parameter p |
RandomGrayscale | ToGray | - Similar core functionality: convert to grayscale with probability - Key differences: 1. Default probability: - TorchVision: p=0.1 - Albumentations: p=0.5 2. Output handling: - TorchVision: always preserves input channels - Albumentations: configurable output channels 3. Conversion methods: - TorchVision: single method - Albumentations: multiple methods with different channel support: • weighted_average, from_lab: 3-channel only • desaturation, average, max, pca: any number of channels 4. Channel requirements: - TorchVision: works with 1 or 3 channels - Albumentations: depends on method chosen |
GaussianBlur | GaussianBlur | - Similar core functionality: apply Gaussian blur with random kernel size - Key similarities: 1. Both support random kernel sizes 2. Both support random sigma values - Key differences: 1. Parameter specification: - TorchVision: kernel_size (exact size), sigma (range) - Albumentations: blur_limit (size range), sigma_limit (range) 2. Kernel size constraints: - TorchVision: must specify exact size - Albumentations: can specify range (3, 7) or auto-compute 3. Additional in Albumentations: - Probability parameter p - Auto-computation of kernel size from sigma |
GaussianNoise | GaussNoise | - Similar core functionality: add Gaussian noise to images - Key similarities: 1. Both support mean and standard deviation parameters - Key differences: 1. Parameter ranges: - TorchVision: fixed values for mean and sigma - Albumentations: ranges for both (std_range, mean_range) 2. Value handling: - TorchVision: expects float [0,1], has clip option - Albumentations: auto-scales based on dtype 3. Additional in Albumentations: - Per-channel noise option - Noise scale factor for performance - Probability parameter p |
RandomInvert | InvertImg | - Similar core functionality: invert image colors - Key similarities: 1. Both invert pixel values 2. Both have default probability of 0.5 - Key differences: 1. Value handling: - TorchVision: works with [0,1] float tensors - Albumentations: auto-handles uint8 (255) and float32 (1.0) |
RandomPosterize | Posterize | - Similar core functionality: reduce color bits - Key similarities: 1. Both posterize images with probability p=0.5 - Key differences: 1. Bits specification: - TorchVision: single fixed value [0-8] - Albumentations: flexible options with [1-7] (recommended): • Single value for all channels • Range (min_bits, max_bits) • Per-channel values [r,g,b] • Per-channel ranges [(r_min,r_max), ...] 2. Practical range: - TorchVision: includes 0 (black) and 8 (unchanged) - Albumentations: recommended [1-7] for actual posterization |
RandomSolarize | Solarize | - Similar core functionality: invert pixels above threshold - Key similarities: 1. Both have default probability p=0.5 2. Both invert values above threshold - Key differences: 1. Threshold specification: - TorchVision: single fixed threshold value - Albumentations: range via threshold_range 2. Value handling: - TorchVision: works with raw threshold values - Albumentations: uses normalized [0,1] range: • uint8: multiplied by 255 • float32: multiplied by 1.0 |
RandomAdjustSharpness | Sharpen | - Similar core functionality: adjust image sharpness - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Parameter specification: - TorchVision: single sharpness_factor • 0: blurred • 1: original • 2: doubled sharpness - Albumentations: more controls: • alpha: effect visibility [0,1] • lightness: contrast control 2. Method options: - TorchVision: single method - Albumentations: two methods: • 'kernel': Laplacian operator • 'gaussian': blur interpolation |
RandomAutocontrast | AutoContrast | Same core functionality with identical parameters (p=0.5) |
RandomEqualize | Equalize | - Similar core functionality: histogram equalization - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Additional Albumentations features: - Choice of algorithm (cv/pil methods) - Per-channel or luminance-based equalization - Optional masking support |
Normalize | Normalize | - Similar core functionality: normalize image values - Key similarities: 1. Both support mean/std normalization - Key differences: 1. Normalization options: - TorchVision: only (input - mean) / std - Albumentations: multiple methods: • standard (same as TorchVision) • image (global stats) • image_per_channel • min_max • min_max_per_channel 2. Additional in Albumentations: - max_pixel_value parameter - Probability parameter p |
RandomErasing | Erasing | - Similar core functionality: randomly erase image regions - Key similarities: 1. Both have default probability p=0.5 2. Same default scale=(0.02, 0.33) 3. Same default ratio=(0.3, 3.3) - Key differences: 1. Fill value options: - TorchVision: number/tuple or 'random' - Albumentations: additional options: • random_uniform • inpaint_telea • inpaint_ns 2. Additional in Albumentations: - Mask fill value option - Support for masks, bboxes, keypoints |
JPEG | ImageCompression | - Similar core functionality: apply JPEG compression - Key similarities: 1. Both use quality range 1-100 2. Both support quality ranges - Key differences: 1. Compression types: - TorchVision: JPEG only - Albumentations: JPEG and WebP 2. Additional in Albumentations: - Probability parameter p - Default quality range (99, 100) |
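
Several rows in the table above approximate a TorchVision transform by translating its parameters into Albumentations terms. The sketch below captures two of those translations in plain Python: the `side_range` to `scale_limit` offset noted in the RandomZoomOut row, and the discrete size list used in the RandomResize and RandomShortestSize rows. The helper names `zoom_out_scale_limit` and `resize_candidates` are hypothetical and belong to neither library; the results are meant to be passed to `A.RandomScale(scale_limit=...)` or to a `A.OneOf([...])` of resize transforms as shown in the table.

```python
def zoom_out_scale_limit(side_range):
    """Convert a TorchVision-style side_range to a scale_limit offset range.

    The RandomZoomOut row maps side_range=(1.0, 4.0) to scale_limit=(0.0, 3.0),
    i.e. the offset is the scale factor minus one.
    """
    low, high = side_range
    if not 1.0 <= low <= high:
        raise ValueError("side_range must satisfy 1.0 <= low <= high")
    return (low - 1.0, high - 1.0)


def resize_candidates(min_size, max_size, step):
    """List the discrete sizes used to approximate a continuous random resize.

    Each value would become one Resize (or SmallestMaxSize) entry inside OneOf.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return list(range(min_size, max_size + 1, step))


if __name__ == "__main__":
    print(zoom_out_scale_limit((1.0, 4.0)))
    print(resize_candidates(480, 608, 32))
```

A smaller `step` brings the discrete approximation closer to TorchVision's continuous sampling, at the cost of a longer `OneOf` list.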
Kornia to Albumentations
Kornia | Albumentations | Notes |
---|---|---|
ColorJitter | ColorJitter | - Similar core functionality: randomly adjust brightness, contrast, saturation, and hue - Key similarities: 1. Both support same parameters (brightness, contrast, saturation, hue) 2. Both allow float or tuple ranges for parameters - Key differences: 1. Default values: - Albumentations: (0.8, 1.2) for brightness/contrast/saturation - Kornia: 0.0 for all parameters 2. Default probability: - Albumentations: p=0.5 - Kornia: p=1.0 3. Note: Kornia recommends using ColorJiggle instead as it follows color theory better |
RandomAutoContrast | AutoContrast | - Similar core functionality: enhance image contrast automatically - Key similarities: 1. Both stretch intensity range to use full range 2. Both preserve relative intensities - Key differences: 1. Default probability: - Albumentations: p=0.5 - Kornia: p=1.0 2. Additional in Kornia: - clip_output parameter to control value clipping |
RandomBoxBlur | Blur | - Similar core functionality: apply box/average blur to images - Key similarities: 1. Both have default probability p=0.5 2. Both apply box/average blur filter - Key differences: 1. Kernel size specification: - Albumentations: blur_limit parameter for range (e.g., (3, 7)) - Kornia: fixed kernel_size tuple (default (3, 3)) 2. Additional in Kornia: - border_type parameter ('reflect', 'replicate', 'circular') - normalized parameter for L1 norm control |
RandomBrightness | RandomBrightnessContrast | - Different scope: - Kornia: brightness only - Albumentations: combines brightness and contrast - Key differences: 1. Parameter specification: - Kornia: brightness tuple (default: (1.0, 1.0)) - Albumentations: brightness_limit (default: (-0.2, 0.2)) 2. Default probability: - Kornia: p=1.0 - Albumentations: p=0.5 3. Additional in Albumentations: - brightness_by_max parameter for adjustment method - ensure_safe_range to prevent overflow/underflow - Combined contrast control 4. Additional in Kornia: - clip_output parameter |
RandomChannelDropout | ChannelDropout | - Similar core functionality: randomly drop image channels - Key similarities: 1. Both have default probability p=0.5 2. Both allow specifying fill value for dropped channels - Key differences: 1. Channel drop specification: - Kornia: fixed num_drop_channels (default: 1) - Albumentations: flexible channel_drop_range tuple (default: (1, 1)) 2. Error handling: - Albumentations: explicit checks for single-channel images and invalid ranges - Kornia: simpler parameter validation |
RandomChannelShuffle | ChannelShuffle | - Identical core functionality: randomly shuffle image channels |
RandomClahe | CLAHE | - Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization - Key similarities: 1. Both have default probability p=0.5 2. Both allow configuring grid size and clip limit - Key differences: 1. Parameter defaults: - Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8) - Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8) 2. Additional in Kornia: - slow_and_differentiable parameter for implementation choice |
RandomContrast | RandomBrightnessContrast | - Different scope: - Kornia: contrast only - Albumentations: combines brightness and contrast - Key differences: 1. Parameter specification: - Kornia: contrast tuple (default: (1.0, 1.0)) - Albumentations: contrast_limit (default: (-0.2, 0.2)) 2. Default probability: - Kornia: p=1.0 - Albumentations: p=0.5 3. Additional in Albumentations: - ensure_safe_range to prevent overflow/underflow - Combined brightness control 4. Additional in Kornia: - clip_output parameter |
RandomEqualize | Equalize | - Similar core functionality: apply histogram equalization - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Additional in Albumentations: - mode parameter to choose between 'cv' and 'pil' methods - by_channels parameter for per-channel or luminance-based equalization - mask parameter to selectively apply equalization - mask_params for dynamic mask generation |
RandomGamma | RandomGamma | - Similar core functionality: apply random gamma correction - Key differences: 1. Parameter specification: - Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples - Albumentations: single gamma_limit (80, 120) as percentage range 2. Default probability: - Kornia: p=1.0 - Albumentations: p=0.5 3. Additional in Albumentations: - eps parameter to prevent numerical errors |
RandomGaussianBlur | GaussianBlur | - Similar core functionality: apply Gaussian blur with random parameters - Key similarities: 1. Both have default probability p=0.5 2. Both support kernel size and sigma parameters - Key differences: 1. Parameter specification: - Kornia: requires explicit kernel_size and sigma range - Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0) 2. Additional in Kornia: - border_type parameter for padding mode - separable parameter for 1D convolution optimization |
RandomGaussianIllumination | Illumination | - Similar core functionality: apply illumination effects - Key similarities: 1. Both have default probability p=0.5 2. Both support controlling effect intensity and position - Key differences: 1. Scope: - Kornia: Gaussian illumination patterns only - Albumentations: Multiple modes (linear, corner, gaussian) 2. Parameter ranges: - Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0) - Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0) 3. Additional in Albumentations: - mode parameter for different effect types - effect_type for brighten/darken control - angle_range for linear gradients 4. Additional in Kornia: - sign parameter for effect direction |
RandomGaussianNoise | GaussNoise | - Similar core functionality: add Gaussian noise to images - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Parameter specification: - Kornia: fixed mean (default: 0.0) and std (default: 1.0) - Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0) 2. Additional in Albumentations: - per_channel parameter for independent channel noise - noise_scale_factor for performance optimization - Automatic value scaling based on image dtype |
RandomGrayscale | ToGray | - Similar core functionality: convert images to grayscale - Key differences: 1. Default probability: - Kornia: p=0.1 - Albumentations: p=0.5 2. Conversion options: - Kornia: customizable rgb_weights for channel mixing - Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca) 3. Output control: - Kornia: always 3-channel output - Albumentations: configurable num_output_channels |
RandomHue | ColorJitter (hue parameter) | - Similar core functionality: adjust image hue - Key differences: 1. Scope: - Kornia: hue-only transform - Albumentations: part of ColorJitter with brightness, contrast, and saturation 2. Default values: - Kornia: hue=(0.0, 0.0), p=1.0 - Albumentations: hue=(-0.5, 0.5), p=0.5 |
RandomInvert | InvertImg | - Similar core functionality: invert image values - Key differences: 1. Maximum value handling: - Kornia: configurable via max_val parameter (default: 1.0) - Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32) |
RandomJPEG | ImageCompression | - Similar core functionality: apply image compression - Key differences: 1. Compression options: - Kornia: JPEG only - Albumentations: supports both JPEG and WebP 2. Quality specification: - Kornia: jpeg_quality (default: 50.0) - Albumentations: quality_range (default: (99, 100)) 3. Default probability: - Kornia: p=1.0 - Albumentations: p=0.5 |
RandomLinearCornerIllumination | Illumination (corner mode) | - Similar core functionality: apply corner illumination effects - Key differences: 1. Scope: - Kornia: corner illumination only - Albumentations: part of general Illumination transform with multiple modes 2. Parameter specification: - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0) - Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both) 3. Additional in Albumentations: - Multiple illumination modes (linear, corner, gaussian) - More control over effect parameters |
RandomLinearIllumination | Illumination (linear mode) | - Similar core functionality: apply linear illumination effects - Key differences: 1. Scope: - Kornia: linear illumination only - Albumentations: part of general Illumination transform with multiple modes 2. Parameter specification: - Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0) - Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360) 3. Additional in Albumentations: - Multiple illumination modes (linear, corner, gaussian) - Explicit angle control for gradient direction |
RandomMedianBlur | MedianBlur | - Similar core functionality: apply median blur filter - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Kernel size specification: - Kornia: fixed kernel_size tuple (default: (3, 3)) - Albumentations: range via blur_limit (default: (3, 7)) 2. Kernel constraints: - Albumentations: enforces odd kernel sizes |
RandomMotionBlur | MotionBlur | - Similar core functionality: apply directional motion blur - Key similarities: 1. Both have default probability p=0.5 2. Both support angle and direction control - Key differences: 1. Kernel size specification: - Kornia: kernel_size as int or tuple - Albumentations: blur_limit (default: (3, 7)) 2. Angle control: - Kornia: angle parameter with symmetric range (-angle, angle) - Albumentations: angle_range (default: (0, 360)) 3. Additional in Albumentations: - allow_shifted parameter for kernel position control |
RandomPlanckianJitter | PlanckianJitter | - Similar core functionality: apply physics-based color temperature variations - Key similarities: 1. Both have default probability p=0.5 2. Both support 'blackbody' and 'cied' modes - Key differences: 1. Temperature control: - Kornia: select_from parameter for discrete jitter selection - Albumentations: temperature_limit for continuous range 2. Additional in Albumentations: - sampling_method parameter ('uniform' or 'gaussian') - More detailed control over temperature ranges - Better documentation of physics-based effects |
RandomPlasmaBrightness | PlasmaBrightnessContrast | - Similar core functionality: apply fractal-based brightness adjustments - Key similarities: 1. Both have default probability p=0.5 2. Both use Diamond-Square algorithm for pattern generation - Key differences: 1. Parameter specification: - Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0) - Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0) 2. Additional in Albumentations: - Combined brightness and contrast adjustment - plasma_size parameter for pattern detail control - More detailed mathematical formulation and documentation |
RandomPlasmaContrast | PlasmaBrightnessContrast | - Similar core functionality: apply fractal-based contrast adjustments - Key similarities: 1. Both have default probability p=0.5 2. Both use Diamond-Square algorithm for pattern generation - Key differences: 1. Parameter specification: - Kornia: roughness (0.1, 0.7) only - Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256) 2. Scope: - Kornia: contrast-only adjustment - Albumentations: combined brightness and contrast adjustment 3. Additional in Albumentations: - More detailed mathematical formulation - Pattern size control via plasma_size |
RandomPlasmaShadow | PlasmaShadow | - Similar core functionality: apply fractal-based shadow effects - Key similarities: 1. Both have default probability p=0.5 2. Both use Diamond-Square algorithm for pattern generation - Key differences: 1. Parameter specification: - Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0) - Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0) 2. Additional in Albumentations: - Pattern size control via plasma_size - More intuitive intensity range (0 to 1) - More detailed mathematical formulation and documentation |
RandomPosterize | Posterize | - Similar core functionality: reduce color bits in image - Key similarities: 1. Both have default probability p=0.5 2. Both operate on color bit reduction - Key differences: 1. Bit specification: - Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple - Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats: * Single int for all channels * Tuple for random range * List for per-channel specification * List of tuples for per-channel ranges 2. Additional in Albumentations: - More flexible channel-wise control - More detailed documentation and mathematical background |
RandomRain | RandomRain | - Similar core functionality: add rain effects to images - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Rain parameter specification: - Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5) - Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1) 2. Additional in Albumentations: - drop_color customization - blur_value for atmospheric effect - brightness_coefficient for lighting adjustment - rain_type presets (drizzle, heavy, torrential) 3. Approach: - Kornia: Direct drop placement - Albumentations: More realistic simulation with slant, blur, and brightness effects |
RandomRGBShift | AdditiveNoise | - Similar core functionality: add noise/shifts to image channels - Key similarities: 1. Both have default probability p=0.5 2. Both can affect individual channels - Key differences: 1. Approach: - Kornia: Simple RGB channel shifts with individual limits - Albumentations: More sophisticated noise generation with multiple distributions 2. Parameter specification: - Kornia: r_shift_limit, g_shift_limit, b_shift_limit (all default: 0.5) - Albumentations: Flexible noise configuration with: * Multiple noise types (uniform, gaussian, laplace, beta) * Different spatial modes (constant, per_pixel, shared) * Customizable distribution parameters 3. Additional in Albumentations: - Performance optimization options - More detailed control over noise distribution - Spatial application modes |
RandomSaltAndPepperNoise | SaltAndPepper | - Similar core functionality: apply salt and pepper noise to images - Key similarities: 1. Both have default probability p=0.5 2. Both use same default parameters: - amount (0.01, 0.06) - salt_vs_pepper (0.4, 0.6) - Key differences: 1. Parameter flexibility: - Kornia: Supports single float or tuple for parameters - Albumentations: Requires tuples for ranges 2. Documentation: - Albumentations provides: * Detailed mathematical formulation * Clear examples for different noise levels * Implementation notes and edge cases * References to academic sources |
RandomSaturation | ColorJitter | - Different scope and functionality: - Key differences: 1. Scope: - Kornia: Saturation-only adjustment - Albumentations: Combined brightness, contrast, saturation, and hue adjustment 2. Default parameters: - Kornia: saturation (1.0, 1.0), p=1.0 - Albumentations: saturation (0.8, 1.2), p=0.5 3. Implementation: - Kornia: Aligns with PIL/TorchVision implementation - Albumentations: Uses OpenCV with noted differences in HSV conversion 4. Additional in Albumentations: - Brightness adjustment - Contrast adjustment - Hue adjustment - Random order of transformations |
RandomSharpness | Sharpen | - Similar core functionality: sharpen images - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Parameter specification: - Kornia: Single sharpness parameter (default: 0.5) - Albumentations: More detailed control with: * alpha (0.2, 0.5) for effect visibility * lightness (0.5, 1.0) for contrast * method choice ('kernel' or 'gaussian') * kernel_size and sigma for gaussian method 2. Implementation methods: - Kornia: Single approach - Albumentations: Two methods: * Kernel-based using Laplacian operator * Gaussian interpolation 3. Documentation: - Albumentations provides detailed mathematical formulation and references |
RandomSnow | RandomSnow | - Similar core functionality: add snow effects to images - Key differences: 1. Parameter specification: - Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0 - Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5 2. Implementation methods: - Kornia: Single approach - Albumentations: Two methods: * "bleach": Simple pixel value thresholding * "texture": Advanced snow texture simulation 3. Additional in Albumentations: - Detailed snow simulation with: * HSV color space manipulation * Gaussian noise for texture * Depth effect simulation * Sparkle effects 4. Documentation: - Albumentations provides detailed mathematical formulation and implementation notes |
RandomSolarize | Solarize | - Similar core functionality: invert pixel values above threshold - Key similarities: 1. Both have default probability p=0.5 - Key differences: 1. Parameter specification: - Kornia: Two parameters: * thresholds (default: 0.1) for threshold range * additions (default: 0.1) for value adjustment - Albumentations: Single parameter: * threshold_range (default: (0.5, 0.5)) 2. Threshold handling: - Kornia: Generates from (0.5 - x, 0.5 + x) for float input - Albumentations: Direct range specification, scaled by image type max value 3. Documentation: - Albumentations provides: * Detailed examples for both uint8 and float32 images * Clear mathematical formulation * Image type-specific behavior explanation |
CenterCrop | CenterCrop | - Similar core functionality: crop center of image - Key similarities: 1. Both have default probability p=1.0 - Key differences: 1. Size specification: - Kornia: Single size parameter (int or tuple) - Albumentations: Separate height and width parameters 2. Additional features: - Kornia: * align_corners for interpolation * resample mode selection * cropping_mode ('slice' or 'resample') - Albumentations: * pad_if_needed for handling small images * border_mode for padding method * fill and fill_mask for padding values * pad_position options 3. Target handling: - Kornia: Image tensors only - Albumentations: Supports images, masks, bboxes, and keypoints |
PadTo | PadIfNeeded | - Can achieve same core functionality - Key similarities: 1. Both have default probability p=1.0 2. Can pad to exact size: - Kornia: size=(height, width) - Albumentations: min_height=height, min_width=width - Key differences: 1. Parameter naming: - Kornia: Single size tuple - Albumentations: Separate dimension parameters 2. Additional features: - Kornia: * Simple pad_mode selection * Single pad_value - Albumentations: * Flexible position options * Separate fill and fill_mask * Optional divisibility padding * Multiple target support 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomAffine | Affine | - Similar core functionality: apply affine transformations - Key similarities: 1. Both have default probability p=0.5 2. Both support rotation, translation, scaling, and shear - Key differences: 1. Parameter specification: - Kornia: * degrees for rotation * translate as fraction * scale as tuple * shear in degrees - Albumentations: * More flexible parameter formats * Supports both percent and pixel translation * Dictionary format for independent axis control 2. Additional features in Albumentations: * fit_output for automatic size adjustment * keep_ratio for aspect ratio preservation * rotate_method options * balanced_scale for even scale distribution 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, keypoints, bboxes |
RandomCrop | RandomCrop | - Similar core functionality: randomly crop image patches - Key similarities: 1. Both have default probability p=1.0 2. Both support padding if needed - Key differences: 1. Size specification: - Kornia: Single size tuple (height, width) - Albumentations: Separate height and width parameters 2. Padding options: - Kornia: * Flexible padding sizes (int, tuple[2], tuple[4]) * Multiple padding modes (constant, reflect, replicate) * Single fill value - Albumentations: * Simpler padding interface * Separate fill values for image and mask * Flexible pad positioning 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomElasticTransform | ElasticTransform | - Similar core functionality: apply elastic deformations - Key similarities: 1. Both have default probability p=0.5 2. Both use Gaussian smoothing for displacement fields 3. Both support independent control of x/y deformations: - Kornia: via separate values in sigma/alpha tuples - Albumentations: via same_dxdy parameter - Key differences: 1. Parameter specification: - Kornia: * kernel_size tuple (63, 63) * sigma tuple (32.0, 32.0) * alpha tuple (1.0, 1.0) - Albumentations: * Single sigma (default: 50.0) * Single alpha (default: 1.0) 2. Additional features: - Kornia: * Control over padding mode - Albumentations: * approximate mode for faster processing * Choice of noise distribution * Separate mask interpolation 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomErasing | Erasing | - Similar core functionality: randomly erase rectangular regions - Key similarities: 1. Both have default probability p=0.5 2. Same default parameters: * scale (0.02, 0.33) * ratio (0.3, 3.3) - Key differences: 1. Fill value options: - Kornia: Simple numeric value (default: 0.0) - Albumentations: Rich fill options: * Numeric values * "random" per pixel * "random_uniform" per region * "inpaint_telea" method * "inpaint_ns" method 2. Additional features in Albumentations: * Separate mask_fill value * Support for masks, bboxes, keypoints * Inpainting options for more natural-looking results 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomFisheye | OpticalDistortion | - Similar core functionality: apply optical/fisheye distortion - Key similarities: 1. Both have default probability p=0.5 2. Both support fisheye distortion - Key differences: 1. Parameter specification: - Kornia: * Separate center_x, center_y for distortion center * gamma for distortion strength - Albumentations: * Single distort_limit parameter * mode selection ('camera' or 'fisheye') 2. Distortion models: - Kornia: Fisheye only - Albumentations: * Camera matrix model * Fisheye model 3. Additional features in Albumentations: * Separate interpolation methods for image and mask * Support for masks, bboxes, keypoints 4. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomHorizontalFlip | HorizontalFlip | - Similar core functionality: flip image horizontally - Key similarities: 1. Both have default probability p=0.5 2. Simple operation with same visual result - Key differences: 1. Batch handling: - Kornia: * Additional p_batch parameter * same_on_batch option - Albumentations: No batch-specific parameters 2. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomPerspective | Perspective | - Similar core functionality: apply perspective transformation - Key similarities: 1. Both have default probability p=0.5 2. Both transform image by moving corners 3. Both support different interpolation methods: - Kornia: via resample (BILINEAR, NEAREST) - Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.) - Key differences: 1. Distortion control: - Kornia: * distortion_scale (0 to 1, default: 0.5) * sampling_method ('basic' or 'area_preserving') - Albumentations: * scale tuple for corner movement range * fit_output option for image capture 2. Output handling: - Kornia: * align_corners parameter * keepdim for batch form - Albumentations: * keep_size for output dimensions * Border mode and fill options * Separate mask interpolation 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, keypoints, bboxes |
RandomResizedCrop | RandomResizedCrop | - Similar core functionality: crop random patches and resize - Key similarities: 1. Both have default probability p=1.0 2. Same default parameters: * scale (0.08, 1.0) * ratio (~0.75, ~1.33) 3. Both support different interpolation methods: - Kornia: via resample - Albumentations: via interpolation - Key differences: 1. Implementation options: - Kornia: * cropping_mode ('slice' or 'resample') * align_corners parameter * keepdim for batch form - Albumentations: * Separate mask interpolation * Fallback to center crop after 10 attempts 2. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomRotation90 | RandomRotate90 | - Similar core functionality: rotate image by 90 degrees - Key similarities: 1. Both have default probability p=0.5 2. Both rotate in 90-degree increments - Key differences: 1. Rotation control: - Kornia: * times parameter to specify range of rotations * resample and align_corners for interpolation - Albumentations: * Simpler implementation (0-3 rotations) 2. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomRotation | Rotate | - Similar core functionality: rotate image by random angle - Key similarities: 1. Both have default probability p=0.5 2. Both support different interpolation methods - Key differences: 1. Angle specification: - Kornia: degrees parameter (if single value, range is (-degrees, +degrees)) - Albumentations: limit parameter (default: (-90, 90)) 2. Additional features: - Kornia: * align_corners for interpolation - Albumentations: * Border mode options * Fill values for padding * rotate_method for bboxes * crop_border option * Separate mask interpolation 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
RandomShear | Affine (shear parameter) | - Similar core functionality: apply shear transformation - Key similarities: 1. Both have default probability p=0.5 2. Both support different interpolation methods 3. Both support independent x/y shear control - Key differences: 1. Parameter specification: - Kornia: * Dedicated shear transform * shear parameter supports float, tuple(2), or tuple(4) * Simple padding modes (zeros, border, reflection) - Albumentations: * Part of general Affine transform * shear supports number, tuple, or dict format * More border modes and fill options 2. Additional features in Albumentations: * Separate mask interpolation * fit_output option * Combined with other affine transforms 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, keypoints, bboxes |
RandomThinPlateSpline | ThinPlateSpline | - Similar core functionality: apply smooth, non-rigid deformations - Key similarities: 1. Both have default probability p=0.5 2. Both use thin plate spline algorithm 3. Both support interpolation options - Key differences: 1. Deformation control: - Kornia: * Single scale parameter (default: 0.2) * Fixed control point grid - Albumentations: * scale_range tuple for range of deformation * Configurable num_control_points 2. Implementation details: - Kornia: * align_corners parameter * Binary mode choice (bilinear/nearest) - Albumentations: * OpenCV interpolation flags * More granular control over grid 3. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, keypoints, bboxes |
RandomVerticalFlip | VerticalFlip | - Similar core functionality: flip image vertically - Key similarities: 1. Both have default probability p=0.5 2. Simple operation with same visual result - Key differences: 1. Implementation: - Kornia: * Additional p_batch parameter - Albumentations: * Simpler implementation 2. Target handling: - Kornia: Image tensors only - Albumentations: Images, masks, bboxes, keypoints |
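To make the Solarize row more concrete, the sketch below implements the generic solarize operation (invert every pixel at or above a threshold) in plain NumPy, with the threshold expressed as a fraction of the maximum value for the image type. It illustrates the idea only and is not the code of either library; the function name `solarize`, the `>=` comparison, and the assumed value ranges (0-255 for uint8, 0-1 for float) are assumptions made for this example.

```python
import numpy as np


def solarize(image: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Invert pixels at or above `threshold` (a fraction of the max pixel value)."""
    # Assumption: uint8 images span [0, 255]; float images span [0, 1].
    max_value = 255 if image.dtype == np.uint8 else 1.0
    cutoff = threshold * max_value
    inverted = max_value - image
    return np.where(image >= cutoff, inverted, image).astype(image.dtype)


uint8_image = np.array([[0, 100, 200, 255]], dtype=np.uint8)
float_image = np.array([[0.2, 0.8]], dtype=np.float32)

print(solarize(uint8_image))  # pixels 200 and 255 are inverted, 0 and 100 are kept
print(solarize(float_image))  # 0.8 is inverted, 0.2 is kept
```

Scaling the threshold by the maximum value lets the same fractional setting behave consistently for uint8 and float32 inputs, which is the behavior the table attributes to the Albumentations parameter.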
Key Differences¶
Compared to TorchVision¶
- Albumentations operates on numpy arrays instead of PyTorch tensors
- Albumentations typically provides more parameters for fine-tuning transformations
- Most Albumentations transforms support both image and mask augmentation
- Better support for bounding box and keypoint augmentation
Compared to Kornia¶
- Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays
- Albumentations provides more comprehensive support for object detection and segmentation tasks
- Albumentations typically offers better performance for CPU-based augmentations
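The array-versus-tensor difference also shows up in data layout. The sketch below converts an (H, W, C) uint8 NumPy image, the layout Albumentations works with, into a batched (1, C, H, W) float array in [0, 1], which is the layout tensor-based augmentation libraries commonly expect. The helper name is hypothetical, and the conversion is done in NumPy only so the example has no PyTorch dependency.

```python
import numpy as np


def hwc_uint8_to_bchw_float(image: np.ndarray) -> np.ndarray:
    """Convert an (H, W, C) uint8 image to a (1, C, H, W) float32 array in [0, 1]."""
    chw = np.transpose(image, (2, 0, 1))  # move channels first
    batched = chw[np.newaxis, ...]        # add a leading batch dimension
    return np.ascontiguousarray(batched, dtype=np.float32) / 255.0


image = np.random.randint(0, 256, size=(32, 48, 3), dtype=np.uint8)
batch = hwc_uint8_to_bchw_float(image)
print(batch.shape, batch.dtype)
```

The result can then be wrapped with `torch.from_numpy(batch)` and moved to a GPU before being handed to a tensor-based pipeline.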
Performance Comparison¶
According to benchmarking results, Albumentations generally offers higher CPU throughput than TorchVision and Kornia for most transforms. Here are some key highlights:
Common Transforms Performance (images/second, higher is better)
Transform | Albumentations | TorchVision | Kornia | Notes |
---|---|---|---|---|
HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia |
VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia |
RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia |
Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both |
ColorJitter | 628 | 46 | 55 | Albumentations is ~14x faster than TorchVision, ~11x faster than Kornia |
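The speedup figures in the Notes column are simply ratios of the throughput values. A small sketch that recomputes them from the numbers in the table above (the dictionary layout and variable names are choices made for this example):

```python
# Throughput in images/second, copied from the table above
throughput = {
    "HorizontalFlip": {"albumentations": 8618, "torchvision": 914, "kornia": 390},
    "VerticalFlip": {"albumentations": 22847, "torchvision": 3198, "kornia": 1212},
    "RandomResizedCrop": {"albumentations": 2828, "torchvision": 511, "kornia": 287},
    "Normalize": {"albumentations": 1196, "torchvision": 519, "kornia": 626},
    "ColorJitter": {"albumentations": 628, "torchvision": 46, "kornia": 55},
}

# Speedup = Albumentations throughput divided by the other library's throughput
speedups = {
    name: {
        lib: rates["albumentations"] / rates[lib]
        for lib in ("torchvision", "kornia")
    }
    for name, rates in throughput.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio['torchvision']:.1f}x vs TorchVision, "
          f"{ratio['kornia']:.1f}x vs Kornia")
```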
Key Performance Insights:¶
- Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
- Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
- Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU
When to Choose Each Library:¶
- Albumentations: Best choice for CPU-based preprocessing pipelines and when maximum CPU throughput is needed
- Kornia: Consider when augmenting on the GPU with data that already lives in PyTorch tensors
- TorchVision: Good choice when your pipeline is deeply integrated into the PyTorch ecosystem and GPU performance isn't critical
Note: Benchmarks performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.
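Because results depend on hardware, it can be worth measuring throughput on your own machine. The helper below is a minimal timing sketch, not the harness used for the numbers above; the function name, the warm-up count, the image size, and the plain NumPy flip used as a stand-in transform are all assumptions made for illustration.

```python
import time
from typing import Callable

import numpy as np


def images_per_second(
    transform: Callable[[np.ndarray], np.ndarray],
    images: list[np.ndarray],
    warmup: int = 10,
) -> float:
    """Return how many images per second `transform` processes on this machine."""
    for image in images[:warmup]:  # warm caches before timing
        transform(image)
    start = time.perf_counter()
    for image in images:
        transform(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


rng = np.random.default_rng(0)
images = [
    rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8) for _ in range(200)
]

# Stand-in transform: a NumPy horizontal flip, copied so the work is actually done
rate = images_per_second(lambda img: np.ascontiguousarray(img[:, ::-1]), images)
print(f"{rate:.0f} images/second")
```

To time an Albumentations pipeline instead, pass a wrapper such as `lambda img: pipeline(image=img)["image"]`, where `pipeline` is whatever `A.Compose` object you want to measure.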
Code Examples¶
TorchVision to Albumentations¶
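```python
import albumentations as A
import torchvision.transforms as T

# TorchVision
# Note: T.Normalize expects a float tensor as input (for example after T.ToTensor())
transforms = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(10),  # samples an angle from (-10, 10) degrees
    T.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])

# Albumentations equivalent
transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10),  # samples an angle from (-10, 10) degrees
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])
```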
Kornia to Albumentations¶
```python
import albumentations as A
import kornia.augmentation as K

# Kornia (operates on batched float tensors)
transforms = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=10),
    K.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
)

# Albumentations equivalent (operates on a single NumPy image)
transforms = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=10),
    A.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225])
])
```