Albumentations to Torchvision Transform Mapping
This page maps Albumentations transforms to torchvision equivalents where torchvision has a direct benchmark implementation. The benchmark source of truth is benchmark/transforms/torchvision_impl.py, with shared canonical parameters in benchmark/transforms/specs.py.
If the torchvision column is -, the benchmark has no direct torchvision v2 implementation for that Albumentations transform. That does not always mean the operation is impossible in PyTorch; it means torchvision does not expose it as the same tested augmentation primitive in the benchmark.
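As a rough illustration of how the tables below can be consumed in code, the mapping can be held in a small dictionary. This is only a sketch: the rows are copied from this page, None stands for the - column value, and the names TORCHVISION_EQUIVALENT and torchvision_equivalent are hypothetical helpers, not part of either library or of the benchmark.

```python
from typing import Optional

# A few rows copied from the tables on this page.
# None stands for "-" (no direct torchvision benchmark implementation).
TORCHVISION_EQUIVALENT: dict[str, Optional[str]] = {
    "Resize": "Resize",
    "HorizontalFlip": "RandomHorizontalFlip",
    "Rotate": "RandomRotation",
    "Affine": "RandomAffine",
    "ToGray": "Grayscale",
    "ImageCompression": "JPEG",
    "RandomRotate90": None,
    "CoarseDropout": None,
}


def torchvision_equivalent(name: str) -> Optional[str]:
    """Return the mapped torchvision class name, or None when the page lists "-".

    Raises KeyError for names this partial table does not cover, so a missing
    row is not confused with a documented gap.
    """
    return TORCHVISION_EQUIVALENT[name]


for name in ("Rotate", "RandomRotate90"):
    target = torchvision_equivalent(name)
    print(f"{name} -> {target if target is not None else '-'}")
```

Keeping None distinct from a missing key separates "the benchmark has no equivalent" from "this row was never copied into the dictionary".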
Migration Rules
- Albumentations receives NumPy arrays in H,W,C layout. Torchvision usually receives PIL images or C,H,W tensors.
- Albumentations uses transform names plus p; torchvision often encodes randomness in class names such as RandomHorizontalFlip.
- Albumentations can update masks, boxes, keypoints, rotated boxes, volumes, and videos through A.Compose; torchvision image transforms alone do not give you the same pipeline contract.
- Torchvision color transforms often follow PIL semantics. Albumentations often uses OpenCV or explicit dtype-aware behavior, so tiny pixel differences are expected.
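The layout rule is the one most likely to cause shape errors during migration. A minimal NumPy sketch of the conversion between the two layouts follows; the helper names are illustrative and not part of either library.

```python
import numpy as np


def hwc_to_chw(image: np.ndarray) -> np.ndarray:
    """Reorder an (H, W, C) array to (C, H, W) and make it contiguous."""
    return np.ascontiguousarray(image.transpose(2, 0, 1))


def chw_to_hwc(image: np.ndarray) -> np.ndarray:
    """Reorder a (C, H, W) array back to (H, W, C) and make it contiguous."""
    return np.ascontiguousarray(image.transpose(1, 2, 0))


hwc = np.zeros((480, 640, 3), dtype=np.uint8)  # layout Albumentations expects
chw = hwc_to_chw(hwc)                          # layout torchvision tensors use
print(hwc.shape, chw.shape)  # (480, 640, 3) (3, 480, 640)
```

The transpose alone returns a strided view; the explicit copy avoids surprises in code that assumes contiguous memory.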
Geometric Transforms
| Albumentations transform | Torchvision transform | Notes |
|---|---|---|
| Resize | Resize | Direct resize benchmark mapping. Albumentations also propagates masks and other targets. |
| RandomCrop | RandomCrop | Benchmark uses a 224x224 random crop. Albumentations can update boxes and keypoints. |
| RandomResizedCrop | RandomResizedCrop | Similar crop-and-resize policy; Albumentations adds mask interpolation and target propagation. |
| CenterCrop | CenterCrop | Benchmark uses a 224x224 center crop. |
| HorizontalFlip | RandomHorizontalFlip | Same image operation when p=1; Albumentations also updates supported targets. |
| VerticalFlip | RandomVerticalFlip | Same image operation when p=1; Albumentations also updates supported targets. |
| Pad | Pad | Direct padding benchmark mapping. |
| Rotate | RandomRotation | Albumentations exposes bbox rotation method, crop behavior, border mode, image fill, and mask fill. |
| Affine | RandomAffine | Torchvision translation is fractional; Albumentations supports percent and pixel translation plus richer axis-specific parameter formats. |
| Perspective | RandomPerspective | Torchvision uses distortion_scale; Albumentations uses a corner movement scale range and has keep_size / fit_output. |
| ElasticTransform | ElasticTransform | Similar concept; parameter meanings and defaults differ. Verify visually when migrating. |
| RandomRotate90 | - | No direct torchvision v2 benchmark implementation. Use Albumentations when 90-degree rotations are part of the augmentation policy. |
| SquareSymmetry | - | No direct torchvision benchmark implementation for the full 8-way square symmetry group. |
| Transpose | - | No direct torchvision benchmark implementation. |
| SafeRotate | - | No direct torchvision benchmark implementation for rotation that keeps the whole image content. |
| RandomScale | - | No direct torchvision benchmark implementation. |
| ShiftScaleRotate | - | Approximate with affine logic if needed, but the benchmark has no direct torchvision primitive for the combined transform. |
| GridDistortion | - | No direct torchvision benchmark implementation for grid-based non-rigid distortion. |
| ThinPlateSpline | - | No direct torchvision benchmark implementation for smooth control-point deformation. |
| OpticalDistortion | - | No direct torchvision benchmark implementation for lens/fisheye distortion. |
| RandomGridShuffle | - | No direct torchvision benchmark implementation. |
| Morphological | - | No direct torchvision benchmark implementation for morphology operations. |
| PadIfNeeded | - | No direct torchvision benchmark implementation for padding to minimum size constraints. |
| CropAndPad | - | No direct torchvision benchmark implementation. |
| RandomSizedCrop | - | No direct torchvision benchmark implementation. |
| LongestMaxSize | - | No direct torchvision benchmark implementation for this named aspect-ratio primitive. |
| SmallestMaxSize | - | No direct torchvision benchmark implementation for this named aspect-ratio primitive. |
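For rows marked -, staying on torchvision usually means writing the operation by hand. As one example, the 8-way square symmetry group mentioned in the SquareSymmetry row can be sketched in NumPy as four 90-degree rotations of the image and of its mirror. This is an image-only illustration under that assumption, not the Albumentations implementation, and it does not update masks, boxes, or keypoints; the function names are hypothetical.

```python
import numpy as np


def square_symmetries(image: np.ndarray) -> list[np.ndarray]:
    """Return the 8 dihedral-group (D4) variants of an (H, W, ...) array."""
    variants = []
    for base in (image, np.flip(image, axis=1)):  # original and mirrored
        for k in range(4):                        # 0, 90, 180, 270 degrees
            variants.append(np.rot90(base, k=k, axes=(0, 1)))
    return variants


def random_square_symmetry(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Pick one of the 8 variants uniformly at random."""
    return square_symmetries(image)[int(rng.integers(8))]


image = np.arange(9).reshape(3, 3)
variants = square_symmetries(image)
print(len({v.tobytes() for v in variants}))  # 8 distinct results for an asymmetric input
```

For non-square inputs the odd rotations swap height and width, so a policy like this is normally applied to square crops.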
Color and Pixel Transforms
| Albumentations transform | Torchvision transform | Notes |
|---|---|---|
| ColorJitter | ColorJitter | Both adjust brightness, contrast, saturation, and hue. Color-space details differ. RGB-specific. |
| ChannelShuffle | RandomChannelPermutation | Same idea. For arbitrary-channel data, verify the operation is semantically valid for your channels. |
| ToGray | Grayscale | Torchvision outputs 1 or 3 channels. Albumentations has configurable output channels and multiple grayscale methods. |
| GaussianBlur | GaussianBlur | Albumentations can sample kernel and sigma ranges. |
| InvertImg | RandomInvert | Albumentations determines maximum value from dtype. |
| Posterize | RandomPosterize | Albumentations supports per-channel and ranged bit settings. |
| Solarize | RandomSolarize | Albumentations uses normalized threshold ranges and dtype-aware scaling. |
| Sharpen | RandomAdjustSharpness | Similar goal, different parameterization and implementation methods. |
| AutoContrast | RandomAutocontrast | Close image-only equivalent. |
| Equalize | RandomEqualize | Albumentations adds algorithm choice and optional masks. |
| Normalize | Normalize | Albumentations accepts uint8 and handles scaling through max_pixel_value; torchvision usually works on tensors. |
| Erasing | RandomErasing | Albumentations adds mask fill and inpainting options. |
| ImageCompression | JPEG | Albumentations supports JPEG and WebP compression simulation. |
| RandomBrightnessContrast | ColorJitter | Benchmark maps standalone brightness and contrast specs to torchvision ColorJitter. Albumentations commonly keeps them in one transform. |
| RandomOrder + ColorJitter + ChannelShuffle | RandomPhotometricDistort | Recreate the SSD-style policy as a composition when needed. |
| ToRGB | - | No direct torchvision benchmark implementation. Use only when converting grayscale-like inputs to RGB is intentional. |
| GaussNoise | - | No direct torchvision benchmark implementation for sampled Gaussian noise. |
| RandomGamma | - | No direct torchvision benchmark implementation. |
| PlanckianJitter | - | No direct torchvision benchmark implementation for physics-based color temperature jitter. |
| MedianBlur | - | No direct torchvision benchmark implementation. |
| Blur | - | No direct torchvision benchmark implementation for box blur in this benchmark. |
| MotionBlur | - | No direct torchvision benchmark implementation. |
| CLAHE | - | No direct torchvision benchmark implementation. |
| HueSaturationValue | - | No direct torchvision benchmark implementation for this HSV-style transform. |
| ChannelDropout | - | No direct torchvision benchmark implementation. |
| Downscale | - | No direct torchvision benchmark implementation for downscale-upscale degradation. |
| PixelSpread | - | No direct torchvision benchmark implementation. |
| Emboss | - | No direct torchvision benchmark implementation. |
| ChromaticAberration | - | No direct torchvision benchmark implementation. |
| ISONoise | - | No direct torchvision benchmark implementation. |
| ShotNoise | - | No direct torchvision benchmark implementation. |
| MultiplicativeNoise | - | No direct torchvision benchmark implementation. |
| AdditiveNoise | - | No direct torchvision benchmark implementation. |
| RandomToneCurve | - | No direct torchvision benchmark implementation. |
| RingingOvershoot | - | No direct torchvision benchmark implementation. |
| Spatter | - | No direct torchvision benchmark implementation. |
| UnsharpMask | - | No direct torchvision benchmark implementation. |
| FancyPCA | - | No direct torchvision benchmark implementation. |
| Superpixels | - | No direct torchvision benchmark implementation. |
| ToSepia | - | No direct torchvision benchmark implementation. |
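The Normalize row above is worth a worked check, because the two libraries reach the same numbers by different routes. The sketch below uses plain NumPy and assumes the Albumentations formula (x - mean * max_pixel_value) / (std * max_pixel_value) on uint8 H,W,C input, and the torchvision-style route of scaling to [0, 1] first and then applying (x - mean) / std on C,H,W input. The function names are illustrative.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_uint8_hwc(image: np.ndarray, max_pixel_value: float = 255.0) -> np.ndarray:
    """Albumentations-style: uint8 (H, W, C) in, float32 out, scaling folded in."""
    x = image.astype(np.float32)
    return (x - MEAN * max_pixel_value) / (STD * max_pixel_value)


def normalize_float_chw(image: np.ndarray) -> np.ndarray:
    """Torchvision-style: float (C, H, W) already in [0, 1], per-channel normalize."""
    return (image - MEAN[:, None, None]) / STD[:, None, None]


rng = np.random.default_rng(0)
hwc = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

a = normalize_uint8_hwc(hwc)
b = normalize_float_chw(hwc.transpose(2, 0, 1).astype(np.float32) / 255.0)

# The two routes agree up to floating-point rounding.
print(np.allclose(a.transpose(2, 0, 1), b, atol=1e-4))
```

If the torchvision side skips the scaling step and normalizes raw 0..255 values, the outputs diverge by a factor of about 255, which is a common migration bug.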
Weather, Illumination, and Dropout
| Albumentations transform | Torchvision transform | Notes |
|---|---|---|
| RandomFog | - | No direct torchvision benchmark implementation. |
| RandomRain | - | No direct torchvision benchmark implementation. |
| RandomSnow | - | No direct torchvision benchmark implementation. |
| RandomShadow | - | No direct torchvision benchmark implementation. |
| RandomSunFlare | - | No direct torchvision benchmark implementation. |
| Illumination | - | No direct torchvision benchmark implementation for linear, corner, or Gaussian illumination modes. |
| PlasmaBrightnessContrast | - | No direct torchvision benchmark implementation. |
| PlasmaShadow | - | No direct torchvision benchmark implementation. |
| RandomGravel | - | No direct torchvision benchmark implementation. |
| AtmosphericFog | - | No direct torchvision benchmark implementation. |
| Vignetting | - | No direct torchvision benchmark implementation. |
| Dithering | - | No direct torchvision benchmark implementation. |
| FilmGrain | - | No direct torchvision benchmark implementation. |
| Halftone | - | No direct torchvision benchmark implementation. |
| LensFlare | - | No direct torchvision benchmark implementation. |
| GridMask | - | No direct torchvision benchmark implementation. |
| GridDropout | - | No direct torchvision benchmark implementation. |
| PixelDropout | - | No direct torchvision benchmark implementation. |
| CoarseDropout | - | No direct torchvision benchmark implementation. Torchvision has RandomErasing, but not the same sampled multi-hole dropout primitive. |
| ConstrainedCoarseDropout | - | No direct torchvision benchmark implementation for object-constrained dropout. |
| CopyAndPaste | - | No direct torchvision benchmark implementation for annotation-aware copy-paste. |
| WaterRefraction | - | No direct torchvision benchmark implementation. |
Additional Albumentations-Only Coverage
These transforms exist in Albumentations but have no direct torchvision benchmark implementation. This is the practical catalog gap: staying on torchvision means skipping these policies or implementing them yourself.
| Albumentations transform | Torchvision transform | Notes |
|---|---|---|
| AtLeastOneBBoxRandomCrop | - | Bbox-aware crop that preserves at least one object. |
| BBoxSafeRandomCrop | - | Bbox-safe crop policy for detection datasets. |
| RandomSizedBBoxSafeCrop | - | Sampled crop-and-resize with bbox safety rules. |
| CropNonEmptyMaskIfExists | - | Segmentation crop that prefers non-empty mask regions. |
| RandomCropNearBBox | - | Crop sampled near a selected bounding box. |
| RandomCropFromBorders | - | Random border crop policy. |
| LetterBox | - | Resize-and-pad policy common in detection pipelines. |
| MaskDropout | - | Drop objects using mask regions and keep targets aligned. |
| Mosaic | - | Multi-image detection augmentation with target handling. |
| AdvancedBlur | - | Sampled blur kernel beyond standard Gaussian blur. |
| Defocus | - | Camera defocus simulation. |
| GlassBlur | - | Glass-like local distortion blur. |
| ZoomBlur | - | Zoom-motion blur simulation. |
| RGBShift | - | RGB channel shift augmentation. |
| SaltAndPepper | - | Salt-and-pepper corruption. |
| PhotoMetricDistort | - | Photometric distortion policy commonly used in detection. |
| FDA | - | Fourier domain adaptation. |
| HistogramMatching | - | Match image histogram to reference images. |
| PixelDistributionAdaptation | - | Pixel distribution adaptation to reference images. |
| HEStain | - | H&E stain augmentation for histopathology. |
| FrequencyMasking | - | Frequency-domain masking for spectrogram-like inputs. |
| TimeMasking | - | Time-axis masking for spectrogram-like inputs. |
| TimeReverse | - | Reverse temporal axis. |
| CenterCrop3D | - | 3D center crop for volumetric data. |
| RandomCrop3D | - | 3D random crop for volumetric data. |
| CoarseDropout3D | - | 3D coarse dropout for volumes. |
| CubicSymmetry | - | Cubic symmetry transform for 3D data. |
| GridShuffle3D | - | Grid shuffle augmentation for 3D data. |
| Pad3D | - | 3D padding. |
| PadIfNeeded3D | - | 3D minimum-size padding. |
Channel Notes
Albumentations is a safer default for non-RGB image tensors when the transform is channel-agnostic: resize, crop, flip, affine, perspective, blur, noise, dropout, and normalization can preserve the channel dimension. Be careful with RGB/color-space transforms such as ColorJitter, HueSaturationValue, ToGray, ToRGB, RandomRain, and RandomSnow.
Do not map a 9-channel multispectral tensor through an RGB transform unless that conversion is part of your data design.
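One way to enforce this rule is a small pre-flight check on the pipeline definition. The sketch below is a hypothetical helper, not an Albumentations feature: it flags the RGB or color-space transforms named in this section whenever the data does not have exactly three channels.

```python
# Transforms this section flags as RGB or color-space specific.
RGB_ONLY = {"ColorJitter", "HueSaturationValue", "ToGray", "ToRGB", "RandomRain", "RandomSnow"}


def rgb_only_transforms(transform_names: list[str], num_channels: int) -> list[str]:
    """Return the transform names to review before running on non-RGB data.

    A 3-channel input is assumed to be RGB here; that is an assumption of this
    sketch, since three channels do not guarantee RGB semantics.
    """
    if num_channels == 3:
        return []
    return [name for name in transform_names if name in RGB_ONLY]


flagged = rgb_only_transforms(
    ["HorizontalFlip", "ColorJitter", "GaussNoise", "RandomSnow"],
    num_channels=9,
)
print(flagged)  # ['ColorJitter', 'RandomSnow']
```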
Migration Example
```python
import albumentations as A

pipeline = A.Compose([
    A.RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 4 / 3), p=1.0),
    A.HorizontalFlip(p=0.5),
    A.ColorJitter(brightness_range=(0.5, 1.5), contrast_range=(0, 2.5), saturation_range=(0, 2.5), hue_range=(-0.5, 0.5), p=0.5),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
```
For detection:
```python
pipeline = A.Compose(
    [
        A.RandomResizedCrop(size=(512, 512), scale=(0.8, 1.0), p=1.0),
        A.HorizontalFlip(p=0.5),
    ],
    bbox_params=A.BboxParams(coord_format="coco", label_fields=["labels"]),
)
```
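To make concrete what updating boxes involves, the arithmetic for one case can be written without either library. The sketch below flips a single COCO-format box (x_min, y_min, width, height, in pixels) horizontally. The function name is hypothetical and this is not the Albumentations code path, only the geometry a detection pipeline has to apply so that boxes stay aligned with the flipped image.

```python
def hflip_coco_bbox(
    bbox: tuple[float, float, float, float], image_width: float
) -> tuple[float, float, float, float]:
    """Flip one COCO box (x_min, y_min, width, height) horizontally."""
    x_min, y_min, width, height = bbox
    # The box's right edge becomes its left edge, measured from the other side.
    return (image_width - x_min - width, y_min, width, height)


flipped = hflip_coco_bbox((10, 20, 30, 40), image_width=100)
print(flipped)  # (60, 20, 30, 40)
```

Applying the flip twice returns the original box, which is a quick sanity check for any hand-written target update.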